[PATCH] D40987: Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address.

Fri Dec 8 03:16:46 PST 2017

chandlerc added a comment.

Thanks for all the feedback!

================
Comment at: llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h:457
+
+  void buildSubprogramDIEAddrMap();
+  void buildInlinedSubroutineDIEAddrMap(SubprogramDIEAddrInfo &SPInfo);
----------------
aprantl wrote:
> I think it would be nice to copy the high level overview of the two-level approach outlined in the phabricator review into a comment somewhere here.
Yeah, added comments to both of these routines in the source file (since they're implementation functions) and also added the two-level descriptive comments to the implementation of getSubroutineForAddress as that seems to be the most cohesive place to show the pattern of access.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:358-359
+    if (!Die.isSubprogramDIE()) {
+      for (DWARFDie Child = Die.getFirstChild(); Child;
+           Child = Child.getSibling())
+        Worklist.push_back(Child);
----------------
dblaikie wrote:
>   for(DWARFDie Child : Die.children())
This is what I get for cargo culting. =]

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:371-382
+    SubprogramDIEAddrInfos.push_back({Die, (uint64_t)-1, {}});
     for (const auto &R : Die.getAddressRanges()) {
       // Ignore 0-sized ranges.
       if (R.LowPC == R.HighPC)
         continue;
-      auto B = AddrDieMap.upper_bound(R.LowPC);
-      if (B != AddrDieMap.begin() && R.LowPC < (--B)->second.first) {
-        // The range is a sub-range of existing ranges, we need to split the
-        // existing range.
-        if (R.HighPC < B->second.first)
-          AddrDieMap[R.HighPC] = B->second;
-        if (R.LowPC > B->first)
-          AddrDieMap[B->first].first = R.LowPC;
+
+      SubprogramDIEAddrMap.push_back({R.LowPC, DIEIndex});
----------------
dblaikie wrote:
> This does slightly change the claim in the patch description about only 2GB functions being a problem. Since this at least in theory supports fragmented functions, even a small function with, say split into a hot and cold section, could end up with the hot part 2GB of address away from the cold part perhaps more readily/likely than a 2GB function existing.
> 
> But still probably nothing to worry about.
Yeah, I've switched to at least use uint32_t so we have 4gb. I'll document it as well.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:384-388
+    // Also recurse into any children DIEs that are also subprogram DIEs.
+    for (DWARFDie Child = Die.getFirstChild(); Child;
+         Child = Child.getSibling())
+      if (Child.isSubprogramDIE())
+        Worklist.push_back(Child);
----------------
dblaikie wrote:
> subprograms that are children of other subprograms might be arbitrarily nested - not only direct children (looks like this code only handles direct children?)
> 
> (though there's no /good/ reason they would be nested inside an inlined subroutine)
> 
> eg: you could have code like:
> 
>   void f1() {
>     if (int x = ...) {
>       void f2() { ... }
>       ... 
>     }
>   }
> 
> in which case, you could have DWARF like:
> 
>   DW_TAG_subprogram
>     DW_AT_name "f1"
>     DW_TAG_lexical_block
>       DW_TAG_subprogram
>         DW_AT_name "f2"
> 
> which, yes, does mean you have to walk all the DIEs to find all the subprograms... not sure how much that defeats the goals here (I guess not having to build up so much in the way of data structures while doing so is still advantageous).
Ow, OK.

This does make things slower, but we still get to avoid decoding every inlined subroutine. We just walk the graph w/o reading the address ranges. So my patch still helps.

This change costs us about 10% total. =[

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:457
+    // If the start addresses are exactly the same, we can
+    // remove all but the first one as it is the only one that
+    // will be found and used.
----------------
dblaikie wrote:
> What's the "first" in this case - is it deterministically the most nested? Least nested? Unspecified?
> 
> We'd need to make sure it's the most nested.
We aggressively split as we insert so that we should end up with essentially a single (most nested) offset which has a valid DIE index.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:507-509
+      // Clamp the PCs to be above the base.
+      R.LowPC = std::max(R.LowPC, BasePC);
+      R.HighPC = std::max(R.HighPC, BasePC);
----------------
dblaikie wrote:
> This is just for bad input, I take it? (where the address range of an inlined subroutine is outside that of the enclosing subprogram)
Yes, I just wanted to bound how bad that got below when I re-base the offsets.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:535-536
+
+    // The PCs from this DIE should never overlap, so we can easily sort them
+    // here.
+    std::sort(AddrMap.begin() + DieIntervalsBeginIdx, AddrMap.end(),
----------------
dblaikie wrote:
> What's the behavior if they do overlap? While the DWARF spec & a reasonable person would say they can't, physically the data could contain such ranges. Will this crash? Have unreliable behavior? What's reasonable on invalid input here?
> 
> (it's probably fine, just checking it's articulated)
It should produces some unpredictable result, but never crash. Regardless of whether this holds, we can still sort and unique them. It's just that the resulting thing may be.... surprising in the results it gives. But it still ends up sorted, so all our upper_bound queries should still work, etc etc. I don't think we assert on anything other than "it isn't empty" and "it is sorted" which should hold regardless.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:561
+      auto ParentIntervalIdx = ParentIntervalsBeginIdx;
+      for (int i = DieIntervalsBeginIdx, e = DieIntervalsEndIdx - 1; i < e;
+           ++i) {
----------------
dblaikie wrote:
> aprantl wrote:
> > LLVM style wants this to be `I` and `E`.
> Nit: Usually I see "i != e" in C++ code. Is < preferred?
If these were iterators, I would use `I` and `E`. We are increasingly commonly using `i` for an `int` for loop variable.

I used `e` but am not super happy about it. More typically we have something like `Size` which makes `i < Size` much nicer (IMO). Here, the end point isn't the size and I didn't come up with a good name for it. But the `!=` I think largely comes from iterators and unsigned integers.... `<` seems much more clear for "normal" signed integers.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:690
+  return (J == SPInfo.InlinedSubroutineDIEAddrMap.begin() ||
+          (--J)->second == -1)
+             ? SPInfo.SubprogramDIE
----------------
aprantl wrote:
> That `--J` looks scary since we are using J later in the same expression.
> Perhaps factor this into a separate statement?
Yeah, never was super happy w/ this.  I'm just exploding it and dealing w/ repeated code.

https://reviews.llvm.org/D40987