[llvm] r321345 - Rewrite the cached map used for locating the most precise DIE among

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 27 14:22:39 PST 2018


OK, think I have an easily reproducible test case now:

void f1();
inline __attribute__((always_inline)) void f2() {
  f1();
  f1();
}
inline __attribute__((always_inline)) void f3() {
  f2();
  f1();
  f1();
}
void f4() { f3(); }

Compile that to IR (letting the always-inliner inline things), then swap
the middle two calls to f1() (so it goes: call from inlined f2, call from
inlined f3, call from inlined f2, call from inlined f3)

Then symbolizer the address of the 4th call to f1. With this change, it
isn't attributed to any inlining. Without this change it's correctly
attributed to f3.

I'll keep working on debugging this further - but there it is, again, in
case anything pops out at you.

- Dave

On Tue, Feb 27, 2018 at 2:09 PM David Blaikie <dblaikie at gmail.com> wrote:

> I'm still debugging this, but wondering if this data set might spark some
> insight for you based on your implementation:
>
> Here are the ranges of the inlined subroutines I'm dealing with:
>
> [3f0, f89)
>   [3f0, 64b)
>     [460, 5eb)
>       [460, 533)
>         [460, 533)
>           [4bf, 4c3)
>           [4e2, 4e7)
>
> The address being symbolized is 4e7. It ends up being assigned no inlining
> at all - as though it was not nested within the first 5 inlined subroutines.
>
> I've tested a variety of values, and it looks like only the [4e7, 533)
> range is not being symbolized correctly.
>
> Does anything stand out to you - perhaps this gives you enough of a hint
> to sniff out what the bug is?
>
> I'll continue working on it (: Might be able to reduce this a bit further
> to make more sense.
>
> On Mon, Feb 12, 2018 at 5:56 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> Currently reverted in r324981 due to some cases of missing inlining in
>> symbolized results. Will work on getting this sorted out and recommitted
>> with tests+fixes as soon as possible.
>>
>> On Thu, Dec 21, 2017 at 10:42 PM Chandler Carruth via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: chandlerc
>>> Date: Thu Dec 21 22:41:23 2017
>>> New Revision: 321345
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=321345&view=rev
>>> Log:
>>> Rewrite the cached map used for locating the most precise DIE among
>>> inlined subroutines for a given address.
>>>
>>> This is essentially the hot path of llvm-symbolizer when extracting
>>> inlined frames during symbolization. Previously, we would read every
>>> subprogram and every inlined subroutine, building a std::map across the
>>> entire PC space to the best DIE, and then do only a handful of queries
>>> as we symbolized a backtrace. A huge fraction of the time was spent
>>> building the map itself.
>>>
>>> This patch changes it two a two-level system. First, we just build a map
>>> from PC-interval to DWARF subprograms. These are required to be disjoint
>>> and so constructing this is pretty easy. Second, we build a map *just*
>>> for the inlined subroutines within the subprogram containing the query
>>> address. This allows us to look at far fewer DIEs and build a *much*
>>> smaller set of cached maps in the llvm-symbolizer case where only a few
>>> address get symbolized during the entire run.
>>>
>>> It also builds both interval maps in a very different way. It constructs
>>> a single flat vector of pairs that maps from offset -> index. The
>>> indices point into collections of DIE objects, but can also be
>>> "tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
>>> just simplifies the data structure a bit. For inlined subroutines,
>>> because we carefully split them as we build the map, we end up in many
>>> cases having no holes and not having to store both start and stop
>>> offsets.
>>>
>>> Finally, the PC ranges for the inlined subroutines are compressed into
>>> 32-bits by making them relative to the base PC of the outer subprogram.
>>> This means that if you have a single function body with over 2gb of
>>> executable code in it, we will stop mapping address past the first 2gb
>>> of that function into inlined subroutines and just give you the
>>> subprogram. This doesn't seem like a problem. ;]
>>>
>>> All of this combines to make llvm-symbolizer *well* over 2x faster for
>>> symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
>>> tests are running >2x faster. I'm still going to look at completely
>>> disabling symbolization there, but figured while I had a good benchmark
>>> we should make symbolization a bit better.
>>>
>>> Sadly, the logic to build the flat interval map for the inlined
>>> subroutines is fairly complex. I'm not super happy about this and
>>> welcome any simplifying suggestions.
>>>
>>> Huge thanks to Dave Blaikie who helped walk me through what the various
>>> things I needed to do in DWARF to make this work.
>>>
>>> Differential Revision: https://reviews.llvm.org/D40987
>>>
>>> Modified:
>>>     llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h
>>>     llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp
>>>
>>> Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h?rev=321345&r1=321344&r2=321345&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h (original)
>>> +++ llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h Thu Dec 21
>>> 22:41:23 2017
>>> @@ -220,10 +220,40 @@ class DWARFUnit {
>>>    /// The compile unit debug information entry items.
>>>    std::vector<DWARFDebugInfoEntry> DieArray;
>>>
>>> -  /// Map from range's start address to end address and corresponding
>>> DIE.
>>> -  /// IntervalMap does not support range removal, as a result, we use
>>> the
>>> -  /// std::map::upper_bound for address range lookup.
>>> -  std::map<uint64_t, std::pair<uint64_t, DWARFDie>> AddrDieMap;
>>> +  /// The vector of inlined subroutine DIEs that we can map directly to
>>> from
>>> +  /// their subprogram below.
>>> +  std::vector<DWARFDie> InlinedSubroutineDIEs;
>>> +
>>> +  /// A type representing a subprogram DIE and a map (built using a
>>> sorted
>>> +  /// vector) into that subprogram's inlined subroutine DIEs.
>>> +  struct SubprogramDIEAddrInfo {
>>> +    DWARFDie SubprogramDIE;
>>> +
>>> +    uint64_t SubprogramBasePC;
>>> +
>>> +    /// A vector sorted to allow mapping from a relative PC to the
>>> inlined
>>> +    /// subroutine DIE with the most specific address range covering
>>> that PC.
>>> +    ///
>>> +    /// The PCs are relative to the `SubprogramBasePC`.
>>> +    ///
>>> +    /// The vector is sorted in ascending order of the first int which
>>> +    /// represents the relative PC for an interval in the map. The
>>> second int
>>> +    /// represents the index into the `InlinedSubroutineDIEs` vector of
>>> the DIE
>>> +    /// that interval maps to. An index of '-1` indicates an empty
>>> mapping. The
>>> +    /// interval covered is from the `.first` relative PC to the next
>>> entry's
>>> +    /// `.first` relative PC.
>>> +    std::vector<std::pair<uint32_t, int32_t>>
>>> InlinedSubroutineDIEAddrMap;
>>> +  };
>>> +
>>> +  /// Vector of the subprogram DIEs and their subroutine address maps.
>>> +  std::vector<SubprogramDIEAddrInfo> SubprogramDIEAddrInfos;
>>> +
>>> +  /// A vector sorted to allow mapping from a PC to the subprogram DIE
>>> (and
>>> +  /// associated addr map) index. Subprograms with overlapping PC
>>> ranges aren't
>>> +  /// supported here. Nothing will crash, but the mapping may be
>>> inaccurate.
>>> +  /// This vector may also contain "empty" ranges marked by an address
>>> with
>>> +  /// a DIE index of '-1'.
>>> +  std::vector<std::pair<uint64_t, int64_t>> SubprogramDIEAddrMap;
>>>
>>>    using die_iterator_range =
>>>        iterator_range<std::vector<DWARFDebugInfoEntry>::iterator>;
>>> @@ -282,9 +312,6 @@ public:
>>>      AddrOffsetSectionBase = Base;
>>>    }
>>>
>>> -  /// Recursively update address to Die map.
>>> -  void updateAddressDieMap(DWARFDie Die);
>>> -
>>>    void setRangesSection(const DWARFSection *RS, uint32_t Base) {
>>>      RangeSection = RS;
>>>      RangeSectionBase = Base;
>>> @@ -480,6 +507,9 @@ private:
>>>    /// parseDWO - Parses .dwo file for current compile unit. Returns
>>> true if
>>>    /// it was actually constructed.
>>>    bool parseDWO();
>>> +
>>> +  void buildSubprogramDIEAddrMap();
>>> +  void buildInlinedSubroutineDIEAddrMap(SubprogramDIEAddrInfo &SPInfo);
>>>  };
>>>
>>>  } // end namespace llvm
>>>
>>> Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp?rev=321345&r1=321344&r2=321345&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp (original)
>>> +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp Thu Dec 21 22:41:23 2017
>>> @@ -8,6 +8,7 @@
>>>
>>>  //===----------------------------------------------------------------------===//
>>>
>>>  #include "llvm/DebugInfo/DWARF/DWARFUnit.h"
>>> +#include "llvm/ADT/STLExtras.h"
>>>  #include "llvm/ADT/SmallString.h"
>>>  #include "llvm/ADT/StringRef.h"
>>>  #include "llvm/DebugInfo/DWARF/DWARFAbbreviationDeclaration.h"
>>> @@ -359,45 +360,378 @@ void DWARFUnit::collectAddressRanges(DWA
>>>      clearDIEs(true);
>>>  }
>>>
>>> -void DWARFUnit::updateAddressDieMap(DWARFDie Die) {
>>> -  if (Die.isSubroutineDIE()) {
>>> +// Populates a map from PC addresses to subprogram DIEs.
>>> +//
>>> +// This routine tries to look at the smallest amount of the debug info
>>> it can
>>> +// to locate the DIEs. This is because many subprograms will never end
>>> up being
>>> +// read or needed at all. We want to be as lazy as possible.
>>> +void DWARFUnit::buildSubprogramDIEAddrMap() {
>>> +  assert(SubprogramDIEAddrMap.empty() && "Must only build this map
>>> once!");
>>> +  SmallVector<DWARFDie, 16> Worklist;
>>> +  Worklist.push_back(getUnitDIE());
>>> +  do {
>>> +    DWARFDie Die = Worklist.pop_back_val();
>>> +
>>> +    // Queue up child DIEs to recurse through.
>>> +    // FIXME: This causes us to read a lot more debug info than we
>>> really need.
>>> +    // We should look at pruning out DIEs which cannot transitively hold
>>> +    // separate subprograms.
>>> +    for (DWARFDie Child : Die.children())
>>> +      Worklist.push_back(Child);
>>> +
>>> +    // If handling a non-subprogram DIE, nothing else to do.
>>> +    if (!Die.isSubprogramDIE())
>>> +      continue;
>>> +
>>> +    // For subprogram DIEs, store them, and insert relevant markers
>>> into the
>>> +    // address map. We don't care about overlap at all here as DWARF
>>> doesn't
>>> +    // meaningfully support that, so we simply will insert a range with
>>> no DIE
>>> +    // starting from the high PC. In the event there are overlaps,
>>> sorting
>>> +    // these may truncate things in surprising ways but still will allow
>>> +    // lookups to proceed.
>>> +    int DIEIndex = SubprogramDIEAddrInfos.size();
>>> +    SubprogramDIEAddrInfos.push_back({Die, (uint64_t)-1, {}});
>>>      for (const auto &R : Die.getAddressRanges()) {
>>>        // Ignore 0-sized ranges.
>>>        if (R.LowPC == R.HighPC)
>>>          continue;
>>> -      auto B = AddrDieMap.upper_bound(R.LowPC);
>>> -      if (B != AddrDieMap.begin() && R.LowPC < (--B)->second.first) {
>>> -        // The range is a sub-range of existing ranges, we need to
>>> split the
>>> -        // existing range.
>>> -        if (R.HighPC < B->second.first)
>>> -          AddrDieMap[R.HighPC] = B->second;
>>> -        if (R.LowPC > B->first)
>>> -          AddrDieMap[B->first].first = R.LowPC;
>>> +
>>> +      SubprogramDIEAddrMap.push_back({R.LowPC, DIEIndex});
>>> +      SubprogramDIEAddrMap.push_back({R.HighPC, -1});
>>> +
>>> +      if (R.LowPC < SubprogramDIEAddrInfos.back().SubprogramBasePC)
>>> +        SubprogramDIEAddrInfos.back().SubprogramBasePC = R.LowPC;
>>> +    }
>>> +  } while (!Worklist.empty());
>>> +
>>> +  if (SubprogramDIEAddrMap.empty()) {
>>> +    // If we found no ranges, create a no-op map so that lookups remain
>>> simple
>>> +    // but never find anything.
>>> +    SubprogramDIEAddrMap.push_back({0, -1});
>>> +    return;
>>> +  }
>>> +
>>> +  // Next, sort the ranges and remove both exact duplicates and runs
>>> with the
>>> +  // same DIE index. We order the ranges so that non-empty ranges are
>>> +  // preferred. Because there may be ties, we also need to use stable
>>> sort.
>>> +  std::stable_sort(SubprogramDIEAddrMap.begin(),
>>> SubprogramDIEAddrMap.end(),
>>> +                   [](const std::pair<uint64_t, int64_t> &LHS,
>>> +                      const std::pair<uint64_t, int64_t> &RHS) {
>>> +                     if (LHS.first < RHS.first)
>>> +                       return true;
>>> +                     if (LHS.first > RHS.first)
>>> +                       return false;
>>> +
>>> +                     // For ranges that start at the same address, keep
>>> the one
>>> +                     // with a DIE.
>>> +                     if (LHS.second != -1 && RHS.second == -1)
>>> +                       return true;
>>> +
>>> +                     return false;
>>> +                   });
>>> +  SubprogramDIEAddrMap.erase(
>>> +      std::unique(SubprogramDIEAddrMap.begin(),
>>> SubprogramDIEAddrMap.end(),
>>> +                  [](const std::pair<uint64_t, int64_t> &LHS,
>>> +                     const std::pair<uint64_t, int64_t> &RHS) {
>>> +                    // If the start addresses are exactly the same, we
>>> can
>>> +                    // remove all but the first one as it is the only
>>> one that
>>> +                    // will be found and used.
>>> +                    //
>>> +                    // If the DIE indices are the same, we can "merge"
>>> the
>>> +                    // ranges by eliminating the second.
>>> +                    return LHS.first == RHS.first || LHS.second ==
>>> RHS.second;
>>> +                  }),
>>> +      SubprogramDIEAddrMap.end());
>>> +
>>> +  assert(SubprogramDIEAddrMap.back().second == -1 &&
>>> +         "The last interval must not have a DIE as each DIE's address
>>> range is "
>>> +         "bounded.");
>>> +}
>>> +
>>> +// Build the second level of mapping from PC to DIE, specifically one
>>> that maps
>>> +// a PC *within* a particular DWARF subprogram into a precise,
>>> maximally nested
>>> +// inlined subroutine DIE (if any exists). We build a separate map for
>>> each
>>> +// subprogram because many subprograms will never get queried for an
>>> address
>>> +// and this allows us to be significantly lazier in reading the DWARF
>>> itself.
>>> +void DWARFUnit::buildInlinedSubroutineDIEAddrMap(
>>> +    SubprogramDIEAddrInfo &SPInfo) {
>>> +  auto &AddrMap = SPInfo.InlinedSubroutineDIEAddrMap;
>>> +  uint64_t BasePC = SPInfo.SubprogramBasePC;
>>> +
>>> +  auto SubroutineAddrMapSorter = [](const std::pair<int, int> &LHS,
>>> +                                    const std::pair<int, int> &RHS) {
>>> +    if (LHS.first < RHS.first)
>>> +      return true;
>>> +    if (LHS.first > RHS.first)
>>> +      return false;
>>> +
>>> +    // For ranges that start at the same address, keep the
>>> +    // non-empty one.
>>> +    if (LHS.second != -1 && RHS.second == -1)
>>> +      return true;
>>> +
>>> +    return false;
>>> +  };
>>> +  auto SubroutineAddrMapUniquer = [](const std::pair<int, int> &LHS,
>>> +                                     const std::pair<int, int> &RHS) {
>>> +    // If the start addresses are exactly the same, we can
>>> +    // remove all but the first one as it is the only one that
>>> +    // will be found and used.
>>> +    //
>>> +    // If the DIE indices are the same, we can "merge" the
>>> +    // ranges by eliminating the second.
>>> +    return LHS.first == RHS.first || LHS.second == RHS.second;
>>> +  };
>>> +
>>> +  struct DieAndParentIntervalRange {
>>> +    DWARFDie Die;
>>> +    int ParentIntervalsBeginIdx, ParentIntervalsEndIdx;
>>> +  };
>>> +
>>> +  SmallVector<DieAndParentIntervalRange, 16> Worklist;
>>> +  auto EnqueueChildDIEs = [&](const DWARFDie &Die, int
>>> ParentIntervalsBeginIdx,
>>> +                              int ParentIntervalsEndIdx) {
>>> +    for (DWARFDie Child : Die.children())
>>> +      Worklist.push_back(
>>> +          {Child, ParentIntervalsBeginIdx, ParentIntervalsEndIdx});
>>> +  };
>>> +  EnqueueChildDIEs(SPInfo.SubprogramDIE, 0, 0);
>>> +  while (!Worklist.empty()) {
>>> +    DWARFDie Die = Worklist.back().Die;
>>> +    int ParentIntervalsBeginIdx =
>>> Worklist.back().ParentIntervalsBeginIdx;
>>> +    int ParentIntervalsEndIdx = Worklist.back().ParentIntervalsEndIdx;
>>> +    Worklist.pop_back();
>>> +
>>> +    // If we encounter a nested subprogram, simply ignore it. We map to
>>> +    // (disjoint) subprograms before arriving here and we don't want to
>>> examine
>>> +    // any inlined subroutines of an unrelated subpragram.
>>> +    if (Die.getTag() == DW_TAG_subprogram)
>>> +      continue;
>>> +
>>> +    // For non-subroutines, just recurse to keep searching for inlined
>>> +    // subroutines.
>>> +    if (Die.getTag() != DW_TAG_inlined_subroutine) {
>>> +      EnqueueChildDIEs(Die, ParentIntervalsBeginIdx,
>>> ParentIntervalsEndIdx);
>>> +      continue;
>>> +    }
>>> +
>>> +    // Capture the inlined subroutine DIE that we will reference from
>>> the map.
>>> +    int DIEIndex = InlinedSubroutineDIEs.size();
>>> +    InlinedSubroutineDIEs.push_back(Die);
>>> +
>>> +    int DieIntervalsBeginIdx = AddrMap.size();
>>> +    // First collect the PC ranges for this DIE into our subroutine
>>> interval
>>> +    // map.
>>> +    for (auto R : Die.getAddressRanges()) {
>>> +      // Clamp the PCs to be above the base.
>>> +      R.LowPC = std::max(R.LowPC, BasePC);
>>> +      R.HighPC = std::max(R.HighPC, BasePC);
>>> +      // Compute relative PCs from the subprogram base and drop down to
>>> an
>>> +      // unsigned 32-bit int to represent them within the data
>>> structure. This
>>> +      // lets us cover a 4gb single subprogram. Because subprograms may
>>> be
>>> +      // partitioned into distant parts of a binary (think hot/cold
>>> +      // partitioning) we want to preserve as much as we can here
>>> without
>>> +      // burning extra memory. Past that, we will simply truncate and
>>> lose the
>>> +      // ability to map those PCs to a DIE more precise than the
>>> subprogram.
>>> +      const uint32_t MaxRelativePC =
>>> std::numeric_limits<uint32_t>::max();
>>> +      uint32_t RelativeLowPC = (R.LowPC - BasePC) >
>>> (uint64_t)MaxRelativePC
>>> +                                   ? MaxRelativePC
>>> +                                   : (uint32_t)(R.LowPC - BasePC);
>>> +      uint32_t RelativeHighPC = (R.HighPC - BasePC) >
>>> (uint64_t)MaxRelativePC
>>> +                                    ? MaxRelativePC
>>> +                                    : (uint32_t)(R.HighPC - BasePC);
>>> +      // Ignore empty or bogus ranges.
>>> +      if (RelativeLowPC >= RelativeHighPC)
>>> +        continue;
>>> +      AddrMap.push_back({RelativeLowPC, DIEIndex});
>>> +      AddrMap.push_back({RelativeHighPC, -1});
>>> +    }
>>> +
>>> +    // If there are no address ranges, there is nothing to do to map
>>> into them
>>> +    // and there cannot be any child subroutine DIEs with address
>>> ranges of
>>> +    // interest as those would all be required to nest within this DIE's
>>> +    // non-existent ranges, so we can immediately continue to the next
>>> DIE in
>>> +    // the worklist.
>>> +    if (DieIntervalsBeginIdx == (int)AddrMap.size())
>>> +      continue;
>>> +
>>> +    // The PCs from this DIE should never overlap, so we can easily
>>> sort them
>>> +    // here.
>>> +    std::sort(AddrMap.begin() + DieIntervalsBeginIdx, AddrMap.end(),
>>> +              SubroutineAddrMapSorter);
>>> +    // Remove any dead ranges. These should only come from "empty"
>>> ranges that
>>> +    // were clobbered by some other range.
>>> +    AddrMap.erase(std::unique(AddrMap.begin() + DieIntervalsBeginIdx,
>>> +                              AddrMap.end(), SubroutineAddrMapUniquer),
>>> +                  AddrMap.end());
>>> +
>>> +    // Compute the end index of this DIE's addr map intervals.
>>> +    int DieIntervalsEndIdx = AddrMap.size();
>>> +
>>> +    assert(DieIntervalsBeginIdx != DieIntervalsEndIdx &&
>>> +           "Must not have an empty map for this layer!");
>>> +    assert(AddrMap.back().second == -1 && "Must end with an empty
>>> range!");
>>> +    assert(std::is_sorted(AddrMap.begin() + DieIntervalsBeginIdx,
>>> AddrMap.end(),
>>> +                          less_first()) &&
>>> +           "Failed to sort this DIE's interals!");
>>> +
>>> +    // If we have any parent intervals, walk the newly added ranges and
>>> find
>>> +    // the parent ranges they were inserted into. Both of these are
>>> sorted and
>>> +    // neither has any overlaps. We need to append new ranges to split
>>> up any
>>> +    // parent ranges these new ranges would overlap when we merge them.
>>> +    if (ParentIntervalsBeginIdx != ParentIntervalsEndIdx) {
>>> +      int ParentIntervalIdx = ParentIntervalsBeginIdx;
>>> +      for (int i = DieIntervalsBeginIdx, e = DieIntervalsEndIdx - 1; i
>>> < e;
>>> +           ++i) {
>>> +        const uint32_t IntervalStart = AddrMap[i].first;
>>> +        const uint32_t IntervalEnd = AddrMap[i + 1].first;
>>> +        const int IntervalDieIdx = AddrMap[i].second;
>>> +        if (IntervalDieIdx == -1) {
>>> +          // For empty intervals, nothing is required. This is a bit
>>> surprising
>>> +          // however. If the prior interval overlaps a parent interval
>>> and this
>>> +          // would be necessary to mark the end, we will synthesize a
>>> new end
>>> +          // that switches back to the parent DIE below. And this
>>> interval will
>>> +          // get dropped in favor of one with a DIE attached. However,
>>> we'll
>>> +          // still include this and so worst-case, it will still end
>>> the prior
>>> +          // interval.
>>> +          continue;
>>> +        }
>>> +
>>> +        // We are walking the new ranges in order, so search forward
>>> from the
>>> +        // last point for a parent range that might overlap.
>>> +        auto ParentIntervalsRange =
>>> +            make_range(AddrMap.begin() + ParentIntervalIdx,
>>> +                       AddrMap.begin() + ParentIntervalsEndIdx);
>>> +        assert(std::is_sorted(ParentIntervalsRange.begin(),
>>> +                              ParentIntervalsRange.end(), less_first())
>>> &&
>>> +               "Unsorted parent intervals can't be searched!");
>>> +        auto PI = std::upper_bound(
>>> +            ParentIntervalsRange.begin(), ParentIntervalsRange.end(),
>>> +            IntervalStart,
>>> +            [](uint32_t LHS, const std::pair<uint32_t, int32_t> &RHS) {
>>> +              return LHS < RHS.first;
>>> +            });
>>> +        if (PI == ParentIntervalsRange.begin() ||
>>> +            PI == ParentIntervalsRange.end())
>>> +          continue;
>>> +
>>> +        ParentIntervalIdx = PI - AddrMap.begin();
>>> +        int32_t &ParentIntervalDieIdx = std::prev(PI)->second;
>>> +        uint32_t &ParentIntervalStart = std::prev(PI)->first;
>>> +        const uint32_t ParentIntervalEnd = PI->first;
>>> +
>>> +        // If the new range starts exactly at the position of the
>>> parent range,
>>> +        // we need to adjust the parent range. Note that these
>>> collisions can
>>> +        // only happen with the original parent range because we will
>>> merge any
>>> +        // adjacent ranges in the child.
>>> +        if (IntervalStart == ParentIntervalStart) {
>>> +          // If there will be a tail, just shift the start of the parent
>>> +          // forward. Note that this cannot change the parent ordering.
>>> +          if (IntervalEnd < ParentIntervalEnd) {
>>> +            ParentIntervalStart = IntervalEnd;
>>> +            continue;
>>> +          }
>>> +          // Otherwise, mark this as becoming empty so we'll remove it
>>> and
>>> +          // prefer the child range.
>>> +          ParentIntervalDieIdx = -1;
>>> +          continue;
>>> +        }
>>> +
>>> +        // Finally, if the parent interval will need to remain as a
>>> prefix to
>>> +        // this one, insert a new interval to cover any tail.
>>> +        if (IntervalEnd < ParentIntervalEnd)
>>> +          AddrMap.push_back({IntervalEnd, ParentIntervalDieIdx});
>>>        }
>>> -      AddrDieMap[R.LowPC] = std::make_pair(R.HighPC, Die);
>>>      }
>>> +
>>> +    // Note that we don't need to re-sort even this DIE's address map
>>> intervals
>>> +    // after this. All of the newly added intervals actually fill in
>>> *gaps* in
>>> +    // this DIE's address map, and we know that children won't need to
>>> lookup
>>> +    // into those gaps.
>>> +
>>> +    // Recurse through its children, giving them the interval map range
>>> of this
>>> +    // DIE to use as their parent intervals.
>>> +    EnqueueChildDIEs(Die, DieIntervalsBeginIdx, DieIntervalsEndIdx);
>>>    }
>>> -  // Parent DIEs are added to the AddrDieMap prior to the Children DIEs
>>> to
>>> -  // simplify the logic to update AddrDieMap. The child's range will
>>> always
>>> -  // be equal or smaller than the parent's range. With this assumption,
>>> when
>>> -  // adding one range into the map, it will at most split a range into 3
>>> -  // sub-ranges.
>>> -  for (DWARFDie Child = Die.getFirstChild(); Child; Child =
>>> Child.getSibling())
>>> -    updateAddressDieMap(Child);
>>> +
>>> +  if (AddrMap.empty()) {
>>> +    AddrMap.push_back({0, -1});
>>> +    return;
>>> +  }
>>> +
>>> +  // Now that we've added all of the intervals needed, we need to
>>> resort and
>>> +  // unique them. Most notably, this will remove all the empty ranges
>>> that had
>>> +  // a parent range covering, etc. We only expect a single non-empty
>>> interval
>>> +  // at any given start point, so we just use std::sort. This could
>>> potentially
>>> +  // produce non-deterministic maps for invalid DWARF.
>>> +  std::sort(AddrMap.begin(), AddrMap.end(), SubroutineAddrMapSorter);
>>> +  AddrMap.erase(
>>> +      std::unique(AddrMap.begin(), AddrMap.end(),
>>> SubroutineAddrMapUniquer),
>>> +      AddrMap.end());
>>>  }
>>>
>>>  DWARFDie DWARFUnit::getSubroutineForAddress(uint64_t Address) {
>>>    extractDIEsIfNeeded(false);
>>> -  if (AddrDieMap.empty())
>>> -    updateAddressDieMap(getUnitDIE());
>>> -  auto R = AddrDieMap.upper_bound(Address);
>>> -  if (R == AddrDieMap.begin())
>>> +
>>> +  // We use a two-level mapping structure to locate subroutines for a
>>> given PC
>>> +  // address.
>>> +  //
>>> +  // First, we map the address to a subprogram. This can be done more
>>> cheaply
>>> +  // because subprograms cannot nest within each other. It also allows
>>> us to
>>> +  // avoid detailed examination of many subprograms, instead only
>>> focusing on
>>> +  // the ones which we end up actively querying.
>>> +  if (SubprogramDIEAddrMap.empty())
>>> +    buildSubprogramDIEAddrMap();
>>> +
>>> +  assert(!SubprogramDIEAddrMap.empty() &&
>>> +         "We must always end up with a non-empty map!");
>>> +
>>> +  auto I = std::upper_bound(
>>> +      SubprogramDIEAddrMap.begin(), SubprogramDIEAddrMap.end(), Address,
>>> +      [](uint64_t LHS, const std::pair<uint64_t, int64_t> &RHS) {
>>> +        return LHS < RHS.first;
>>> +      });
>>> +  // If we find the beginning, then the address is before the first
>>> subprogram.
>>> +  if (I == SubprogramDIEAddrMap.begin())
>>>      return DWARFDie();
>>> -  // upper_bound's previous item contains Address.
>>> -  --R;
>>> -  if (Address >= R->second.first)
>>> +  // Back up to the interval containing the address and see if it
>>> +  // has a DIE associated with it.
>>> +  --I;
>>> +  if (I->second == -1)
>>>      return DWARFDie();
>>> -  return R->second.second;
>>> +
>>> +  auto &SPInfo = SubprogramDIEAddrInfos[I->second];
>>> +
>>> +  // Now that we have the subprogram for this address, we do the second
>>> level
>>> +  // mapping by building a map within a subprogram's PC range to any
>>> specific
>>> +  // inlined subroutine.
>>> +  if (SPInfo.InlinedSubroutineDIEAddrMap.empty())
>>> +    buildInlinedSubroutineDIEAddrMap(SPInfo);
>>> +
>>> +  // We lookup within the inlined subroutine using a subprogram-relative
>>> +  // address.
>>> +  assert(Address >= SPInfo.SubprogramBasePC &&
>>> +         "Address isn't above the start of the subprogram!");
>>> +  uint32_t RelativeAddr = ((Address - SPInfo.SubprogramBasePC) >
>>> +
>>>  (uint64_t)std::numeric_limits<uint32_t>::max())
>>> +                              ? std::numeric_limits<uint32_t>::max()
>>> +                              : (uint32_t)(Address -
>>> SPInfo.SubprogramBasePC);
>>> +
>>> +  auto J =
>>> +      std::upper_bound(SPInfo.InlinedSubroutineDIEAddrMap.begin(),
>>> +                       SPInfo.InlinedSubroutineDIEAddrMap.end(),
>>> RelativeAddr,
>>> +                       [](uint32_t LHS, const std::pair<uint32_t,
>>> int32_t> &RHS) {
>>> +                         return LHS < RHS.first;
>>> +                       });
>>> +  // If we find the beginning, the address is before any inlined
>>> subroutine so
>>> +  // return the subprogram DIE.
>>> +  if (J == SPInfo.InlinedSubroutineDIEAddrMap.begin())
>>> +    return SPInfo.SubprogramDIE;
>>> +  // Back up `J` and return the inlined subroutine if we have one or the
>>> +  // subprogram if we don't.
>>> +  --J;
>>> +  return J->second == -1 ? SPInfo.SubprogramDIE
>>> +                         : InlinedSubroutineDIEs[J->second];
>>>  }
>>>
>>>  void
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180227/9ab2241a/attachment.html>


More information about the llvm-commits mailing list