[llvm] r321345 - Rewrite the cached map used for locating the most precise DIE among
David Blaikie via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 28 16:24:54 PST 2018
As discussed over lunch, it looks like somewhere around the section with
the comment "If the new range starts exactly at the position of the parent
range" - that logic may need to iterate across previous siblings to see if,
once the parent range is adjusted, the condition (IntervalStart ==
ParentIntervalStart) becomes true for some previously visited sibling.
I tried this:
@@ -629,6 +629,14 @@ void DWARFUnit::buildInlinedSubroutineDIEAddrMap(
// forward. Note that this cannot change the parent ordering.
if (IntervalEnd < ParentIntervalEnd) {
ParentIntervalStart = IntervalEnd;
+ for (int j = ParentIntervalIdx + 2; j != i; ++j) {
+ const uint32_t SiblingStart = AddrMap[j].first;
+ const uint32_t SiblingEnd = AddrMap[j + 1].first;
+ if (AddrMap[j].second == -1)
+ continue;
+ if (SiblingStart == ParentIntervalStart)
+ ParentIntervalStart = SiblingEnd;
+ }
continue;
}
// Otherwise, mark this as becoming empty so we'll remove it and
But that seems insufficient - it regresses (compared to without the above
loop) the following example:
void f1();
inline __attribute__((always_inline)) void f2() { f1(); }
inline __attribute__((always_inline)) void f3() {
f2();
f2();
f1();
}
void f5() {
f3();
}
because one of the f2 inlinings causes this case to fire:
// Finally, if the parent interval will need to remain as a prefix to
// this one, insert a new interval to cover any tail.
if (IntervalEnd < ParentIntervalEnd)
AddrMap.push_back({IntervalEnd, ParentIntervalDieIdx});
Which then means the important interval to examine/truncate when the second
f2 inlining is being handled, isn't part of the main parent offsets - it's
at the end of the list & I'm not sure it can be readily found/distinguished
from other things.
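For anyone following along without the code open, here is a minimal sketch of the flat map shape described in the quoted commit below: a sorted vector of (relative PC, DIE index) pairs, where each entry covers the interval up to the next entry's start and -1 marks a gap. The names `FlatMap` and `lookup` are made up for illustration and are not the ones in DWARFUnit.cpp. The lookup only works on a sorted vector, which is why a tail interval that was push_back'd onto the end can't be found by the parent search (which only looks at the parent's own index range) until the final sort runs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical alias: (relative PC where the interval starts, DIE index or -1).
using FlatMap = std::vector<std::pair<uint32_t, int32_t>>;

// Returns the DIE index for the interval containing PC, or -1 if PC falls in
// a gap or before the first entry. Assumes Map is sorted by .first.
int32_t lookup(const FlatMap &Map, uint32_t PC) {
  auto I = std::upper_bound(
      Map.begin(), Map.end(), PC,
      [](uint32_t LHS, const std::pair<uint32_t, int32_t> &RHS) {
        return LHS < RHS.first;
      });
  // Nothing starts at or before PC.
  if (I == Map.begin())
    return -1;
  // The previous entry is the interval that contains PC.
  return std::prev(I)->second;
}
```

With a map such as {{0x10, 0}, {0x20, 1}, {0x30, 0}, {0x40, -1}}, an address in [0x20, 0x30) maps to index 1 and an address in [0x30, 0x40) maps back to index 0, i.e. the parent's tail after a nested child.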
So that's about where I'm giving up for now - hopefully this is enough
state for you, me, or someone else to pick this up another time.
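In case it helps whoever picks this up, a brute-force oracle is an easy thing to A/B test the map against. The sketch below is not from the patch: `Range`, `innermost` and `threadRanges` are hypothetical names, and it assumes the ranges are listed in DIE order (parents before children), so that when two containing ranges have the same size the later one is the more deeply nested. The data is the range list from the Feb 27 message quoted further down.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open PC range [Low, High) of one inlined subroutine.
struct Range {
  uint64_t Low;
  uint64_t High;
};

// Returns the index of the smallest range containing PC, or -1 if none does.
// Ties go to the later entry, assuming parents are listed before children.
int innermost(const std::vector<Range> &Ranges, uint64_t PC) {
  int Best = -1;
  for (int I = 0, E = static_cast<int>(Ranges.size()); I != E; ++I) {
    const Range &R = Ranges[I];
    if (PC < R.Low || PC >= R.High)
      continue;
    if (Best == -1 ||
        R.High - R.Low <= Ranges[Best].High - Ranges[Best].Low)
      Best = I;
  }
  return Best;
}

// The inlined subroutine ranges from the earlier message in this thread.
std::vector<Range> threadRanges() {
  return {{0x3f0, 0xf89}, {0x3f0, 0x64b}, {0x460, 0x5eb}, {0x460, 0x533},
          {0x460, 0x533}, {0x4bf, 0x4c3}, {0x4e2, 0x4e7}};
}
```

Under those assumptions, innermost(threadRanges(), 0x4e7) picks the second [460, 533) entry rather than reporting no inlining, which is the behaviour the flat map should match for the whole [4e7, 533) span.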
On Wed, Feb 28, 2018 at 10:20 AM David Blaikie <dblaikie at gmail.com> wrote:
> This change seems to fix the above example:
>
> diff --git lib/DebugInfo/DWARF/DWARFUnit.cpp
> lib/DebugInfo/DWARF/DWARFUnit.cpp
> index df55d7debf9..6c53c2a2b04 100644
> --- lib/DebugInfo/DWARF/DWARFUnit.cpp
> +++ lib/DebugInfo/DWARF/DWARFUnit.cpp
> @@ -615,7 +615,7 @@ void DWARFUnit::buildInlinedSubroutineDIEAddrMap(
> PI == ParentIntervalsRange.end())
> continue;
>
> - ParentIntervalIdx = PI - AddrMap.begin();
> + ParentIntervalIdx = PI - AddrMap.begin() - 1;
> int32_t &ParentIntervalDieIdx = std::prev(PI)->second;
> uint32_t &ParentIntervalStart = std::prev(PI)->first;
> const uint32_t ParentIntervalEnd = PI->first;
>
> I don't understand the code well enough to be fully confident in this
> change, though it makes some sense to me (perhaps not
> enough to write it down well - but I can gesticulate wildly in person to
> try to hand wave through my rough understanding).
>
> But I also tried doing a bit more A/B testing against the symbolizer
> without this change & found another issue:
>
>
> void f1();
> inline __attribute__((always_inline)) void f2() { f1(); }
> inline __attribute__((always_inline)) void f3() { f1(); }
> inline __attribute__((always_inline)) void f4() {
> f2();
> f3();
> }
> void f5() {
> f4();
> }
>
> Then symbolize any address in the call to f3 (which, at least on my
> machine, is the range [0x9, 0xe)) - it misses the f3 portion of the stack,
> and only gives f4 and f5.
>
> I /think/ I can explain this one a bit better, but again, maybe easier in
> person.
>
> On Tue, Feb 27, 2018 at 2:22 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> OK, think I have an easily reproducible test case now:
>>
>> void f1();
>> inline __attribute__((always_inline)) void f2() {
>> f1();
>> f1();
>> }
>> inline __attribute__((always_inline)) void f3() {
>> f2();
>> f1();
>> f1();
>> }
>> void f4() { f3(); }
>>
>> Compile that to IR (letting the always-inliner inline things), then swap
>> the middle two calls to f1() (so it goes: call from inlined f2, call
>> from inlined f3, call from inlined f2, call from inlined f3)
>>
>> Then symbolize the address of the 4th call to f1. With this change, it
>> isn't attributed to any inlining. Without this change it's correctly
>> attributed to f3.
>>
>> I'll keep working on debugging this further - but there it is, again, in
>> case anything pops out at you.
>>
>> - Dave
>>
>> On Tue, Feb 27, 2018 at 2:09 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>> I'm still debugging this, but wondering if this data set might spark
>>> some insight for you based on your implementation:
>>>
>>> Here are the ranges of the inlined subroutines I'm dealing with:
>>>
>>> [3f0, f89)
>>> [3f0, 64b)
>>> [460, 5eb)
>>> [460, 533)
>>> [460, 533)
>>> [4bf, 4c3)
>>> [4e2, 4e7)
>>>
>>> The address being symbolized is 4e7. It ends up being assigned no
>>> inlining at all - as though it was not nested within the first 5 inlined
>>> subroutines.
>>>
>>> I've tested a variety of values, and it looks like only the [4e7, 533)
>>> range is not being symbolized correctly.
>>>
>>> Does anything stand out to you - perhaps this gives you enough of a hint
>>> to sniff out what the bug is?
>>>
>>> I'll continue working on it (: Might be able to reduce this a bit
>>> further to make more sense.
>>>
>>> On Mon, Feb 12, 2018 at 5:56 PM David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> Currently reverted in r324981 due to some cases of missing inlining in
>>>> symbolized results. Will work on getting this sorted out and recommitted
>>>> with tests+fixes as soon as possible.
>>>>
>>>> On Thu, Dec 21, 2017 at 10:42 PM Chandler Carruth via llvm-commits <
>>>> llvm-commits at lists.llvm.org> wrote:
>>>>
>>>>> Author: chandlerc
>>>>> Date: Thu Dec 21 22:41:23 2017
>>>>> New Revision: 321345
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=321345&view=rev
>>>>> Log:
>>>>> Rewrite the cached map used for locating the most precise DIE among
>>>>> inlined subroutines for a given address.
>>>>>
>>>>> This is essentially the hot path of llvm-symbolizer when extracting
>>>>> inlined frames during symbolization. Previously, we would read every
>>>>> subprogram and every inlined subroutine, building a std::map across the
>>>>> entire PC space to the best DIE, and then do only a handful of queries
>>>>> as we symbolized a backtrace. A huge fraction of the time was spent
>>>>> building the map itself.
>>>>>
>>>>> This patch changes it to a two-level system. First, we just build a
>>>>> map
>>>>> from PC-interval to DWARF subprograms. These are required to be
>>>>> disjoint
>>>>> and so constructing this is pretty easy. Second, we build a map *just*
>>>>> for the inlined subroutines within the subprogram containing the query
>>>>> address. This allows us to look at far fewer DIEs and build a *much*
>>>>> smaller set of cached maps in the llvm-symbolizer case where only a few
>>>>> addresses get symbolized during the entire run.
>>>>>
>>>>> It also builds both interval maps in a very different way. It
>>>>> constructs
>>>>> a single flat vector of pairs that maps from offset -> index. The
>>>>> indices point into collections of DIE objects, but can also be
>>>>> "tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
>>>>> just simplifies the data structure a bit. For inlined subroutines,
>>>>> because we carefully split them as we build the map, we end up in many
>>>>> cases having no holes and not having to store both start and stop
>>>>> offsets.
>>>>>
>>>>> Finally, the PC ranges for the inlined subroutines are compressed into
>>>>> 32-bits by making them relative to the base PC of the outer subprogram.
>>>>> This means that if you have a single function body with over 2gb of
>>>>> executable code in it, we will stop mapping addresses past the first 2gb
>>>>> of that function into inlined subroutines and just give you the
>>>>> subprogram. This doesn't seem like a problem. ;]
>>>>>
>>>>> All of this combines to make llvm-symbolizer *well* over 2x faster for
>>>>> symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
>>>>> tests are running >2x faster. I'm still going to look at completely
>>>>> disabling symbolization there, but figured while I had a good benchmark
>>>>> we should make symbolization a bit better.
>>>>>
>>>>> Sadly, the logic to build the flat interval map for the inlined
>>>>> subroutines is fairly complex. I'm not super happy about this and
>>>>> welcome any simplifying suggestions.
>>>>>
>>>>> Huge thanks to Dave Blaikie who helped walk me through what the various
>>>>> things I needed to do in DWARF to make this work.
>>>>>
>>>>> Differential Revision: https://reviews.llvm.org/D40987
>>>>>
>>>>> Modified:
>>>>> llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h
>>>>> llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp
>>>>>
>>>>> Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h?rev=321345&r1=321344&r2=321345&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h (original)
>>>>> +++ llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFUnit.h Thu Dec 21
>>>>> 22:41:23 2017
>>>>> @@ -220,10 +220,40 @@ class DWARFUnit {
>>>>> /// The compile unit debug information entry items.
>>>>> std::vector<DWARFDebugInfoEntry> DieArray;
>>>>>
>>>>> - /// Map from range's start address to end address and corresponding
>>>>> DIE.
>>>>> - /// IntervalMap does not support range removal, as a result, we use
>>>>> the
>>>>> - /// std::map::upper_bound for address range lookup.
>>>>> - std::map<uint64_t, std::pair<uint64_t, DWARFDie>> AddrDieMap;
>>>>> + /// The vector of inlined subroutine DIEs that we can map directly
>>>>> to from
>>>>> + /// their subprogram below.
>>>>> + std::vector<DWARFDie> InlinedSubroutineDIEs;
>>>>> +
>>>>> + /// A type representing a subprogram DIE and a map (built using a
>>>>> sorted
>>>>> + /// vector) into that subprogram's inlined subroutine DIEs.
>>>>> + struct SubprogramDIEAddrInfo {
>>>>> + DWARFDie SubprogramDIE;
>>>>> +
>>>>> + uint64_t SubprogramBasePC;
>>>>> +
>>>>> + /// A vector sorted to allow mapping from a relative PC to the
>>>>> inlined
>>>>> + /// subroutine DIE with the most specific address range covering
>>>>> that PC.
>>>>> + ///
>>>>> + /// The PCs are relative to the `SubprogramBasePC`.
>>>>> + ///
>>>>> + /// The vector is sorted in ascending order of the first int which
>>>>> + /// represents the relative PC for an interval in the map. The
>>>>> second int
>>>>> + /// represents the index into the `InlinedSubroutineDIEs` vector
>>>>> of the DIE
>>>>> + /// that interval maps to. An index of '-1` indicates an empty
>>>>> mapping. The
>>>>> + /// interval covered is from the `.first` relative PC to the next
>>>>> entry's
>>>>> + /// `.first` relative PC.
>>>>> + std::vector<std::pair<uint32_t, int32_t>>
>>>>> InlinedSubroutineDIEAddrMap;
>>>>> + };
>>>>> +
>>>>> + /// Vector of the subprogram DIEs and their subroutine address maps.
>>>>> + std::vector<SubprogramDIEAddrInfo> SubprogramDIEAddrInfos;
>>>>> +
>>>>> + /// A vector sorted to allow mapping from a PC to the subprogram
>>>>> DIE (and
>>>>> + /// associated addr map) index. Subprograms with overlapping PC
>>>>> ranges aren't
>>>>> + /// supported here. Nothing will crash, but the mapping may be
>>>>> inaccurate.
>>>>> + /// This vector may also contain "empty" ranges marked by an
>>>>> address with
>>>>> + /// a DIE index of '-1'.
>>>>> + std::vector<std::pair<uint64_t, int64_t>> SubprogramDIEAddrMap;
>>>>>
>>>>> using die_iterator_range =
>>>>> iterator_range<std::vector<DWARFDebugInfoEntry>::iterator>;
>>>>> @@ -282,9 +312,6 @@ public:
>>>>> AddrOffsetSectionBase = Base;
>>>>> }
>>>>>
>>>>> - /// Recursively update address to Die map.
>>>>> - void updateAddressDieMap(DWARFDie Die);
>>>>> -
>>>>> void setRangesSection(const DWARFSection *RS, uint32_t Base) {
>>>>> RangeSection = RS;
>>>>> RangeSectionBase = Base;
>>>>> @@ -480,6 +507,9 @@ private:
>>>>> /// parseDWO - Parses .dwo file for current compile unit. Returns
>>>>> true if
>>>>> /// it was actually constructed.
>>>>> bool parseDWO();
>>>>> +
>>>>> + void buildSubprogramDIEAddrMap();
>>>>> + void buildInlinedSubroutineDIEAddrMap(SubprogramDIEAddrInfo
>>>>> &SPInfo);
>>>>> };
>>>>>
>>>>> } // end namespace llvm
>>>>>
>>>>> Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp?rev=321345&r1=321344&r2=321345&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> --- llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp (original)
>>>>> +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFUnit.cpp Thu Dec 21 22:41:23
>>>>> 2017
>>>>> @@ -8,6 +8,7 @@
>>>>>
>>>>> //===----------------------------------------------------------------------===//
>>>>>
>>>>> #include "llvm/DebugInfo/DWARF/DWARFUnit.h"
>>>>> +#include "llvm/ADT/STLExtras.h"
>>>>> #include "llvm/ADT/SmallString.h"
>>>>> #include "llvm/ADT/StringRef.h"
>>>>> #include "llvm/DebugInfo/DWARF/DWARFAbbreviationDeclaration.h"
>>>>> @@ -359,45 +360,378 @@ void DWARFUnit::collectAddressRanges(DWA
>>>>> clearDIEs(true);
>>>>> }
>>>>>
>>>>> -void DWARFUnit::updateAddressDieMap(DWARFDie Die) {
>>>>> - if (Die.isSubroutineDIE()) {
>>>>> +// Populates a map from PC addresses to subprogram DIEs.
>>>>> +//
>>>>> +// This routine tries to look at the smallest amount of the debug
>>>>> info it can
>>>>> +// to locate the DIEs. This is because many subprograms will never
>>>>> end up being
>>>>> +// read or needed at all. We want to be as lazy as possible.
>>>>> +void DWARFUnit::buildSubprogramDIEAddrMap() {
>>>>> + assert(SubprogramDIEAddrMap.empty() && "Must only build this map
>>>>> once!");
>>>>> + SmallVector<DWARFDie, 16> Worklist;
>>>>> + Worklist.push_back(getUnitDIE());
>>>>> + do {
>>>>> + DWARFDie Die = Worklist.pop_back_val();
>>>>> +
>>>>> + // Queue up child DIEs to recurse through.
>>>>> + // FIXME: This causes us to read a lot more debug info than we
>>>>> really need.
>>>>> + // We should look at pruning out DIEs which cannot transitively
>>>>> hold
>>>>> + // separate subprograms.
>>>>> + for (DWARFDie Child : Die.children())
>>>>> + Worklist.push_back(Child);
>>>>> +
>>>>> + // If handling a non-subprogram DIE, nothing else to do.
>>>>> + if (!Die.isSubprogramDIE())
>>>>> + continue;
>>>>> +
>>>>> + // For subprogram DIEs, store them, and insert relevant markers
>>>>> into the
>>>>> + // address map. We don't care about overlap at all here as DWARF
>>>>> doesn't
>>>>> + // meaningfully support that, so we simply will insert a range
>>>>> with no DIE
>>>>> + // starting from the high PC. In the event there are overlaps,
>>>>> sorting
>>>>> + // these may truncate things in surprising ways but still will
>>>>> allow
>>>>> + // lookups to proceed.
>>>>> + int DIEIndex = SubprogramDIEAddrInfos.size();
>>>>> + SubprogramDIEAddrInfos.push_back({Die, (uint64_t)-1, {}});
>>>>> for (const auto &R : Die.getAddressRanges()) {
>>>>> // Ignore 0-sized ranges.
>>>>> if (R.LowPC == R.HighPC)
>>>>> continue;
>>>>> - auto B = AddrDieMap.upper_bound(R.LowPC);
>>>>> - if (B != AddrDieMap.begin() && R.LowPC < (--B)->second.first) {
>>>>> - // The range is a sub-range of existing ranges, we need to
>>>>> split the
>>>>> - // existing range.
>>>>> - if (R.HighPC < B->second.first)
>>>>> - AddrDieMap[R.HighPC] = B->second;
>>>>> - if (R.LowPC > B->first)
>>>>> - AddrDieMap[B->first].first = R.LowPC;
>>>>> +
>>>>> + SubprogramDIEAddrMap.push_back({R.LowPC, DIEIndex});
>>>>> + SubprogramDIEAddrMap.push_back({R.HighPC, -1});
>>>>> +
>>>>> + if (R.LowPC < SubprogramDIEAddrInfos.back().SubprogramBasePC)
>>>>> + SubprogramDIEAddrInfos.back().SubprogramBasePC = R.LowPC;
>>>>> + }
>>>>> + } while (!Worklist.empty());
>>>>> +
>>>>> + if (SubprogramDIEAddrMap.empty()) {
>>>>> + // If we found no ranges, create a no-op map so that lookups
>>>>> remain simple
>>>>> + // but never find anything.
>>>>> + SubprogramDIEAddrMap.push_back({0, -1});
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + // Next, sort the ranges and remove both exact duplicates and runs
>>>>> with the
>>>>> + // same DIE index. We order the ranges so that non-empty ranges are
>>>>> + // preferred. Because there may be ties, we also need to use stable
>>>>> sort.
>>>>> + std::stable_sort(SubprogramDIEAddrMap.begin(),
>>>>> SubprogramDIEAddrMap.end(),
>>>>> + [](const std::pair<uint64_t, int64_t> &LHS,
>>>>> + const std::pair<uint64_t, int64_t> &RHS) {
>>>>> + if (LHS.first < RHS.first)
>>>>> + return true;
>>>>> + if (LHS.first > RHS.first)
>>>>> + return false;
>>>>> +
>>>>> + // For ranges that start at the same address,
>>>>> keep the one
>>>>> + // with a DIE.
>>>>> + if (LHS.second != -1 && RHS.second == -1)
>>>>> + return true;
>>>>> +
>>>>> + return false;
>>>>> + });
>>>>> + SubprogramDIEAddrMap.erase(
>>>>> + std::unique(SubprogramDIEAddrMap.begin(),
>>>>> SubprogramDIEAddrMap.end(),
>>>>> + [](const std::pair<uint64_t, int64_t> &LHS,
>>>>> + const std::pair<uint64_t, int64_t> &RHS) {
>>>>> + // If the start addresses are exactly the same,
>>>>> we can
>>>>> + // remove all but the first one as it is the only
>>>>> one that
>>>>> + // will be found and used.
>>>>> + //
>>>>> + // If the DIE indices are the same, we can
>>>>> "merge" the
>>>>> + // ranges by eliminating the second.
>>>>> + return LHS.first == RHS.first || LHS.second ==
>>>>> RHS.second;
>>>>> + }),
>>>>> + SubprogramDIEAddrMap.end());
>>>>> +
>>>>> + assert(SubprogramDIEAddrMap.back().second == -1 &&
>>>>> + "The last interval must not have a DIE as each DIE's address
>>>>> range is "
>>>>> + "bounded.");
>>>>> +}
>>>>> +
>>>>> +// Build the second level of mapping from PC to DIE, specifically one
>>>>> that maps
>>>>> +// a PC *within* a particular DWARF subprogram into a precise,
>>>>> maximally nested
>>>>> +// inlined subroutine DIE (if any exists). We build a separate map
>>>>> for each
>>>>> +// subprogram because many subprograms will never get queried for an
>>>>> address
>>>>> +// and this allows us to be significantly lazier in reading the DWARF
>>>>> itself.
>>>>> +void DWARFUnit::buildInlinedSubroutineDIEAddrMap(
>>>>> + SubprogramDIEAddrInfo &SPInfo) {
>>>>> + auto &AddrMap = SPInfo.InlinedSubroutineDIEAddrMap;
>>>>> + uint64_t BasePC = SPInfo.SubprogramBasePC;
>>>>> +
>>>>> + auto SubroutineAddrMapSorter = [](const std::pair<int, int> &LHS,
>>>>> + const std::pair<int, int> &RHS) {
>>>>> + if (LHS.first < RHS.first)
>>>>> + return true;
>>>>> + if (LHS.first > RHS.first)
>>>>> + return false;
>>>>> +
>>>>> + // For ranges that start at the same address, keep the
>>>>> + // non-empty one.
>>>>> + if (LHS.second != -1 && RHS.second == -1)
>>>>> + return true;
>>>>> +
>>>>> + return false;
>>>>> + };
>>>>> + auto SubroutineAddrMapUniquer = [](const std::pair<int, int> &LHS,
>>>>> + const std::pair<int, int> &RHS) {
>>>>> + // If the start addresses are exactly the same, we can
>>>>> + // remove all but the first one as it is the only one that
>>>>> + // will be found and used.
>>>>> + //
>>>>> + // If the DIE indices are the same, we can "merge" the
>>>>> + // ranges by eliminating the second.
>>>>> + return LHS.first == RHS.first || LHS.second == RHS.second;
>>>>> + };
>>>>> +
>>>>> + struct DieAndParentIntervalRange {
>>>>> + DWARFDie Die;
>>>>> + int ParentIntervalsBeginIdx, ParentIntervalsEndIdx;
>>>>> + };
>>>>> +
>>>>> + SmallVector<DieAndParentIntervalRange, 16> Worklist;
>>>>> + auto EnqueueChildDIEs = [&](const DWARFDie &Die, int
>>>>> ParentIntervalsBeginIdx,
>>>>> + int ParentIntervalsEndIdx) {
>>>>> + for (DWARFDie Child : Die.children())
>>>>> + Worklist.push_back(
>>>>> + {Child, ParentIntervalsBeginIdx, ParentIntervalsEndIdx});
>>>>> + };
>>>>> + EnqueueChildDIEs(SPInfo.SubprogramDIE, 0, 0);
>>>>> + while (!Worklist.empty()) {
>>>>> + DWARFDie Die = Worklist.back().Die;
>>>>> + int ParentIntervalsBeginIdx =
>>>>> Worklist.back().ParentIntervalsBeginIdx;
>>>>> + int ParentIntervalsEndIdx = Worklist.back().ParentIntervalsEndIdx;
>>>>> + Worklist.pop_back();
>>>>> +
>>>>> + // If we encounter a nested subprogram, simply ignore it. We map
>>>>> to
>>>>> + // (disjoint) subprograms before arriving here and we don't want
>>>>> to examine
>>>>> + // any inlined subroutines of an unrelated subprogram.
>>>>> + if (Die.getTag() == DW_TAG_subprogram)
>>>>> + continue;
>>>>> +
>>>>> + // For non-subroutines, just recurse to keep searching for inlined
>>>>> + // subroutines.
>>>>> + if (Die.getTag() != DW_TAG_inlined_subroutine) {
>>>>> + EnqueueChildDIEs(Die, ParentIntervalsBeginIdx,
>>>>> ParentIntervalsEndIdx);
>>>>> + continue;
>>>>> + }
>>>>> +
>>>>> + // Capture the inlined subroutine DIE that we will reference from
>>>>> the map.
>>>>> + int DIEIndex = InlinedSubroutineDIEs.size();
>>>>> + InlinedSubroutineDIEs.push_back(Die);
>>>>> +
>>>>> + int DieIntervalsBeginIdx = AddrMap.size();
>>>>> + // First collect the PC ranges for this DIE into our subroutine
>>>>> interval
>>>>> + // map.
>>>>> + for (auto R : Die.getAddressRanges()) {
>>>>> + // Clamp the PCs to be above the base.
>>>>> + R.LowPC = std::max(R.LowPC, BasePC);
>>>>> + R.HighPC = std::max(R.HighPC, BasePC);
>>>>> + // Compute relative PCs from the subprogram base and drop down
>>>>> to an
>>>>> + // unsigned 32-bit int to represent them within the data
>>>>> structure. This
>>>>> + // lets us cover a 4gb single subprogram. Because subprograms
>>>>> may be
>>>>> + // partitioned into distant parts of a binary (think hot/cold
>>>>> + // partitioning) we want to preserve as much as we can here
>>>>> without
>>>>> + // burning extra memory. Past that, we will simply truncate and
>>>>> lose the
>>>>> + // ability to map those PCs to a DIE more precise than the
>>>>> subprogram.
>>>>> + const uint32_t MaxRelativePC =
>>>>> std::numeric_limits<uint32_t>::max();
>>>>> + uint32_t RelativeLowPC = (R.LowPC - BasePC) >
>>>>> (uint64_t)MaxRelativePC
>>>>> + ? MaxRelativePC
>>>>> + : (uint32_t)(R.LowPC - BasePC);
>>>>> + uint32_t RelativeHighPC = (R.HighPC - BasePC) >
>>>>> (uint64_t)MaxRelativePC
>>>>> + ? MaxRelativePC
>>>>> + : (uint32_t)(R.HighPC - BasePC);
>>>>> + // Ignore empty or bogus ranges.
>>>>> + if (RelativeLowPC >= RelativeHighPC)
>>>>> + continue;
>>>>> + AddrMap.push_back({RelativeLowPC, DIEIndex});
>>>>> + AddrMap.push_back({RelativeHighPC, -1});
>>>>> + }
>>>>> +
>>>>> + // If there are no address ranges, there is nothing to do to map
>>>>> into them
>>>>> + // and there cannot be any child subroutine DIEs with address
>>>>> ranges of
>>>>> + // interest as those would all be required to nest within this
>>>>> DIE's
>>>>> + // non-existent ranges, so we can immediately continue to the
>>>>> next DIE in
>>>>> + // the worklist.
>>>>> + if (DieIntervalsBeginIdx == (int)AddrMap.size())
>>>>> + continue;
>>>>> +
>>>>> + // The PCs from this DIE should never overlap, so we can easily
>>>>> sort them
>>>>> + // here.
>>>>> + std::sort(AddrMap.begin() + DieIntervalsBeginIdx, AddrMap.end(),
>>>>> + SubroutineAddrMapSorter);
>>>>> + // Remove any dead ranges. These should only come from "empty"
>>>>> ranges that
>>>>> + // were clobbered by some other range.
>>>>> + AddrMap.erase(std::unique(AddrMap.begin() + DieIntervalsBeginIdx,
>>>>> + AddrMap.end(),
>>>>> SubroutineAddrMapUniquer),
>>>>> + AddrMap.end());
>>>>> +
>>>>> + // Compute the end index of this DIE's addr map intervals.
>>>>> + int DieIntervalsEndIdx = AddrMap.size();
>>>>> +
>>>>> + assert(DieIntervalsBeginIdx != DieIntervalsEndIdx &&
>>>>> + "Must not have an empty map for this layer!");
>>>>> + assert(AddrMap.back().second == -1 && "Must end with an empty
>>>>> range!");
>>>>> + assert(std::is_sorted(AddrMap.begin() + DieIntervalsBeginIdx,
>>>>> AddrMap.end(),
>>>>> + less_first()) &&
>>>>> + "Failed to sort this DIE's interals!");
>>>>> +
>>>>> + // If we have any parent intervals, walk the newly added ranges
>>>>> and find
>>>>> + // the parent ranges they were inserted into. Both of these are
>>>>> sorted and
>>>>> + // neither has any overlaps. We need to append new ranges to
>>>>> split up any
>>>>> + // parent ranges these new ranges would overlap when we merge
>>>>> them.
>>>>> + if (ParentIntervalsBeginIdx != ParentIntervalsEndIdx) {
>>>>> + int ParentIntervalIdx = ParentIntervalsBeginIdx;
>>>>> + for (int i = DieIntervalsBeginIdx, e = DieIntervalsEndIdx - 1;
>>>>> i < e;
>>>>> + ++i) {
>>>>> + const uint32_t IntervalStart = AddrMap[i].first;
>>>>> + const uint32_t IntervalEnd = AddrMap[i + 1].first;
>>>>> + const int IntervalDieIdx = AddrMap[i].second;
>>>>> + if (IntervalDieIdx == -1) {
>>>>> + // For empty intervals, nothing is required. This is a bit
>>>>> surprising
>>>>> + // however. If the prior interval overlaps a parent
>>>>> interval and this
>>>>> + // would be necessary to mark the end, we will synthesize a
>>>>> new end
>>>>> + // that switches back to the parent DIE below. And this
>>>>> interval will
>>>>> + // get dropped in favor of one with a DIE attached.
>>>>> However, we'll
>>>>> + // still include this and so worst-case, it will still end
>>>>> the prior
>>>>> + // interval.
>>>>> + continue;
>>>>> + }
>>>>> +
>>>>> + // We are walking the new ranges in order, so search forward
>>>>> from the
>>>>> + // last point for a parent range that might overlap.
>>>>> + auto ParentIntervalsRange =
>>>>> + make_range(AddrMap.begin() + ParentIntervalIdx,
>>>>> + AddrMap.begin() + ParentIntervalsEndIdx);
>>>>> + assert(std::is_sorted(ParentIntervalsRange.begin(),
>>>>> + ParentIntervalsRange.end(),
>>>>> less_first()) &&
>>>>> + "Unsorted parent intervals can't be searched!");
>>>>> + auto PI = std::upper_bound(
>>>>> + ParentIntervalsRange.begin(), ParentIntervalsRange.end(),
>>>>> + IntervalStart,
>>>>> + [](uint32_t LHS, const std::pair<uint32_t, int32_t> &RHS)
>>>>> {
>>>>> + return LHS < RHS.first;
>>>>> + });
>>>>> + if (PI == ParentIntervalsRange.begin() ||
>>>>> + PI == ParentIntervalsRange.end())
>>>>> + continue;
>>>>> +
>>>>> + ParentIntervalIdx = PI - AddrMap.begin();
>>>>> + int32_t &ParentIntervalDieIdx = std::prev(PI)->second;
>>>>> + uint32_t &ParentIntervalStart = std::prev(PI)->first;
>>>>> + const uint32_t ParentIntervalEnd = PI->first;
>>>>> +
>>>>> + // If the new range starts exactly at the position of the
>>>>> parent range,
>>>>> + // we need to adjust the parent range. Note that these
>>>>> collisions can
>>>>> + // only happen with the original parent range because we will
>>>>> merge any
>>>>> + // adjacent ranges in the child.
>>>>> + if (IntervalStart == ParentIntervalStart) {
>>>>> + // If there will be a tail, just shift the start of the
>>>>> parent
>>>>> + // forward. Note that this cannot change the parent
>>>>> ordering.
>>>>> + if (IntervalEnd < ParentIntervalEnd) {
>>>>> + ParentIntervalStart = IntervalEnd;
>>>>> + continue;
>>>>> + }
>>>>> + // Otherwise, mark this as becoming empty so we'll remove
>>>>> it and
>>>>> + // prefer the child range.
>>>>> + ParentIntervalDieIdx = -1;
>>>>> + continue;
>>>>> + }
>>>>> +
>>>>> + // Finally, if the parent interval will need to remain as a
>>>>> prefix to
>>>>> + // this one, insert a new interval to cover any tail.
>>>>> + if (IntervalEnd < ParentIntervalEnd)
>>>>> + AddrMap.push_back({IntervalEnd, ParentIntervalDieIdx});
>>>>> }
>>>>> - AddrDieMap[R.LowPC] = std::make_pair(R.HighPC, Die);
>>>>> }
>>>>> +
>>>>> + // Note that we don't need to re-sort even this DIE's address map
>>>>> intervals
>>>>> + // after this. All of the newly added intervals actually fill in
>>>>> *gaps* in
>>>>> + // this DIE's address map, and we know that children won't need
>>>>> to lookup
>>>>> + // into those gaps.
>>>>> +
>>>>> + // Recurse through its children, giving them the interval map
>>>>> range of this
>>>>> + // DIE to use as their parent intervals.
>>>>> + EnqueueChildDIEs(Die, DieIntervalsBeginIdx, DieIntervalsEndIdx);
>>>>> }
>>>>> - // Parent DIEs are added to the AddrDieMap prior to the Children
>>>>> DIEs to
>>>>> - // simplify the logic to update AddrDieMap. The child's range will
>>>>> always
>>>>> - // be equal or smaller than the parent's range. With this
>>>>> assumption, when
>>>>> - // adding one range into the map, it will at most split a range
>>>>> into 3
>>>>> - // sub-ranges.
>>>>> - for (DWARFDie Child = Die.getFirstChild(); Child; Child =
>>>>> Child.getSibling())
>>>>> - updateAddressDieMap(Child);
>>>>> +
>>>>> + if (AddrMap.empty()) {
>>>>> + AddrMap.push_back({0, -1});
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + // Now that we've added all of the intervals needed, we need to
>>>>> resort and
>>>>> + // unique them. Most notably, this will remove all the empty ranges
>>>>> that had
>>>>> + // a parent range covering, etc. We only expect a single non-empty
>>>>> interval
>>>>> + // at any given start point, so we just use std::sort. This could
>>>>> potentially
>>>>> + // produce non-deterministic maps for invalid DWARF.
>>>>> + std::sort(AddrMap.begin(), AddrMap.end(), SubroutineAddrMapSorter);
>>>>> + AddrMap.erase(
>>>>> + std::unique(AddrMap.begin(), AddrMap.end(),
>>>>> SubroutineAddrMapUniquer),
>>>>> + AddrMap.end());
>>>>> }
>>>>>
>>>>> DWARFDie DWARFUnit::getSubroutineForAddress(uint64_t Address) {
>>>>> extractDIEsIfNeeded(false);
>>>>> - if (AddrDieMap.empty())
>>>>> - updateAddressDieMap(getUnitDIE());
>>>>> - auto R = AddrDieMap.upper_bound(Address);
>>>>> - if (R == AddrDieMap.begin())
>>>>> +
>>>>> + // We use a two-level mapping structure to locate subroutines for a
>>>>> given PC
>>>>> + // address.
>>>>> + //
>>>>> + // First, we map the address to a subprogram. This can be done more
>>>>> cheaply
>>>>> + // because subprograms cannot nest within each other. It also
>>>>> allows us to
>>>>> + // avoid detailed examination of many subprograms, instead only
>>>>> focusing on
>>>>> + // the ones which we end up actively querying.
>>>>> + if (SubprogramDIEAddrMap.empty())
>>>>> + buildSubprogramDIEAddrMap();
>>>>> +
>>>>> + assert(!SubprogramDIEAddrMap.empty() &&
>>>>> + "We must always end up with a non-empty map!");
>>>>> +
>>>>> + auto I = std::upper_bound(
>>>>> + SubprogramDIEAddrMap.begin(), SubprogramDIEAddrMap.end(),
>>>>> Address,
>>>>> + [](uint64_t LHS, const std::pair<uint64_t, int64_t> &RHS) {
>>>>> + return LHS < RHS.first;
>>>>> + });
>>>>> + // If we find the beginning, then the address is before the first
>>>>> subprogram.
>>>>> + if (I == SubprogramDIEAddrMap.begin())
>>>>> return DWARFDie();
>>>>> - // upper_bound's previous item contains Address.
>>>>> - --R;
>>>>> - if (Address >= R->second.first)
>>>>> + // Back up to the interval containing the address and see if it
>>>>> + // has a DIE associated with it.
>>>>> + --I;
>>>>> + if (I->second == -1)
>>>>> return DWARFDie();
>>>>> - return R->second.second;
>>>>> +
>>>>> + auto &SPInfo = SubprogramDIEAddrInfos[I->second];
>>>>> +
>>>>> + // Now that we have the subprogram for this address, we do the
>>>>> second level
>>>>> + // mapping by building a map within a subprogram's PC range to any
>>>>> specific
>>>>> + // inlined subroutine.
>>>>> + if (SPInfo.InlinedSubroutineDIEAddrMap.empty())
>>>>> + buildInlinedSubroutineDIEAddrMap(SPInfo);
>>>>> +
>>>>> + // We lookup within the inlined subroutine using a
>>>>> subprogram-relative
>>>>> + // address.
>>>>> + assert(Address >= SPInfo.SubprogramBasePC &&
>>>>> + "Address isn't above the start of the subprogram!");
>>>>> + uint32_t RelativeAddr = ((Address - SPInfo.SubprogramBasePC) >
>>>>> +
>>>>> (uint64_t)std::numeric_limits<uint32_t>::max())
>>>>> + ? std::numeric_limits<uint32_t>::max()
>>>>> + : (uint32_t)(Address -
>>>>> SPInfo.SubprogramBasePC);
>>>>> +
>>>>> + auto J =
>>>>> + std::upper_bound(SPInfo.InlinedSubroutineDIEAddrMap.begin(),
>>>>> + SPInfo.InlinedSubroutineDIEAddrMap.end(),
>>>>> RelativeAddr,
>>>>> + [](uint32_t LHS, const std::pair<uint32_t,
>>>>> int32_t> &RHS) {
>>>>> + return LHS < RHS.first;
>>>>> + });
>>>>> + // If we find the beginning, the address is before any inlined
>>>>> subroutine so
>>>>> + // return the subprogram DIE.
>>>>> + if (J == SPInfo.InlinedSubroutineDIEAddrMap.begin())
>>>>> + return SPInfo.SubprogramDIE;
>>>>> + // Back up `J` and return the inlined subroutine if we have one or
>>>>> the
>>>>> + // subprogram if we don't.
>>>>> + --J;
>>>>> + return J->second == -1 ? SPInfo.SubprogramDIE
>>>>> + : InlinedSubroutineDIEs[J->second];
>>>>> }
>>>>>
>>>>> void
>>>>>
>>>>>
>>>>>
>>>>