[PATCH] D107554: [DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent.

Fri Aug 6 16:14:33 PDT 2021

dblaikie added a comment.

In D107554#2931226 <https://reviews.llvm.org/D107554#2931226>, @avl wrote:

>> Ah *nod* wouldn't be covered here. Perhaps a more general fix could be made to the way DieRangeInfo does its work? Specifically DieRangeInfo::intersects(DieRangeInfo) - perhaps it could favor whichever has the smaller list? (currently it iterates all the LHS and checks them against the RHS - but usually, especially for this use in the verifier, the LHS will be much larger (containing the aggregate of all the ranges seen so far) than the RHS - I'd hazard that even just always flipping it (always iterating RHS to check for presence in LHS) would be faster, but to be general/symmetric, perhaps checking which is smaller and iterating that would be best). That would let the empty case fall out naturally, and hopefully make the several zero-length ranges (like linker-gc'd functions) better too, hopefully?
>
> I think I did not get the idea. DieRangeInfo::intersects always compares ranges pair by pair. It looks like no sense to change iterating from RHS to LHS:
>
>   bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &RHS) const {
>     auto I1 = Ranges.begin(), E1 = Ranges.end();
>     auto I2 = RHS.Ranges.begin(), E2 = RHS.Ranges.end();
>     while (I1 != E1 && I2 != E2) {
>       if (I1->intersects(*I2))
>         return true;
>       if (I1->LowPC < I2->LowPC)
>         ++I1;
>       else
>         ++I2;
>     }
>     return false;
>   }

Hmm - this is a new proposal ^ (it's not how the code's currently written)? Not sure I follow what this suggestion would provide.

> Also, it looks like there is no problem with ranges of zero length. The problem is when there are no ranges at all.
> In that case, we do not need to add DieRangeInfo RI into the Children list, to prevent redundant checks and navigation of a lot of empty structures:
>
>   DWARFVerifier::DieRangeInfo::die_range_info_iterator
>   DWARFVerifier::DieRangeInfo::insert(const DieRangeInfo &RI) {
>     auto End = Children.end();
>     auto Iter = Children.begin();
>     while (Iter != End) {
>       if (Iter->intersects(RI)) <<<<<<<<<<<<<<<<<<<<< redundant check if *Iter does not have a range
>         return Iter;
>       ++Iter;   <<<<<<<<<<<<<<<<<<<< redundant navigation  if *Iter does not have a range
>     }
>     Children.insert(RI);
>     return Children.end();
>   }
>
> Probably, I misunderstood your suggestion.

Sorry, yeah, I'm still confused. By "RHS' and "LHS" I didn't mean the start/end of each pair, but the two DieRangeInfos - LHS == this, RHS == the 'RHS' parameter to the function.

I mean having insert look something like this:

  bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &OriginalRHS) const {
    bool Swap = Ranges.size() > RHS.Ranges.size();
    auto &LHS = Swap ? OriginalRHS : *this;
    auto &RHS = Swap ? *this : OriginalRHS;
    auto End = LHS.end();
    auto Iter = LHS.begin();
    while (Iter != End) {
      bool Intersects;
      iterator IntersectingIterator;
      std::tie(Intersects, IntersectingIterator) = Iter->intersects(RHS); // imagine a DieRangeInfo::intersects that returns both the bool, but also an iterator about where the intersection was
      if (Intersects)
        return Swap ? IntersectingIterator : Iter;
      ++Iter;
    }
    Children.insert(OriginalRHS);
    return Children.end();
  }

This way, not only does the empty case perform better (if either LHS or RHS is empty, the algorithm will be quick - because the while loop would never run) - but also non-empty cases would be more efficient, I think, maybe?

Though honestly - maybe a broader change in direction would be better: What if the DieRangeInfos were sorted (I guess they are, since they're in a std::set?)? Then it wouldn't be an exhaustive search, but use find to `find` the relevant part of the list to look at instead. Hmm, maybe sorted in a contiguous data structure (like std::vector) and then binary searching the flat structure - and also then being able to binary search only the remainder of the list when looking for the next entry? Not sure, though.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107554/new/

https://reviews.llvm.org/D107554