[PATCH] D32821: Add DWARF verifiers to verify address ranges are correct and scoped correctly.

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun May 14 11:23:00 PDT 2017


dblaikie added a comment.

Looking pretty good - thanks for all your patience!

A few places this could be further tidied up/streamlined - and if you can do anything to pare down the unit tests based on equivalence classes/partitions, that'd be great, it looks like more tests/cases than I'd expect, given the code under test.



================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:48
+    }
+    bool operator<(const DWARFRange &RHS) const {
+      if (Start == RHS.Start)
----------------
Usually any operator overload that can be a non-member should be (it can be a friend, though in this instance even that's not necessary)


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:49-51
+      if (Start == RHS.Start)
+        return End < RHS.End;
+      return Start < RHS.Start;
----------------
One quick way to implement this is:

  return std::tie(Start, End) < std::tie(RHS.Start, RHS.End);


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:83
+  struct NonOverlappingRanges {
+    std::set<DieRangeInfo> RangeSet;
+
----------------
Does llvm's IntervalMap make sense here? Keeping an IntervalMap from the address range to the DIE that occupies that range seems good & less code (no need to write a custom op<, etc) I think... probably.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:37-51
+      bool RangeOverlaps = false;
+      if (InsertPos != End) {
+        if (InsertPos->intersects(R))
+          RangeOverlaps = true;
+        else if (InsertPos != Begin) {
+          auto Iter = InsertPos - 1;
+          if (Iter->intersects(R))
----------------
Wouldn't the presence of overlapping ranges in the Ranges list mess with the correctness of other algorithms (like intersection) - you might then have to search arbitrarily far back in the range list from the lower_bound because an earlier range might be longer and extend past the current one?

This seems unreliable and may be better to error on an invalid range and then not process it further? (not produce follow-on errors for intersections, etc)? (also may be better in general rather than producing a bunch of related errors about an invalid range - errors about it overlapping with many things, being outside the parent raneg, etc)

Not sure.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:62-72
+  size_t I = 0;
+  while (I < UnsortedRanges.size()) {
+    if (UnsortedRanges[I].first >= UnsortedRanges[I].second) {
+      if (UnsortedRanges[I].first > UnsortedRanges[I].second)
+        HasErrors = true;
+      // Remove empty or invalid ranges.
+      UnsortedRanges.erase(UnsortedRanges.begin() + I);
----------------
Seems like a job for erase/remove, maybe:

  UnsortedRanges.erase(llvm:::remove_if(UnsortedRanges, [&](const std::pair<uint64_t, uint64_t> &R) { if (R.first < R.second) return false; if (R.first > R.second) HasErrors = true; return false; } , UnsortedRanges.end());

(appropriately indented, etc)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:124-134
+  for (const auto &R : RHS.Ranges) {
+    auto Iter = std::lower_bound(SearchBegin, End, R);
+    if (Iter != End && Iter->intersects(R))
+      return true;
+    if (Iter != Begin) {
+      auto Prev = Iter - 1;
+      if (Prev->intersects(R))
----------------
This and contains still seems more expensive (O(N log(N))) than I'd expect - it should be doable without too much complexity, linearly - by walking both sorted lists at the same time.

Ah, I see, by moving the starting point forward it's a smaller search space - but still not as efficient, I think, as searching linearly from the start of that range, so maybe replacing lower_bound with std::find with a comparator of some kind?

Similar idea for 'contains' above.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:139-153
+  if (RHS.Ranges.empty())
+    return false;
+  if (Ranges.empty())
+    return true;
+  size_t LeftSize = Ranges.size();
+  size_t RightSize = RHS.Ranges.size();
+  for (size_t I = 0; I < LeftSize; ++I) {
----------------
Can probably implement this with:

  return std::tie(Ranges, Die.getOffset()) < std::tie(RHS.Ranges, RHS.Die.getOffset());

Also - does this need to support < comparison of two entities with exactly the same ranges but a different DIE? When/where does that come up/what behavior is desirable there?


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:158-165
+  if (!contains(RI)) {
+    OS << Error;
+    Die.dump(OS, 0);
+    RI.Die.dump(OS, 0);
+    OS << "\n";
+    return true;
+  }
----------------
Probably invert this to reduce indentation:

  if (contains(RI))
    return false;
  OS << Error;
  ...


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:171-173
+  if (DieRI.extractRangesAndReportErrors(OS)) {
+    HasErrors = true;
+  }
----------------
Drop braces on single line ifs, and probably change this code to:

  bool HasErrors = DieRI.extractRangesAndReportErrors(OS);


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:178-180
+  // Die has address ranges so we need to check that its address ranges are
+  // contained in this DieInfo's address ranges except if this is a compile
+  // unit that doesn't have address range(s).
----------------
Why is the CU a special case here? (a subprogram can have ranges in a CU with no ranges, but a lexical scope can't have ranges in a subprogram without ranges - is that the condition? I'm not sure why that's the case - could you explain it?)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:181-188
+  bool IsCompileUnit = RI.Die.getTag() == DW_TAG_compile_unit;
+  bool CheckContains = !IsCompileUnit || !RI.Ranges.empty();
+
+  const char *Error = IsCompileUnit
+                          ? "error: CU DIE has child with address ranges that "
+                            "are not contained in its ranges:\n"
+                          : "error: DIE has a child DIE whose address ranges "
----------------
Why is there a special case for CU DIEs here? I'd expect the same error message and behavior regardless of whether it's a CU or not. The DIEs should be dumped and that'll make it clear to the user what kind of DIE it is without needing a different error message.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:202
+    const DieRangeInfo &RI) const {
+  if (!RangeSet.empty()) {
+    auto Iter = RangeSet.lower_bound(RI);
----------------
Invert to reduce indentation:

  if (RangeSet.empty())
    return nullptr;


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:203-210
+    auto Iter = RangeSet.lower_bound(RI);
+    if (Iter != RangeSet.end()) {
+      if (Iter->intersects(RI))
+        return &*Iter;
+    }
+    if (Iter != RangeSet.begin()) {
+      --Iter;
----------------
Might be worth a couple of comments explaining the conditions these two cases cover.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:204-205
+    auto Iter = RangeSet.lower_bound(RI);
+    if (Iter != RangeSet.end()) {
+      if (Iter->intersects(RI))
+        return &*Iter;
----------------
Could collapse these together:

  if (Iter != RangeSet.end() && Iter->intersects(RI))


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:208-210
+    if (Iter != RangeSet.begin()) {
+      --Iter;
+      if (Iter->intersects(RI))
----------------
Could collapse these two together:

 if (Iter != RangeSet.begin() && (--Iter)->intersects(RI))

Though maybe that doesn't help readability - not sure.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:220-223
+  auto OverlappingRI = GetOverlappingRangeInfo(RI);
+  if (OverlappingRI) {
+    OS << "error: DIEs have overlapping address ranges:\n";
+    OverlappingRI->Die.dump(OS, 0);
----------------
Probably makes sense for "GetOverlappingRangeInfo" to return just the DWARFDie (assuming it has a null/empty state that's boolean testable) rather than something more complicated - maybe simpler name too "GetOverlappingDIE" I guess.

& you can collapse the declaration:

  if (DWARFDie Die = GetOverlappingDie(RI)) {
    ...


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:228
+  }
+  RangeSet.insert(RI);
+  return Error;
----------------
If you use an IntervalMap, might be worth seeing if you could do an insert+retrieve previous value (but that's probably too niche - since an insert into the IntervalMap could cover multiple previous values). This would avoid GetOverlappingRangeInfo doing similar/the same work as 'insert'.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:241-245
+    // Set the Unit DIE info.
+    UnitDI.RI.Die = Die;
+    if (UnitDI.RI.extractRangesAndReportErrors(OS))
+      ++NumDebugInfoErrors;
+    break;
----------------
Where does the cross-CU checking happen (ensuring no two CUs ranges overlap). I'd expect it to happen here, but don't see it?


================
Comment at: unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:2502
+
+  ASSERT_FALSE(Ranges.contains(DWARFVerifier::DieRangeInfo({{0x0f, 0x10}})));
+  ASSERT_FALSE(Ranges.contains(DWARFVerifier::DieRangeInfo({{0x20, 0x30}})));
----------------
/might/ be able to simplify these lines by omitting the "DWARFVerifier::DIERangeInfo(" ")" part (& just leaving the {} init) - you might need to add an extra set of {} around it instead, I'm not sure.


https://reviews.llvm.org/D32821





More information about the llvm-commits mailing list