[PATCH] D32821: Add DWARF verifiers to verify address ranges are correct and scoped correctly.

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 12 11:05:33 PDT 2017


dblaikie added a comment.

I can work up a prototype/example of what I'm suggesting, if you like - but here are some comments to give the basic idea. It still seems to me like there are more parts to this than necessary.



================
Comment at: include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h:25-40
+typedef std::pair<uint64_t, uint64_t> DWARFRange;
+
+/// Returns true if [LHS.first, LHS.second) intersects with
+/// [RHS.first, RHS.second).
+inline bool DWARFRangesIntersect(const DWARFRange &LHS, const DWARFRange &RHS) {
+  if (LHS.first == LHS.second || RHS.first == RHS.second)
+    return false; // Empty ranges can't intersect.
----------------
Might be tidier to make this a struct with a couple of functions:

  struct DWARFRange {
    uint64_t Start;
    uint64_t End;
    bool intersects(const DWARFRange&);
    bool contains(const DWARFRange&);
  }

That way the function names don't have to repeat the "DWARFRange" prefix.


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h:43
 /// DWARFAddressRangesVector - represents a set of absolute address ranges.
-typedef std::vector<std::pair<uint64_t, uint64_t>> DWARFAddressRangesVector;
+typedef std::vector<DWARFRange> DWARFAddressRangesVector;
 
----------------
I don't think this typedef adds a lot now - the original name is shorter & more descriptive (since it makes it clear it's just a plain old std::vector, not anything fancier)


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h:46-63
+inline bool AnyDWARFRangesIntersect(const DWARFAddressRangesVector &LHS,
+                                    const DWARFAddressRangesVector &RHS) {
+  for (const auto &R : LHS)
+    for (const auto &L : RHS)
+      if (DWARFRangesIntersect(R, L))
+        return true;
+  return false;
----------------
Surprised to see these O(N^2) algorithms still here - if the ranges are always sorted first, the pairwise-style algorithm you implemented for overlap within a single range should be usable on these cases I would think. A bit trickier to walk the two lists together, but do-able.

I wonder if LLVM's IntervalMap would be usable here (maps an integer range to something - in this case to a DWARFDie). If it provides an error message or a failable insert - you could walk each element of a range, adding them to the IntervalMap - if they ever collide, error out. Otherwise you have a compact/efficient representation of the ranges you can quickly look up/insert into... 


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:30-46
+  class SortedRanges {
+    DWARFAddressRangesVector Ranges;
+
+  public:
+    bool contains(const SortedRanges &RHS) const;
+    bool doesIntersect(const SortedRanges &RHS) const;
+    void dump(raw_ostream &OS) const;
----------------
Does this class need to exist? I'm not sure it adds much of an improved abstraction boundary compared to being an implementation detail of DieRangeInfo?


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:48-74
+  struct DieRangeInfo {
+    DWARFDie Die;
+    SortedRanges Ranges;
+
+    DieRangeInfo() : Die(), Ranges() {}
+    DieRangeInfo(DWARFDie D, const DWARFAddressRangesVector &R);
+    /// Return true if this object contains all ranges within RHS.
----------------
I'm surprised there's a need for both DieRangeInfo and NonOverlappingRanges - it seems like DieRangeInfo should/could do all the non-overlapping testing too, no?


================
Comment at: include/llvm/DebugInfo/DWARF/DWARFVerifier.h:52
+
+    DieRangeInfo() : Die(), Ranges() {}
+    DieRangeInfo(DWARFDie D, const DWARFAddressRangesVector &R);
----------------
I think the "Die(), Ranges()" is redundant/unnecessary & probably best to write this as:

  DieRangeInfo() = default;


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:75-103
+bool DWARFVerifier::SortedRanges::insert(
+    const DWARFAddressRangesVector &UnsortedRanges) {
+  bool Error = false;
+  for (const auto &R : UnsortedRanges) {
+    if (R.first >= R.second)
+      continue;
+    auto Begin = Ranges.begin();
----------------
Ah, here's the other version of "intersects" - it's an "insert and test intersection". I'd expect to only need this version - not the other one that does the O(N^2) set of comparisons. Is that practical? (to only have the one version)

I also wouldn't be terribly sad if ranges were sorted first - that way they'd be quick to compare with the parent range and with the sibling ranges (O(N Log(N)) to sort, then two O(N) walks - one for the parent and one for the siblings. Rather than two O(N log(N)) walks - I mean it's still O(N log(N)) in the end... I don't mind much either way)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:114-117
+  if (Ranges.insert(UnsortedRanges)) {
+    OS << "error: ranges in DIE overlap:\n";
+    HasErrors = true;
+  }
----------------
This diagnostic won't include both the DIEs that overlap, which seems difficult for the user. (test case should test this - that both the DIEs are dumped for cases where an error is about two DIEs)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:201-217
+    if (RI.setDieAndReportRangeErrors(OS, Die))
+      ++NumDebugInfoErrors;
+    if (RI.Ranges.empty())
+      break;
+
+    // If the compile unit has address range(s), it must contain DIE's ranges.
+    const char *Error = "error: CU DIE has child with address ranges that are "
----------------
Why are there 3 different things a subprogram is being added to/processed by/errored by?

I'd have expected only one - the outer/current CU. It'd also need to initialize the current SP (which would have to be a stack of some kind, I guess (like the patch I provided last week), to handle the nested SP definition case)  for lexical scopes to compare against/be added to.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:207-208
+    // If the compile unit has address range(s), it must contain DIE's ranges.
+    const char *Error = "error: CU DIE has child with address ranges that are "
+                        "not contained in its ranges:\n";
+    if (!UnitRI.Ranges.empty() &&
----------------
I don't think the CU DIE needs a special case for the diagnostic, does it? I would've expected the diagnostics to be in the RangeInfo & not need to be customized per caller.


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:209-212
+    if (!UnitRI.Ranges.empty() &&
+        UnitRI.reportErrorIfNotContained(OS, RI, Error)) {
+      ++NumDebugInfoErrors;
+    }
----------------
The non-empty check I would guess isn't needed here ("reportErrorIfNotContained" seems like it'd be trivially true if the range is empty - it couldn't possibly be not contained in anything)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:214-215
+
+    // Check if this function's range overlaps with any previous function ranges
+    // from any compile unit and make an error if needed.
+    if (AllFunctionRanges.insertAndReportErrors(OS, RI))
----------------
Why would all functions need to be compared? I would expect that functions would be compared to their enclosing CU - if all CUs are not overlapping and all functions are contained within their CU's range then no function could be overlapping (in the same way that if all lexical scopes are contained within their parent, and their parents are non-overlapping (eg: different, non-overlapping functions) then such a pair of lexical scopes couldn't overlap)


================
Comment at: lib/DebugInfo/DWARF/DWARFVerifier.cpp:222-238
+    if (RI.setDieAndReportRangeErrors(OS, Die)) {
+      ++NumDebugInfoErrors;
+      Die.dump(OS, 0);
+      OS << "\n";
+    }
+    if (RI.Ranges.empty())
+      break;
----------------
Also here, I wouldn't expect 3 different cases (RI, ParentRI, and NOR) - I'd expect one, maybe two but hopefully abstracted into a single operation I think - "to the current relevant scope add this DIE - which will retrieve its range, check it for local validity, check it doesn't overlap with any siblings in the current scope and is contained in the parent".


https://reviews.llvm.org/D32821





More information about the llvm-commits mailing list