[PATCH] D136395: Add the ability to verify the .debug_aranges section.

Greg Clayton via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 24 17:06:59 PDT 2022


clayborg added inline comments.


================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp:991
+      continue;
+    if (DWARFRange.LowPC < DWARFRange.HighPC)
+      Ranges.insert(AddressRange(DWARFRange.LowPC, DWARFRange.HighPC));
----------------
dblaikie wrote:
> This is to account for overflow?
And to catch ranges that have been zeroed out or set to the same address to effectively dead strip the function. So this is mostly the catch LowPC == HighPC, but it can catch overflow issues as well.


================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp:1043-1046
+      // Make sure all .debug_aranges descriptor ranges are found in the
+      // compile unit's ranges.
+      uint32_t DescIdx = 0;
+      for (const auto &Descriptor: ArangeSet.descriptors()) {
----------------
dblaikie wrote:
> Pretty sure this won't be true for clang/llvm's debug info - LLVM includes global variable addresses in aranges, but not in CU ranges. We've had various discussions about whether this is correct/useful, and so far people seem to think it is correct and maybe useful? I don't really know. GCC doesn't put global variable addresses in either aranges or cu ranges.
> 
> (I'm pretty sure it's technically correct, but maybe not all that useful - at least if GCC doesn't do it (pretty sure it's generally correct that CU ranges shouldn't include global variable addresses, at least - so just a question of what should/shouldn't be in aranges))
> Pretty sure this won't be true for clang/llvm's debug info - LLVM includes global variable addresses in aranges, but not in CU ranges. We've had various discussions about whether this is correct/useful, and so far people seem to think it is correct and maybe useful? I don't really know. GCC doesn't put global variable addresses in either aranges or cu ranges.

Interesting. We should be including it all address ranges for functions and globals in both .debug_aranges and the DW_AT_ranges for the compile unit, or not for either. Otherwise we are making a case for .debug_aranges to continue to exist. 

> (I'm pretty sure it's technically correct, but maybe not all that useful - at least if GCC doesn't do it (pretty sure it's generally correct that CU ranges shouldn't include global variable addresses, at least - so just a question of what should/shouldn't be in aranges))

I would vote to include globals in both, but with the hopes to move to DW_AT_ranges and getting rid of .debug_aranges. Since it hasn't been clarified, maybe we can leave this in for now to see if we can detect this issue with GCC? No one is using llvm-dwarfdump's verify result as something that stops a compilation or build that I am aware of, so it would be good to see this inconsistency IMHO.


================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp:1078-1088
+      // Now make sure that all CU ranges are in the .debug_aranges set.
+      for (const auto &Range: SortedCURanges) {
+        if (!SortedAranges.contains(Range)) {
+          errorHeader(error());
+          OS << " compile unit range ["
+              << format_hex(Range.start(), AddrPrintSize)
+              << " - " << format_hex(Range.end(), AddrPrintSize)
----------------
dblaikie wrote:
> Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent? (or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)
> Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent? 

I would personally prefer to see all ranges that are missing to indicate how many times we have issues in the .debug_aranges. If we just flag the first inconsistency then we only know that there was at least one error. The reason I like seeing all of these issues is when thinks fail in a debugger, like "I was looking up address 0x123456 in this binary and it failed", it is nice to see an error in the "llvm-dwarfdump --verify" output that lists the range as having an error so we know this can be the reason it failed. Similar issues with anyone doing symbolication would exist as well.

> (or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)

I am thinking of solutions for the global variable problem. One idea is to try and extract all ranges for any data sections (read only and read + write), and then if we don't find a .debug_aranges range in the DW_AT_ranges, then we check if that address is in the DataRanges and if so, don't produce an error? Do we want to warn?

I am not opposed to seeing the full inconsistency though as if something is in .debug_aranges, it should really also been DW_AT_ranges IMHO. Do you agree? If we already have something that differs where global vars are in .debug_aranges but not in DW_AT_ranges, this should be fixed somehow if we want to get rid of .debug_aranges.




================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp:1093-1130
+      const DWARFDebugLine::LineTable *LT = DCtx.getLineTableForUnit(CU);
+      if (LT) {
+        for (const auto &Sequence: LT->Sequences) {
+          // If the first address in a sequence is invalid, don't verify it as
+          // it should have been dead stripped.
+          const DWARFDebugLine::Row &Row = LT->Rows[Sequence.FirstRowIndex];
+          if (DWARFAddressIsInvalid(Row.Address.Address, SetHdr.AddrSize))
----------------
dblaikie wrote:
> Do we already have something like this for CU ranges? Could we reuse it?
We don't currently. It would be easy to move this elsewhere so we can use it when checking .debug_line sections where that code would verify against either the DW_TAG_compile_unit's DW_AT_ranges, or in this case maybe it would only check against the .debug_aranges ranges. The hard part there is determining which addresses have the first address in each sequence being invalid as most dead stripping will just zeroing out the start address of a function. So we end up with tons of overlapping address ranges at zero all the time and this info really shouldn't be checked.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136395/new/

https://reviews.llvm.org/D136395



More information about the llvm-commits mailing list