[PATCH] D136395: Add the ability to verify the .debug_aranges section.

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 1 10:21:26 PDT 2022


dblaikie added a comment.

In D136395#3897465 <https://reviews.llvm.org/D136395#3897465>, @ayermolo wrote:

> OK, Got to the bottom of this.
> To refresh memory this is for these two types of verify errors:
> error: .debug_aranges[0x00000000] compile unit range [0x0000000000000600 - 0x000000000000063f) not in .debug_aranges.
> error: .debug_aranges[0x00000000] compile unit @ 0x00000000 line table sequence [0-5) has row[0] with address 0x0000000000000600 that is not in .debug_aranges.
>
> The binary was built with -g1 and debug info profiling is not enabled.
> When machine function is processed things diverge here:
> https://github.com/llvm/llvm-project/blob/a3a9fffea1bfd14ea509007cbf4f9fdb4d602c7c/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L2223
>
> For ones that have Abstract Scopes it falls through and creates a DIE and an aranges entry. For others it just adds it to the ranges in the CU.
>
> So should it always add to aranges?

Possibly - further discussion on the fix over on the review.

In D136395#3898080 <https://reviews.llvm.org/D136395#3898080>, @ayermolo wrote:

> What about this fix?
> https://reviews.llvm.org/D137139

Maybe - discussion over there.



================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp:1078-1088
+      // Now make sure that all CU ranges are in the .debug_aranges set.
+      for (const auto &Range: SortedCURanges) {
+        if (!SortedAranges.contains(Range)) {
+          errorHeader(error());
+          OS << " compile unit range ["
+              << format_hex(Range.start(), AddrPrintSize)
+              << " - " << format_hex(Range.end(), AddrPrintSize)
----------------
dblaikie wrote:
> clayborg wrote:
> > dblaikie wrote:
> > > Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent? (or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)
> > > Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent? 
> > 
> > I would personally prefer to see all ranges that are missing to indicate how many times we have issues in the .debug_aranges. If we just flag the first inconsistency then we only know that there was at least one error. The reason I like seeing all of these issues is when things fail in a debugger, like "I was looking up address 0x123456 in this binary and it failed", it is nice to see an error in the "llvm-dwarfdump --verify" output that lists the range as having an error so we know this can be the reason it failed. Similar issues with anyone doing symbolication would exist as well.
> > 
> > > (or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)
> > 
> > I am thinking of solutions for the global variable problem. One idea is to try and extract all ranges for any data sections (read only and read + write), and then if we don't find a .debug_aranges range in the DW_AT_ranges, then we check if that address is in the DataRanges and if so, don't produce an error? Do we want to warn?
> > 
> > I am not opposed to seeing the full inconsistency though as if something is in .debug_aranges, it should really also be in DW_AT_ranges IMHO. Do you agree? If we already have something that differs where global vars are in .debug_aranges but not in DW_AT_ranges, this should be fixed somehow if we want to get rid of .debug_aranges.
> > 
> > 
> > I would personally prefer to see all ranges that are missing to indicate how many times we have issues in the .debug_aranges. 
> 
> Fair enough - could probably still be achieved in a single scan if both lists are sorted first? Do it like a merge deduplication - walk each list so long as it's less than the next entry in the other list - skip over any entries that are earlier than the other list, then report all those as missing once you find the next matching entry?
> 
> > I am not opposed to seeing the full inconsistency though as if something is in .debug_aranges, it should really also be in DW_AT_ranges IMHO. Do you agree?
> 
> Nah, I don't think that's something to warn or error on - the spec seems pretty clear that aranges should include data and code addresses and high_pc/low_pc/ranges should only include code addresses.
> 
> So, yeah, if you want to implement this on-spec, you'd need to search through all the DIEs looking for code addresses (which is non-trivial, since you have to go look in global variable exprlocs and go scrumping through the operations to pick the address operations out of them - and determining the length you want to use to verify against aranges would be non-trivial)
> 
> It's complexity like this that is why I'd rather not do any of this work - flag aranges as deprecated and move on.
(oh, and in something like -g1, you wouldn't find any global variable DIEs - not sure if they currently appear in aranges, but if they do, then there's no way to say "there's something in aranges that's missing from the DIEs" because they could all be global variables - but it's probably OK if we wanted to fix that by removing global variable aranges entries in -g1 since there's no info/DIEs about them anyway)
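
A minimal sketch of the single-scan idea discussed above, for illustration only. It assumes both lists are sorted by start address, non-overlapping within each list, and that adjacent aranges entries have already been coalesced. `Range` and `findMissingCURanges` are hypothetical names, not part of DWARFVerifier or LLVM's address range classes. Extra aranges entries (for example, ones describing global variables) are skipped rather than reported, and every uncovered CU range is collected, not just the first:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [Start, End); a stand-in for
// illustration, not LLVM's own range type.
struct Range {
  uint64_t Start;
  uint64_t End;
};

// Returns every CU range that is not fully contained in one aranges entry.
// Assumption: both inputs are sorted by Start, non-overlapping within each
// list, non-empty ranges only, and adjacent aranges entries are coalesced.
std::vector<Range> findMissingCURanges(const std::vector<Range> &CURanges,
                                       const std::vector<Range> &Aranges) {
  std::vector<Range> Missing;
  std::size_t A = 0; // Cursor into Aranges; it only ever moves forward.
  for (const Range &CU : CURanges) {
    // Skip aranges entries that end at or before this CU range starts.
    // These may be extras (e.g. data addresses) and are not reported here.
    while (A < Aranges.size() && Aranges[A].End <= CU.Start)
      ++A;
    const bool Covered = A < Aranges.size() &&
                         Aranges[A].Start <= CU.Start &&
                         CU.End <= Aranges[A].End;
    if (!Covered)
      Missing.push_back(CU);
  }
  return Missing;
}
```

Each CU range is visited once and the aranges cursor never moves backwards, so after sorting the comparison is linear in the combined length of the two lists, while still listing all missing ranges.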


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136395/new/

https://reviews.llvm.org/D136395


