[PATCH] D123538: [symbolizer] Parse DW_TAG_variable DIs to show line info for globals

Mitch Phillips via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 18 14:45:39 PDT 2022


hctim added inline comments.


================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:769-783
+    DataExtractor Data(Location.Expr, /*IsLittleEndian=*/true, 8);
+    uint64_t DataOffset = 0;
+    uint8_t Operation = Data.getU8(&DataOffset);
+    if (Operation == dwarf::DW_OP_addr) {
+      uint64_t Pointer = Data.getAddress(&DataOffset);
+      VariableDieMap[Pointer] = Die;
+      return;
----------------
dblaikie wrote:
> hctim wrote:
> > dblaikie wrote:
> > > Are there any examples of global merge where this might not be correct?
> > > 
> > > Like if a global was described as "addrx, 5, add" (eg: 5 beyond this address) then only looking at the first operation might mis-conclude that the variable is at this address when it's at 5 bytes past this address - and some other variable is at this address)
> > > 
> > > LLVM has a "global merge" optimization that might cause something like this. Let's see if I can create an example.
> > > 
> > > Ah, yeah, here we go:
> > > ```
> > > static int x;
> > > static int y;
> > > int *f(bool b) { return b ? &x : &y; }
> > > ```
> > > ```
> > > $ clang++-tot -O3 merge.cpp -c -g -target aarch64-linux-gnuabi -mllvm -global-merge-ignore-single-use && llvm-dwarfdump-tot merge.o | grep DW_AT_location
> > >                 DW_AT_location  (DW_OP_addrx 0x0)
> > >                 DW_AT_location  (DW_OP_addrx 0x0, DW_OP_plus_uconst 0x4)
> > > ```
> > > 
> > > (the `-global-merge-ignore-single-use` is to simplify the test case - without that it could still be tickled with a more complicated example - seems we only enable global merge on ARM 32 and 64 bit due to the higher register pressure there, by the sounds of it)
> > > 
> > > I'm guessing this patch would overwrite the VariableDieMap entry for the first global with the second one so queries for the first address would result in the second DIE and queries for the second address wouldn't give any result?
> > > 
> > > Hmm, also - how does this handle queries that aren't at exactly the starting address of the variable? Presumably the `DW_AT_type` of the `DW_TAG_global_variable` would have to be inspected to find the size of the variable starting at the address, and any query in that range should be successful?
> > I've added some handling for the `DW_OP_plus_uconst`. Unfortunately I didn't find any generic existing handling for the expression evaluation (which surprised me, maybe I just suck at looking), so I'm just making the assumption that global variable addresses aren't described using `DW_OP_plus` or anything else...
> > 
> > Fixed it up to now provide line info when you provide an address halfway through a symbol. Good thing there wasn't a big impact in D123534 about keeping the string sizes around :). Worst case (if the type info sucks) we just only accept the precise start addresses of the symbols.
> Probably worth maybe /only/ matching these expressions, then, rather than currently it's matching any expression that starts with addr, or addr+plus? Even if that's followed by any other garbage that would be ignored (& so the expression would be misinterpreted)
> 
> There's probably no expression evaluation logic in LLVM's libDebugInfoDWARF - there's no need to evaluate expressions when symbolizing, and dwarfdump just wants to dump it, not compute the resulting value. (some similar code exists in lldb that would do evaluation).
Sure, done.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123538/new/

https://reviews.llvm.org/D123538



More information about the cfe-commits mailing list