[PATCH] D123538: [symbolizer] Parse DW_TAG_variable DIs to show line info for globals

Thu Apr 21 14:33:24 PDT 2022

hctim marked 11 inline comments as done.
hctim added inline comments.

================
Comment at: llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h:248
+  std::unordered_map<uint64_t, DWARFDie> VariableDieMap;
+  std::unordered_set<uint64_t> RootsParsedForVariables;
+
----------------
dblaikie wrote:
> hctim wrote:
> > tschuett wrote:
> > > Are you sure about your map and set types?
> > > https://llvm.org/docs/ProgrammersManual.html#other-set-like-container-options
> > thanks, changed to set/map given that bionic's `libc.so` has 10.5k DW_TAG_variables, so DenseSet/DenseMap doesn't seem like the right structure either.
> I don't believe the number of elements is an argument against the dense data structures - could you help explain/understand the connection there?
Changed, thanks for clarifying.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:744
+void DWARFUnit::updateVariableDieMap(DWARFDie Die) {
+  for (DWARFDie Child = Die.getFirstChild(); Child; Child = Child.getSibling())
+    updateVariableDieMap(Child);
----------------
dblaikie wrote:
> maybe DWARFDie supports range-based for loop now?
> 
Neat!
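
For reference, a minimal sketch of the recursive walk written with a range-based for loop. `Die`, `Tag`, and `collectVariables` are hypothetical stand-ins made up for illustration; this is not the real `DWARFDie` API or the code in this patch, just the shape of the traversal under the assumption that a DIE exposes its children through `begin()`/`end()`.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a DWARF DIE; not the real llvm::DWARFDie.
enum class Tag { CompileUnit, Subprogram, Variable };

struct Die {
  Tag tag;
  uint64_t address; // Only meaningful for Tag::Variable in this sketch.
  std::vector<Die> children;

  // Exposing begin()/end() is what lets `for (const Die &C : D)` compile.
  auto begin() const { return children.begin(); }
  auto end() const { return children.end(); }
};

// Visits every DIE under Root and records each variable DIE by address.
void collectVariables(const Die &Root,
                      std::unordered_map<uint64_t, const Die *> &Out) {
  for (const Die &Child : Root)
    collectVariables(Child, Out);
  if (Root.tag == Tag::Variable)
    Out.emplace(Root.address, &Root);
}
```

The recursion descends into subprograms too, so function-static variables end up in the same map as file-scope globals.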

================
Comment at: llvm/test/tools/llvm-symbolizer/data-location.s:16
+################################################################################
+## File below was generated using:
+##   $ clang -g -fuse-ld=lld -shared /tmp/file.c -o out.so && obj2yaml out.so
----------------
dblaikie wrote:
> jhenderson wrote:
> > hctim wrote:
> > > dblaikie wrote:
> > > > hctim wrote:
> > > > > jhenderson wrote:
> > > > > > One of the purposes of the cross-project-tests project was to make it easier to test llvm-symbolizer without having to resort to canned binaries or canned YAML blobs like below. By writing the test there, you can use clang and lld directly and consequently start from a much more understandable .c file.
> > > > > > 
> > > > > > That being said, I wonder if you'd be better off hand-crafting this YAML. It doesn't seem likely to be that complicated to write some .debug_info that has the property you want using yaml2obj's DWARF support, if I follow the required behaviour correctly. There are already some good examples of similar bits. The advantage with using yaml like that is that you can keep it tightly focused on what is important, omitting the various components of the object that are unrelated to the test (e.g. .eh_frame).
> > > > > I had a quick look at cross-project-tests, and didn't find any examples where llvm-symbolizer is currently being tested there, and I think it's out of scope for this patch.
> > > > > 
> > > > > The right way to test this is to invoke clang and run llvm-symbolizer on the output.
> > > > > 
> > > > > My problem with hand-crafting is that it's:
> > > > > 
> > > > >  a) Very time consuming. Looking into it further, I'd have to read a lot more about the ELF spec for .debug_abbrev, .debug_info, and .debug_line to understand what headers are needed.
> > > > > 
> > > > >  b) Hard to change. If someone introduces some new feature, then they have to go through the same "understand nuance of ELF" to just modify my test so that it passes. It's much easier for someone to go "hey, just need to recompile this example file and make sure the new golden file passes the existing test".
> > > > (re: cross-project-tests: They must not be used as a replacement for individual project testing - but for added feature/cross-project coverage if there's important interactions between components. But an LLVM patch should still have an LLVM test, a Clang patch with a Clang test, etc - and optionally some additional cross-project feature testing)
> > > > 
> > > > re: hand crafting: Yeah, I'm not a super fan of doing that, until/unless yaml2obj can really simplify writing DWARF down to not needing to write abbreviations, multiple sections, attribute forms, etc.
> > > > 
> > > > Though is there any reason this test needs a function in it? & any reason it needs two integers?
> > > I updated the test a little and added some more of what I was hoping for.
> > > 
> > > One global in data, one in bss, one as a function-static global and a string (this patch now depends on D123534).
> > > (re: cross-project-tests: They must not be used as a replacement for individual project testing - but for added feature/cross-project coverage if there's important interactions between components. But an LLVM patch should still have an LLVM test, a Clang patch with a Clang test, etc - and optionally some additional cross-project feature testing)
> > 
> > That is one reason I created this testsuite, but the other was to let tests use tools such as clang or lld to generate their inputs at runtime. In the original thread, I even discussed llvm-symbolizer as a concrete example (using LLD in that particular case, in the same manner as one might use llvm-mc for generating test input at test time, because canned test inputs are not easily maintainable). There are no examples of this yet, though.
> (this might be shorter/simpler to read as assembly, rather than yaml?)

Originally that's what I thought of doing, but `llvm-symbolizer` doesn't play nicely with object files because each symbol's address is a section offset. So `f` is `.text+0x0` and `data_global` is `.data+0x0`, both at address zero, and llvm-symbolizer only reports one of them.
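
To make the collision concrete, here is a toy sketch. It assumes a lookup table keyed only by address and is not how llvm-symbolizer is actually implemented. `Symbol`, `indexByAddress`, and `indexBySectionAndOffset` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol record. In a relocatable object the value is an
// offset into the symbol's section, not a final virtual address.
struct Symbol {
  std::string Name;
  std::string Section;
  uint64_t Offset;
};

// Keyed by address alone: symbols with equal offsets in different sections
// collide, and emplace keeps whichever entry was inserted first.
std::map<uint64_t, std::string>
indexByAddress(const std::vector<Symbol> &Syms) {
  std::map<uint64_t, std::string> Index;
  for (const Symbol &S : Syms)
    Index.emplace(S.Offset, S.Name);
  return Index;
}

using SectionOffset = std::pair<std::string, uint64_t>;

// Keyed by (section, offset): the two symbols stay distinct.
std::map<SectionOffset, std::string>
indexBySectionAndOffset(const std::vector<Symbol> &Syms) {
  std::map<SectionOffset, std::string> Index;
  for (const Symbol &S : Syms)
    Index.emplace(SectionOffset(S.Section, S.Offset), S.Name);
  return Index;
}
```

With `f` at `.text+0x0` and `data_global` at `.data+0x0`, the address-only index holds one entry while the section-qualified index holds two. A linked shared object avoids the problem because every symbol gets a distinct virtual address.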

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123538/new/

https://reviews.llvm.org/D123538