[PATCH] D82367: [ObjectYAML][ELF] Add support for emitting the .debug_gnu_pubnames/pubtypes sections.

Sun Jul 12 09:35:14 PDT 2020

dblaikie added inline comments.

================
Comment at: llvm/test/tools/yaml2obj/ELF/DWARF/debug-gnu-pubnames.yaml:8-9
+# RUN: yaml2obj --docnum=1 -DENDIAN=ELFDATA2LSB %s -o %t1.le.o
+# RUN: llvm-readobj --sections --section-data %t1.le.o | \
+# RUN:   FileCheck -DSIZE=32 -DADDRALIGN=1 %s --check-prefixes=SHDR,DWARF32-LE
+
----------------
Higuoxing wrote:
> dblaikie wrote:
> > jhenderson wrote:
> > > dblaikie wrote:
> > > > jhenderson wrote:
> > > > > dblaikie wrote:
> > > > > > Higuoxing wrote:
> > > > > > > jhenderson wrote:
> > > > > > > > dblaikie wrote:
> > > > > > > > > Higuoxing wrote:
> > > > > > > > > > dblaikie wrote:
> > > > > > > > > > > Should this be tested via llvm-dwarfdump instead? (perhaps there's already lots of precedent/reasons that yaml2obj is being tested via readobj?)
> > > > > > > > > > Because some tests in llvm-dwarfdump are using yaml2obj to generate DWARF sections, e.g., llvm-dwarfdump/X86/verify_overlapping_cu_ranges.yaml, llvm-dwarfdump/X86/Inputs/i386_macho_with_debug.yaml, etc. We don't want to create a circular dependency. Does it make sense?
> > > > > > > > > Hmm, fair enough. Not sure what the right call is there - I would've thought assembly would be easier to read than hex object dumps? Case in point with these hex dumps and multiline ASCII art comments, compared to assembly with comments & appropriate-width values, symbolic expressions, etc.
> > > > > > > > > 
> > > > > > > > > (so using assembly tests for llvm-dwarfdump and then llvm-dwarfdump for tests of obj2yaml, rather than obj2yaml tests of llvm-dwarfdump and objdump tests of obj2yaml)
> > > > > > > > (just in case you missed it, this is a yaml2obj test). The intent longer term with @Higuoxing's project is to get yaml2obj DWARF support to a good enough state that it makes it much easier to craft tests for llvm-dwarfdump etc without needing to specify all the fine details that assembly currently requires (just consider how much assembly some of the existing llvm-dwarfdump tests require for example). Assembly would probably still work well for creating broken inputs, but yaml2obj would be better for the higher-level testing.
> > > > > > > > 
> > > > > > > > The problem of course with using yaml2obj to test llvm-dwarfdump is that we can't use the reverse. Somewhere, we have to test either hex output or use assembly (or YAML + raw content hex) input. Whilst I agree assembly input would be easier to read than this hex output, it rather defeats the point of the project, and it doesn't scale well (in theory, the testing here can be kept fairly small, so the costs of having hex aren't too great).
> > > > > > > > 
> > > > > > > > Once we have basic testing in place for all the DWARF sections, it should be possible to use llvm-dwarfdump to verify the higher level auto-generation of things by yaml2obj that is intended for later in the project.
> > > > > > > Oops, I missed @dblaikie 's previous comments. Thank you @jhenderson for clarifying this for me!
> > > > > > > Whilst I agree assembly input would be easier to read than this hex output, it rather defeats the point of the project, and it doesn't scale well (in theory, the testing here can be kept fairly small, so the costs of having hex aren't too great).
> > > > > > 
> > > > > > Not sure - why is it likely that the yaml2obj+hexdump tests scale better than the assembly+llvm-dwarfdump tests directly? Seems like we'd have to test maybe as many weird cases of DWARF emission to get a nice legible format for writing dwarfdump tests as we would for the dwarfdump tests themselves? It's starting to feel a bit "turtles all the way down" to me.
> > > > > > 
> > > > > > Something like yaml2obj could be handy for testing lldb, for instance - constructing arbitrarily interesting inputs. But for the yaml2obj<>llvm-dwarfdump circularity, I'm not so sure.
> > > > > By "scale" I meant the auto-generation aspects probably don't need to be tested using hex dumps, so can be tested using llvm-dwarfdump, but honestly I'm not sure either way too.
> > > > 
> > > > What do you mean by "auto-generation aspects"?
> > > > 
> > > > But, yeah, I'm not holding this patch up over this direction that's already got precedent, etc - but raising the question at least for consideration/thinking about over time.
> > > At the moment, to use yaml2obj to generate DWARF, you have to specify pretty much every detail of the DWARF, including the details of the abbrev table and the string table for example. Ideally, we should be able to describe the DWARF in a higher level manner (e.g. by just specifying the attributes and values in the .debug_info description, letting yaml2obj do all the legwork of selecting a form, populating the abbrev and string tables etc). You'll see details of this in @Higuoxing's mailing list posts about his GSoC project.
> > > 
> > > We can use the basic-level testing for "bootstrapping". yaml2obj can generate valid raw sections, tested via hex -> allows testing of llvm-dwarfdump section dumping -> allows testing of yaml2obj higher-level functionality (because we know that llvm-dwarfdump section dumping now works).
> > That seems like it's going to be fairly subtle/hard to maintain the separation here - if some yaml2obj tests use hex dumping but others can use llvm-dwarfdump - if/when/that's happening, might be worth separate directories for the two kinds of tests and some fairly specific documentation about how to determine which tests go where.
> What do you think of making elf2yaml support dumping DWARF sections? In the future, we can use raw assembly to test elf2yaml and use elf2yaml to test yaml2elf.
Probably useful that elf2yaml and yaml2elf roundtrip/support the same features (would make it easier to create yaml files to work with/pare down, etc).

But as for testing - not sure - seems like it adds another layer of indirection (then we'd use raw assembly+llvm-mc to test elf2yaml, to test yaml2elf, to test llvm-dwarfdump - when we could've been using raw assembly to test llvm-dwarfdump) & not sure how much it improves/streamlines the testing matrix.

All that said, we used to test llvm-dwarfdump with checked-in object files - then we accepted that assembly + llvm-mc didn't especially reduce the test quality despite increasing the surface area of the test by using llvm-mc. Though I think the more DWARF-specific the functionality gets, the less that line of reasoning applies (i.e., once we're generating all of DWARF, we're reaching the same complexity as the parsing logic and have now written a whole other DWARF representation with all the risk of bugs, etc).

But really - I don't have any particular action/takeaway from these thoughts right now, but I think they're worth keeping in mind/thinking about as this work continues. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82367/new/

https://reviews.llvm.org/D82367