[llvm-dev] DWARF Generator

Robinson, Paul via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 17 17:40:29 PST 2016



> -----Original Message-----
> From: Greg Clayton [mailto:gclayton at apple.com]
> Sent: Thursday, November 17, 2016 5:01 PM
> To: David Blaikie
> Cc: llvm-dev at lists.llvm.org; Robinson, Paul; Eric Christopher; Adrian
> Prantl
> Subject: Re: [llvm-dev] DWARF Generator
> 
> 
> > On Nov 17, 2016, at 3:40 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Thu, Nov 17, 2016 at 3:12 PM Greg Clayton via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I have recently been modifying the DWARF parser and have more patches
> planned and I want to be able to add unit tests that test the internal
> llvm DWARF APIs to ensure they continue to work and also validate the
> changes that I am making. There are not many DWARF unit tests other than
> very simple ones that test DWARF forms currently. I would like to expand
> this to include many more tests.
> >
> > I had submitted a patch that I aborted as it was too large. One of the
> issues with the patch was a stand alone DWARF generator that can turn a
> few API calls into the section data required for the DWARFContextInMemory
> class to be able to load DWARF from. The idea is to generate a small blurb
> of DWARF, parse it using our built in DWARF parser and validate that the
> API calls we do when consuming the DWARF match what we expect. The
> original stand-alone DWARF generator class is in
> unittests/DebugInfo/DWARF/DWARFGenerator2.{h,cpp} in the patch attached.
> The original review suggested that I try to use the AsmPrinter and many of
> its associated classes to generate the DWARF. I attempted to do so and the
> AsmPrinter version is in lib/CodeGen/DwarfGenerator.{h,cpp} in the patch
> attached. This AsmPrinter based code steals code from the DwarfLinker.cpp.
> >
> >
> >
> > I am having trouble getting things to work with the AsmPrinter. I was
> able to get simple DWARF to be emitted with the AsmPrinter version of the
> DWARF generator with code like:
> >
> >
> >    initLLVM();
> >    DwarfGen DG;
> >    Triple Triple("x86_64--");
> >    StringRef Path("/tmp/test.elf");
> >    bool DwarfInitSuccess = DG.init(Triple, Path);
> >    EXPECT_TRUE(DwarfInitSuccess);
> >    uint16_t Version = 4;
> >    uint8_t AddrSize = 8;
> >    DwarfGenCU &CU = DG.appendCompileUnit(Version, AddrSize);
> >    DwarfGenDIE CUDie = CU.getUnitDIE();
> >
> >    CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
> >    CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
> >
> >    DwarfGenDIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
> >    SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
> >    SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
> >    SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
> >
> >    DwarfGenDIE IntDie = CUDie.addChild(DW_TAG_base_type);
> >    IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
> >    IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
> >    IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
> >
> >    DwarfGenDIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
> >    ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
> >    //ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); // Crashes here...
> >
> >    DG.generate();
> >
> >    auto Obj = object::ObjectFile::createObjectFile(Path);
> >    if (Obj) {
> >      DWARFContextInMemory DwarfContext(*Obj.get().getBinary());
> >      uint32_t NumCUs = DwarfContext.getNumCompileUnits();
> >      for (uint32_t i=0; i<NumCUs; ++i) {
> >        DWARFCompileUnit *U = DwarfContext.getCompileUnitAtIndex(i);
> >        if (U)
> >          U->getUnitDIE(false)->dump(llvm::outs(), U, -1u);
> >      }
> >    }
> >
> >
> > But things fall down if I try to uncomment the DW_FORM_ref_addr line
> above. The problem is that AsmPrinter really expects a full stack of stuff
> to be there and expects people to use the DwarfDebug class and all of its
> associated classes. These associated classes really want to use the "DI"
> objects (DICompileUnit, etc) so to create a compile unit we would need to
> create a DICompileUnit object and then make an AsmPrinter/DwarfCompileUnit.
> That stack is pretty heavy and requires the code shown above to create
> many many classes just to represent the simple output we wish to emit.
> Another downside of the AsmPrinter method is we don't know which targets
> people are going to build into their binaries and thus we don't know which
> triples we will be able to use when generating DWARF info. Adrian Prantl
> attempted to help me get things working over here and we kept running into
> roadblocks.
> >
> > It'd be great to have more detail about the roadblocks you hit to better
> understand how bad/what the issues are.
> 
> A few blocks:
> 
> - DIEString doesn't support DW_FORM_string. DW_FORM_string support might
> have been pulled so that we never emit it from clang, but we would want to
> have a unit test that covers being able to read an inlined C string from a
> DIE. Support won't be that hard to add, but we might not want it so that
> people can't use it by accident and make less efficient DWARF.

Seems to me we originally supported only DW_FORM_string, and then at some
point it was tossed in favor of DW_FORM_strp in order to get space savings
from string pooling.  In fact using DW_FORM_string for small strings would
save some more space (admittedly not much) and a bunch of relocations.
(I found data from an old experiment: in a debug build of Clang it saved
~0.7MB out of a total 340MB of debug-info size, and >360K ELF relocations.)

I'd favor an API that passed the string down and let the DIE generator
(as opposed to the DWARF generator) pick the form.
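
A minimal sketch of what such a form-picking API could look like. Every
name here (StringForm, pickStringForm, StrpRefSize) is hypothetical and
not existing LLVM API, and the policy is only an assumption: use the
inline form when the string plus its NUL terminator is no larger than
the 4-byte .debug_str offset that DW_FORM_strp needs in 32-bit DWARF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names throughout; none of this is existing LLVM API.
enum class StringForm {
  Inline, // would be emitted as DW_FORM_string
  Pooled  // would be emitted as DW_FORM_strp
};

// A DW_FORM_strp attribute is a 4-byte offset into .debug_str in 32-bit DWARF.
constexpr std::size_t StrpRefSize = 4;

// Bytes an inline DW_FORM_string occupies in .debug_info:
// the characters plus the terminating NUL.
inline std::size_t inlineSize(const std::string &S) { return S.size() + 1; }

// Assumed policy: go inline only when that is no larger than the strp
// offset itself, so the inline copy never costs extra bytes and also
// avoids a relocation. The caller passes the string and never names a form.
inline StringForm pickStringForm(const std::string &S) {
  return inlineSize(S) <= StrpRefSize ? StringForm::Inline
                                      : StringForm::Pooled;
}
```

With this threshold a name like "int" would go inline, while "main" or
"/tmp/main.c" would stay in the string pool. A unit test that needs to
exercise one particular form would still need some way to force it,
which this sketch does not cover.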

> - Asserts, asserts, asserts. As we tried to emit DWARF, we got asserts
> in bool AsmPrinter::doInitialization(Module &M). On the first line:
> 
>   MMI = getAnalysisIfAvailable<MachineModuleInfo>();
> 
> This asserts if you call it while using the AsmPrinter the way the
> DwarfLinker and the AsmPrinter-based DwarfGen do. You must call it to
> get the DwarfDebug. If you get past this by installing a Pass, then we
> assert at:
> 
>   GCModuleInfo *MI = getAnalysisIfAvailable<GCModuleInfo>();
>   assert(MI && "AsmPrinter didn't require GCModuleInfo?");
> 
> If we don't have this, we don't get a DwarfDebug.
> 
> >
> > Even if we end up adding another set of code to generate DWARF (which
> I'd really like to avoid) we'd want to, at some point, coalesce them back
> together. Given the goal is to try to coalesce the DWARF parsing code in
> LLDB and LLVM, it'd seem unfortunate if that effort just created another
> similar (or larger) amount of work for DWARF generation.
> 
> This DWARF generator could just live in the unittests/DebugInfo/DWARF
> directory so it wouldn't pollute anything in LLVM if we do choose to use
> it.
> >
> > I wanted to pass this patch along in case someone wants to take a look
> at how we can possibly fix the lib/CodeGen/DwarfGenerator.cpp and
> lib/CodeGen/DwarfGenerator.h. The code that sets up all the required
> classes for the AsmPrinter method is in the DwarfGen class from
> lib/CodeGen/DwarfGenerator.cpp in the following function:
> >
> > bool DwarfGen::init(Triple TheTriple, StringRef OutputFilename);
> >
> > The code in this function was looted from existing DwarfLinker.cpp code.
> This function requires a valid triple and that triple is used to create a
> lot of the classes required to make the AsmPrinter. I am not sure if any
> other code uses the AsmPrinter like this besides the DwarfLinker.cpp code
> and that code uses its own magic to actually link the DWARF. It does reuse
> some of the functions as I did, but the DwarfLinker doesn't use any of the
> DwarfDebug, DwarfCompileUnit or any of the classes that the
> compiler/assembler uses when making DWARF.
> >
> > What's the DwarfLinker code missing that you need? If that code is
> generating essentially arbitrary DWARF, what's blocking using the same
> technique for generating DWARF for parsing tests?
> 
> They don't use any of the DwarfDebug, DwarfCompileUnit classes. They also
> don't use any of the DI classes when making up the debug info. So both the
> DWARF linker and the generator have similar needs: make DWARF that isn't
> tied too closely to the clang internal classes and DI classes.
> >
> > The amount of work required for refactoring the AsmPrinter exceeds the
> time I am going to have, but I would still like to have DWARF API testing
> in the unit tests.
> >
> > So my question is if anyone would have objections to using the
> stand-alone DWARF generator in unittests/DebugInfo/DWARF until we can later get
> the YAML tools to be able to produce DWARF and we can switch to testing
> the DWARF data that way? Chris Bieneman has expressed interest in getting
> a DWARF/YAML layer going.
> >
> > Those tools would still want to use pretty similar (conceptually)
> abstractions to LLVM's codegen and llvm-dsymutil. I'd still strongly
> prefer to generalize/keep common APIs here - or better understand why it's
> not practical now (& what it will take/how we make sure we have a plan and
> resources to get there eventually).
> >
> > My reasoning is:
> > - I want to be able to test DWARF APIs we have to ensure they work
> correctly as there are no Dwarf API tests right now. I will be adding code
> that changes many things in the DWARF parser and it will be essential to
> verify that there are no regressions in the DWARF APIs.
> > - Not sure which targets would be built into LLVM so it might be hard to
> write tests that cover 32/64 bit addresses and all the variants if we have
> to do things legally via AsmPrinter and valid targets
> >
> > Seems like it might be plausible to refactor out whatever features of
> the AsmPrinter these APIs require (so we just harvest that data out of
> AsmPrinter and pass it down in a struct, say - so that other users can
> pass their own struct without needing an AsmPrinter). Though, again,
> interested to know how dsymutil is working in these situations.
> 
> I can try that method if indeed the only places that use the DwarfDebug
> are the DW_FORM_ref_addr and location lists. I'll let you know how that
> goes.
> >
> > - Not enough time to modify AsmPrinter to not require the full DebugInfo
> stack and the classes that it uses (llvm::DwarfCompileUnit which must use
> llvm::DICompileUnit, llvm::DIE class which uses many local classes that
> all depend on the full DwarfDebug stack).
> >
> > Will you have time at some later date to come back and revisit this?
> It's understandable that we may choose to incur short term technical debt
> with an understanding that it will be paid off in some timely manner. It'd
> be less desirable if there's no such plan/possibility and we incur a
> fairly clear case of technical debt (redundant DWARF generation libraries
> - especially when this effort is to remove a redundant DWARF parser).
> 
> Not sure anyone else will need to generate DWARF manually. The two clients
> currently are the DWARF unittests and the DwarfLinker. The DwarfLinker
> worked around these issues. If the AsmPrinter wasn't such an integral part
> of the entire compiler stack, I could take a stab at refactoring it, but I
> don't believe I am the right person to do this at this point as I have no
> experience or knowledge of the various ways that this class is used, or
> how it interacts with other support classes (DwarfDebug, and many many
> other classes).
> 
> Things that still worry me:
> - not being able to generate DWARF for 32/64 if targets are missing

You mean DWARF-32 and DWARF-64 formats?  LLVM doesn't do DWARF-64.
If you mean 64-bit target-machine addresses, I guess I don't understand
the problem.  If you have target-dependent tests, then they only work
when the right targets are there.  This is extremely common and I'm 
not clear why it would be a problem for the DWARF tests.
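
To illustrate the usual pattern, here is a self-contained sketch of a
test that is skipped rather than failed when its target is missing.
BuiltTargets, TestResult and runIfTargetBuilt are hypothetical names
made up for this example; a real LLVM unit test would presumably ask
the target registry (e.g. TargetRegistry::lookupTarget) instead of
consulting a set of strings.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for "the targets compiled into this build".
using BuiltTargets = std::set<std::string>;

enum class TestResult { Passed, Failed, Skipped };

// Runs Body only when RequiredArch is available. A missing target yields
// Skipped, not Failed, so builds with fewer targets still pass the suite.
template <typename Fn>
TestResult runIfTargetBuilt(const BuiltTargets &Built,
                            const std::string &RequiredArch, Fn Body) {
  if (Built.count(RequiredArch) == 0)
    return TestResult::Skipped;
  return Body() ? TestResult::Passed : TestResult::Failed;
}
```

A test for 8-byte addresses would name a 64-bit architecture as its
requirement and one for 4-byte addresses a 32-bit architecture; each
runs only where its target was built.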

> - DIEString not supporting DW_FORM_string. I can add support, but I don't
> know if we want it as if we add it people might start using it.

See above. If the API picked the form this would not be a concern.

> - hacking around asserts by constructing classes and copying code from
> places that properly use the AsmPrinter the way it is supposed to be used
> so that we can use it in a way that it wasn't designed to be used.
> 
> >
> > I made a large effort to try and get things working with the AsmPrinter,
> so I wanted everyone to know that I tried to get that solution working.
> Let me know what anyone thinks.
> >
> > Greg Clayton

--paulr

> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


