[llvm-dev] DWARF Generator

Eric Christopher via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 18 10:28:03 PST 2016


On Fri, Nov 18, 2016 at 10:23 AM David Blaikie <dblaikie at gmail.com> wrote:

> On Fri, Nov 18, 2016 at 10:18 AM Eric Christopher <echristo at gmail.com>
> wrote:
>
> On Fri, Nov 18, 2016 at 8:43 AM Greg Clayton <gclayton at apple.com> wrote:
>
>
> > On Nov 17, 2016, at 5:40 PM, Robinson, Paul <paul.robinson at sony.com>
> > wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: Greg Clayton [mailto:gclayton at apple.com]
> >> Sent: Thursday, November 17, 2016 5:01 PM
> >> To: David Blaikie
> >> Cc: llvm-dev at lists.llvm.org; Robinson, Paul; Eric Christopher; Adrian
> >> Prantl
> >> Subject: Re: [llvm-dev] DWARF Generator
> >>
> >>
> >>> On Nov 17, 2016, at 3:40 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Thu, Nov 17, 2016 at 3:12 PM Greg Clayton via llvm-dev
> >>> <llvm-dev at lists.llvm.org> wrote:
> >>> I have recently been modifying the DWARF parser and have more patches
> >>> planned and I want to be able to add unit tests that test the internal
> >>> llvm DWARF APIs to ensure they continue to work and also validate the
> >>> changes that I am making. There are not many DWARF unit tests other than
> >>> very simple ones that test DWARF forms currently. I would like to expand
> >>> this to include many more tests.
> >>>
> >>> I had submitted a patch that I aborted as it was too large. One of the
> >>> issues with the patch was a standalone DWARF generator that can turn a
> >>> few API calls into the section data that the DWARFContextInMemory class
> >>> needs in order to load DWARF. The idea is to generate a small blurb of
> >>> DWARF, parse it using our built-in DWARF parser and validate that the
> >>> API calls we do when consuming the DWARF match what we expect. The
> >>> original standalone DWARF generator class is in
> >>> unittests/DebugInfo/DWARF/DWARFGenerator2.{h,cpp} in the patch attached.
> >>> The original review suggested that I try to use the AsmPrinter and many
> >>> of its associated classes to generate the DWARF. I attempted to do so
> >>> and the AsmPrinter version is in lib/CodeGen/DwarfGenerator.{h,cpp} in
> >>> the patch attached. This AsmPrinter-based code steals code from
> >>> DwarfLinker.cpp.
> >>>
> >>>
> >>>
> >>> I am having trouble getting things to work with the AsmPrinter. I was
> >>> able to get simple DWARF to be emitted with the AsmPrinter version of
> >>> the DWARF generator with code like:
> >>>
> >>>
> >>>   initLLVM();
> >>>   DwarfGen DG;
> >>>   Triple Triple("x86_64--");
> >>>   StringRef Path("/tmp/test.elf");
> >>>   bool DwarfInitSuccess = DG.init(Triple, Path);
> >>>   EXPECT_TRUE(DwarfInitSuccess);
> >>>   uint16_t Version = 4;
> >>>   uint8_t AddrSize = 8;
> >>>   DwarfGenCU &CU = DG.appendCompileUnit(Version, AddrSize);
> >>>   DwarfGenDIE CUDie = CU.getUnitDIE();
> >>>
> >>>   CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
> >>>   CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
> >>>
> >>>   DwarfGenDIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
> >>>   SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
> >>>   SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
> >>>   SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
> >>>
> >>>   DwarfGenDIE IntDie = CUDie.addChild(DW_TAG_base_type);
> >>>   IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
> >>>   IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
> >>>   IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
> >>>
> >>>   DwarfGenDIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
> >>>   ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
> >>>   //ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); // Crashes here...
> >>>
> >>>   DG.generate();
> >>>
> >>>   auto Obj = object::ObjectFile::createObjectFile(Path);
> >>>   if (Obj) {
> >>>     DWARFContextInMemory DwarfContext(*Obj.get().getBinary());
> >>>     uint32_t NumCUs = DwarfContext.getNumCompileUnits();
> >>>     for (uint32_t i=0; i<NumCUs; ++i) {
> >>>       DWARFCompileUnit *U = DwarfContext.getCompileUnitAtIndex(i);
> >>>       if (U)
> >>>         U->getUnitDIE(false)->dump(llvm::outs(), U, -1u);
> >>>     }
> >>>   }
> >>>
> >>>
> >>> But things fall down if I try to uncomment the DW_FORM_ref_addr line
> >>> above. The problem is that AsmPrinter really expects a full stack of
> >>> stuff to be there and expects people to use the DwarfDebug class and all
> >>> of its associated classes. These associated classes really want to use
> >>> the "DI" objects (DICompileUnit, etc.), so to create a compile unit we
> >>> would need to create a DICompileUnit object and then make an
> >>> AsmPrinter/DwarfCompileUnit. That stack is pretty heavy and requires the
> >>> code shown above to create many, many classes just to represent the
> >>> simple output we wish to emit. Another downside of the AsmPrinter method
> >>> is we don't know which targets people are going to build into their
> >>> binaries and thus we don't know which triples we will be able to use
> >>> when generating DWARF info. Adrian Prantl attempted to help me get
> >>> things working over here and we kept running into roadblocks.
> >>>
> >>> It'd be great to have more detail about the roadblocks you hit to
> >>> better understand how bad/what the issues are.
> >>
> >> A few blocks:
> >>
> >> - DIEString doesn't support DW_FORM_string. DW_FORM_string support might
> >> have been pulled so that we never emit it from clang, but we would want
> >> to have a unit test that covers being able to read an inlined C string
> >> from a DIE. Support won't be that hard to add, but we might not want it
> >> so that people can't use it by accident and make less efficient DWARF.
> >
> > Seems to me we originally supported only DW_FORM_string, and then at some
> > point it was tossed in favor of DW_FORM_strp in order to get space
> > savings from string pooling.  In fact using DW_FORM_string for small
> > strings would save some more space (admittedly not much) and a bunch of
> > relocations. (I found data from an old experiment, in a debug build of
> > Clang it saved ~0.7MB out of a total 340MB of debug-info size, and >360K
> > ELF relocations.)
>
> This is true, but it also adversely affects DWARF parsing speed as you
> will need to manually skip each C string when parsing the DIEs.
>
>
> Sorry, thread got derailed a bit - the main point was that Greg wants
> support for generating DW_FORM_string to test the parsing support for it
> (for use in the debugger with debug info generated by other sources that do
> use that form), even if LLVM never generates it.
>
> Greg was hesitant to add support for generating it to the existing LLVM
> DWARF code (lib/CodeGen/AsmPrinter) because of the risk that someone might
> make the mistake of writing code in LLVM that would generate that form in
> its object output. I'm pretty comfortable (& have mentioned above in the
> thread) that the risk is quite low, given the way the code works there: we
> have one common API for generating string forms that picks between strp
> and str_index, for example, so I don't think _string would sneak in by
> accident. If it were added, it'd be done so explicitly/intentionally in
> that code with whatever tradeoffs appropriately considered.
>
> I think it's reasonable to add the support in the common APIs so we can
> use it to generate sample DWARF for testing the DWARF parsing APIs.
>
>

Agreed :)

-eric
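
For reference, the parsing-cost point quoted above (having to skip each
inline C string by hand) can be sketched in a few lines. This is only an
illustration and not code from the patch: skipStringAttr is a hypothetical
helper, the form values are the ones from the DWARF specification, and the
32-bit DWARF format is assumed so that a DW_FORM_strp value is 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Form codes as assigned by the DWARF specification.
constexpr uint16_t DW_FORM_string = 0x08; // inline NUL-terminated string
constexpr uint16_t DW_FORM_strp = 0x0e;   // offset into .debug_str

// Hypothetical helper: returns the offset just past one string attribute
// value that starts at Offset. Assumes the 32-bit DWARF format.
size_t skipStringAttr(const std::vector<uint8_t> &Data, size_t Offset,
                      uint16_t Form) {
  if (Form == DW_FORM_strp) {
    // Fixed size: the parser can step over it without reading the bytes.
    assert(Offset + 4 <= Data.size());
    return Offset + 4;
  }
  assert(Form == DW_FORM_string);
  // Variable size: every byte must be inspected to find the terminator.
  while (Offset < Data.size() && Data[Offset] != 0)
    ++Offset;
  assert(Offset < Data.size() && "unterminated inline string");
  return Offset + 1; // step past the NUL
}
```

With strp the skip is constant time per attribute; with an inline string
it is proportional to the string length, which is the cost being weighed
against the size and relocation savings.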


>
> >
> > I'd favor an API that passed the string down and let the DIE generator
> > (as opposed to the DWARF generator) pick the form.
>
> I have currently added a DIEInlinedString class that can be used for
> DW_FORM_string attributes.
>
> >
> >> - Asserts, asserts, asserts. As we tried to emit DWARF, we hit an assert
> >> in bool AsmPrinter::doInitialization(Module &M), on the first line:
> >>
> >>  MMI = getAnalysisIfAvailable<MachineModuleInfo>();
> >>
> >> This asserts if you call it while using the AsmPrinter the way the
> >> DwarfLinker and the AsmPrinter-based DwarfGen do. You must call it to
> >> create the DwarfDebug. If you get past this by installing a Pass then we
> >> assert at:
> >>
> >>  GCModuleInfo *MI = getAnalysisIfAvailable<GCModuleInfo>();
> >>  assert(MI && "AsmPrinter didn't require GCModuleInfo?");
> >>
> >> If we don't have this, we don't get a DwarfDebug.
> >>
> >>>
> >>> Even if we end up adding another set of code to generate DWARF (which
> >>> I'd really like to avoid) we'd want to, at some point, coalesce them
> >>> back together. Given the goal is to try to coalesce the DWARF parsing
> >>> code in LLDB and LLVM, it'd seem unfortunate if that effort just created
> >>> another similar (or larger) amount of work for DWARF generation.
> >>
> >> This DWARF generator could just live in the unittests/DebugInfo/DWARF
> >> directory so it wouldn't pollute anything in LLVM if we do choose to use
> >> it.
> >>>
> >>> I wanted to pass this patch along in case someone wants to take a look
> >>> at how we can possibly fix the lib/CodeGen/DwarfGenerator.cpp and
> >>> lib/CodeGen/DwarfGenerator.h. The code that sets up all the required
> >>> classes for the AsmPrinter method is in the DwarfGen class from
> >>> lib/CodeGen/DwarfGenerator.cpp in the following function:
> >>>
> >>> bool DwarfGen::init(Triple TheTriple, StringRef OutputFilename);
> >>>
> >>> The code in this function was looted from existing DwarfLinker.cpp
> >>> code. This function requires a valid triple and that triple is used to
> >>> create a lot of the classes required to make the AsmPrinter. I am not
> >>> sure if any other code uses the AsmPrinter like this besides the
> >>> DwarfLinker.cpp code, and that code uses its own magic to actually link
> >>> the DWARF. It does reuse some of the functions as I did, but the
> >>> DwarfLinker doesn't use any of the DwarfDebug, DwarfCompileUnit or any
> >>> of the classes that the compiler/assembler uses when making DWARF.
> >>>
> >>> What's the DwarfLinker code missing that you need? If that code is
> >>> generating essentially arbitrary DWARF, what's blocking using the same
> >>> technique for generating DWARF for parsing tests?
> >>
> >> They don't use any of the DwarfDebug, DwarfCompileUnit classes. They
> >> also don't use any of the DI classes when making up the debug info. So
> >> both the DWARF linker and the generator have similar needs: make DWARF
> >> that isn't tied too closely to the clang internal classes and DI classes.
> >>>
> >>> The amount of work required for refactoring the AsmPrinter exceeds the
> >>> time I am going to have, but I would still like to have DWARF API
> >>> testing in the unit tests.
> >>>
> >>> So my question is whether anyone would object to using the standalone
> >>> DWARF generator in unittests/DebugInfo/DWARF until we can later get the
> >>> YAML tools to be able to produce DWARF and we can switch to testing the
> >>> DWARF data that way. Chris Bieneman has expressed interest in getting a
> >>> DWARF/YAML layer going.
> >>>
> >>> Those tools would still want to use pretty similar (conceptually)
> >>> abstractions to LLVM's codegen and llvm-dsymutil. I'd still strongly
> >>> prefer to generalize/keep common APIs here - or better understand why
> >>> it's not practical now (& what it will take/how we make sure we have a
> >>> plan and resources to get there eventually).
> >>>
> >>> My reasoning is:
> >>> - I want to be able to test DWARF APIs we have to ensure they work
> >>> correctly as there are no DWARF API tests right now. I will be adding
> >>> code that changes many things in the DWARF parser and it will be
> >>> essential to verify that there are no regressions in the DWARF APIs.
> >>> - Not sure which targets would be built into LLVM so it might be hard
> >>> to write tests that cover 32/64 bit addresses and all the variants if we
> >>> have to do things legally via AsmPrinter and valid targets
> >>>
> >>> Seems like it might be plausible to refactor out whatever features of
> >>> the AsmPrinter these APIs require (so we just harvest that data out of
> >>> AsmPrinter and pass it down in a struct, say - so that other users can
> >>> pass their own struct without needing an AsmPrinter). Though, again,
> >>> interested to know how dsymutil is working in these situations.
> >>
> >> I can try that method if indeed the only places that use the DwarfDebug
> >> are the DW_FORM_ref_addr and location lists. I'll let you know how that
> >> goes.
> >>>
> >>> - Not enough time to modify AsmPrinter to not require the full
> >>> DebugInfo stack and the classes that it uses (llvm::DwarfCompileUnit
> >>> which must use llvm::DICompileUnit, llvm::DIE class which uses many
> >>> local classes that all depend on the full DwarfDebug stack).
> >>>
> >>> Will you have time at some later date to come back and revisit this?
> >>> It's understandable that we may choose to incur short term technical
> >>> debt with an understanding that it will be paid off in some timely
> >>> manner. It'd be less desirable if there's no such plan/possibility and
> >>> we incur a fairly clear case of technical debt (redundant DWARF
> >>> generation libraries - especially when this effort is to remove a
> >>> redundant DWARF parser).
> >>
> >> Not sure anyone else will need to generate DWARF manually. The two
> >> clients currently are the DWARF unittests and the DwarfLinker. The
> >> DwarfLinker worked around these issues. If the AsmPrinter wasn't such an
> >> integral part of the entire compiler stack, I could take a stab at
> >> refactoring it, but I don't believe I am the right person to do this at
> >> this point as I have no experience or knowledge of the various ways that
> >> this class is used, or how it interacts with other support classes
> >> (DwarfDebug, and many, many other classes).
> >>
> >> Things that still worry me:
> >> - not being able to generate DWARF for 32/64 if targets are missing
> >
> > You mean DWARF-32 and DWARF-64 formats?  LLVM doesn't do DWARF-64.
> > If you mean 64-bit target-machine addresses, I guess I don't understand
> > the problem.  If you have target-dependent tests, then they only work
> > when the right targets are there.  This is extremely common and I'm
> > not clear why it would be a problem for the DWARF tests.
>
> I wasn't aware that there were target-dependent tests. Do you know of one
> in the unittest directory you can point me to? I did mean 32-bit address
> targets versus 64-bit address targets. I am not sure how I can test 4 and 8
> byte addresses reliably. What triple do I use in the unittest? I can't
> assume x86_64 as we may have been built on a 32-bit ARM system with only
> the 32-bit ARM targets.
> >
> >> - DIEString not supporting DW_FORM_string. I can add support, but I
> >> don't know if we want it, because if we add it people might start using
> >> it.
> >
> > See above. If the API picked the form this would not be a concern.
>
> For DWARF parsing speed I still like the DW_FORM_strp.
>
>
> FWIW I'm with Greg here. I don't find that the "inlined small strings"
> optimization is really worth it for size, but could be convinced to add it.
>
> -eric
>
>
>
> >
> >> - hacking around asserts by constructing classes and copying code from
> >> places that properly use the AsmPrinter the way it is supposed to be
> >> used so that we can use it in a way that it wasn't designed to be used.
> >>
> >>>
> >>> I made a large effort to try and get things working with the
> >>> AsmPrinter, so I wanted everyone to know that I tried to get that
> >>> solution working. Let me know what anyone thinks.
> >>>
> >>> Greg Clayton
> >
> > --paulr
> >
> >>>
>
>
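
On the question quoted above of how to test 4 and 8 byte addresses when
the built-in targets are unknown: the address size is a field of the unit
header, so a standalone generator can pick it without any triple. The
sketch below is an assumption-laden illustration, not the DwarfGen code
from the patch. It emits a DWARF v2-v4 style header in the 32-bit format,
little-endian, followed by one raw address standing in for a DW_FORM_addr
value (not a well-formed DIE), and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Size bytes of Value in little-endian order.
void appendLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned Size) {
  for (unsigned I = 0; I < Size; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Read Size little-endian bytes starting at Offset.
uint64_t readLE(const std::vector<uint8_t> &In, size_t Offset, unsigned Size) {
  assert(Offset + Size <= In.size());
  uint64_t Value = 0;
  for (unsigned I = 0; I < Size; ++I)
    Value |= static_cast<uint64_t>(In[Offset + I]) << (8 * I);
  return Value;
}

struct UnitHeader {
  uint32_t Length;       // unit_length, excludes the length field itself
  uint16_t Version;
  uint32_t AbbrevOffset; // debug_abbrev_offset
  uint8_t AddrSize;      // address_size
};

constexpr size_t HeaderSize = 4 + 2 + 4 + 1;

// Emit the header plus one address of AddrSize bytes. No target is needed.
std::vector<uint8_t> makeUnit(uint16_t Version, uint8_t AddrSize,
                              uint64_t Addr) {
  assert(AddrSize == 4 || AddrSize == 8);
  std::vector<uint8_t> Out;
  appendLE(Out, 2 + 4 + 1 + AddrSize, 4);
  appendLE(Out, Version, 2);
  appendLE(Out, 0, 4);
  appendLE(Out, AddrSize, 1);
  appendLE(Out, Addr, AddrSize);
  return Out;
}

UnitHeader parseHeader(const std::vector<uint8_t> &In) {
  UnitHeader H;
  H.Length = static_cast<uint32_t>(readLE(In, 0, 4));
  H.Version = static_cast<uint16_t>(readLE(In, 4, 2));
  H.AbbrevOffset = static_cast<uint32_t>(readLE(In, 6, 4));
  H.AddrSize = static_cast<uint8_t>(readLE(In, 10, 1));
  return H;
}

// The width of the address is taken from the header, not from a triple.
uint64_t parseAddr(const std::vector<uint8_t> &In) {
  return readLE(In, HeaderSize, parseHeader(In).AddrSize);
}
```

A unit test can then round-trip both widths on any host, for example
makeUnit(4, 4, 0x1000) and makeUnit(4, 8, 0x100000000), and compare what
parseAddr returns with the value that was written.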