[llvm-dev] DWARF Generator

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 18 10:23:28 PST 2016


On Fri, Nov 18, 2016 at 10:18 AM Eric Christopher <echristo at gmail.com>
wrote:

> On Fri, Nov 18, 2016 at 8:43 AM Greg Clayton <gclayton at apple.com> wrote:
>
>
> > On Nov 17, 2016, at 5:40 PM, Robinson, Paul <paul.robinson at sony.com>
> wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: Greg Clayton [mailto:gclayton at apple.com]
> >> Sent: Thursday, November 17, 2016 5:01 PM
> >> To: David Blaikie
> >> Cc: llvm-dev at lists.llvm.org; Robinson, Paul; Eric Christopher; Adrian
> >> Prantl
> >> Subject: Re: [llvm-dev] DWARF Generator
> >>
> >>
> >>> On Nov 17, 2016, at 3:40 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Thu, Nov 17, 2016 at 3:12 PM Greg Clayton via llvm-dev <llvm-
> >> dev at lists.llvm.org> wrote:
> >>> I have recently been modifying the DWARF parser and have more patches
> >> planned and I want to be able to add unit tests that test the internal
> >> llvm DWARF APIs to ensure they continue to work and also validate the
> >> changes that I am making. There are not many DWARF unit tests other than
> >> very simple ones that test DWARF forms currently. I would like to expand
> >> this to include many more tests.
> >>>
> >>> I had submitted a patch that I aborted as it was too large. One of the
> >> issues with the patch was a stand alone DWARF generator that can turn a
> >> few API calls into the section data required for the
> DWARFContextInMemory
> >> class to be able to load DWARF from. The idea is to generate a small
> blurb
> >> of DWARF, parse it using our built in DWARF parser and validate that the
> >> API calls we do when consuming the DWARF match what we expect. The
> >> original stand-alone DWARF generator class is in
> >> unittests/DebugInfo/DWARF/DWARFGenerator2.{h,cpp} in the patch attached.
> >> The original review suggested that I try to use the AsmPrinter and many
> of
> >> its associated classes to generate the DWARF. I attempted to do so and
> the
> >> AsmPrinter version is in lib/CodeGen/DwarfGenerator.{h,cpp} in the patch
> >> attached. This AsmPrinter based code steals code from the
> DwarfLinker.cpp.
> >>>
> >>>
> >>>
> >>> I am having trouble getting things to work with the AsmPrinter. I was
> >> able to get simple DWARF to be emitted with the AsmPrinter version of
> the
> >> DWARF generator with code like:
> >>>
> >>>
> >>>   initLLVM();
> >>>   DwarfGen DG;
> >>>   Triple Triple("x86_64--");
> >>>   StringRef Path("/tmp/test.elf");
> >>>   bool DwarfInitSuccess = DG.init(Triple, Path);
> >>>   EXPECT_TRUE(DwarfInitSuccess);
> >>>   uint16_t Version = 4;
> >>>   uint8_t AddrSize = 8;
> >>>   DwarfGenCU &CU = DG.appendCompileUnit(Version, AddrSize);
> >>>   DwarfGenDIE CUDie = CU.getUnitDIE();
> >>>
> >>>   CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
> >>>   CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
> >>>
> >>>   DwarfGenDIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
> >>>   SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
> >>>   SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
> >>>   SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
> >>>
> >>>   DwarfGenDIE IntDie = CUDie.addChild(DW_TAG_base_type);
> >>>   IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
> >>>   IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
> >>>   IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
> >>>
> >>>   DwarfGenDIE ArgcDie =
> >> SubprogramDie.addChild(DW_TAG_formal_parameter);
> >>>   ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
> >>>   //ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); //
> >> Crashes here...
> >>>
> >>>   DG.generate();
> >>>
> >>>   auto Obj = object::ObjectFile::createObjectFile(Path);
> >>>   if (Obj) {
> >>>     DWARFContextInMemory DwarfContext(*Obj.get().getBinary());
> >>>     uint32_t NumCUs = DwarfContext.getNumCompileUnits();
> >>>     for (uint32_t i=0; i<NumCUs; ++i) {
> >>>       DWARFCompileUnit *U = DwarfContext.getCompileUnitAtIndex(i);
> >>>       if (U)
> >>>         U->getUnitDIE(false)->dump(llvm::outs(), U, -1u);
> >>>     }
> >>>   }
> >>>
> >>>
> >>> But things fall down if I try to uncomment the DW_FORM_ref_addr line
> >> above. The problem is that AsmPrinter really expects a full stack of
> stuff
> >> to be there and expects people to use the DwarfDebug class and all of
> its
> >> associated classes. These associated classes really want to use the "DI"
> >> objects (DICompileUnit, etc) so to create a compile unit we would need
> to
> >> create DICompileUnit object and then make a AsmPrinter/DwarfCompileUnit.
> >> That stack is pretty heavy and requires the code shown above to create
> >> many many classes just to represent the simple output we wish to emit.
> >> Another downside of the AsmPrinter method is we don't know which targets
> >> people are going to build into their binaries and thus we don't know
> which
> >> triples we will be able to use when generating DWARF info. Adrian Prantl
> >> attempted to help me get things working over here and we kept running
> into
> >> roadblocks.
> >>>
> >>> It'd be great to have more detail about the roadblocks you hit to
> better
> >> understand how bad/what the issues are.
> >>
> >> A few blocks:
> >>
> >> - DIEString doesn't support DW_FORM_string. DW_FORM_string support might
> >> have been pulled so that we never emit it from clang, but we would want
> to
> >> have a unit test that covers being able to read an inlined C string
> from a
> >> DIE. Support won't be that hard to add, but we might not want it so that
> >> people can't use it by accident and make less efficient DWARF.
> >
> > Seems to me we originally supported only DW_FORM_string, and then at some
> > point it was tossed in favor of DW_FORM_strp in order to get space
> savings
> > from string pooling.  In fact using DW_FORM_string for small strings
> would
> > save some more space (admittedly not much) and a bunch of relocations.
> > (I found data from an old experiment, in a debug build of Clang it saved
> > ~0.7MB out of a total 340MB of debug-info size, and >360K ELF
> relocations.)
>
> This is true, but it also adversely affects DWARF parsing speed as you
> will need to manually skip each C string when parsing the DIEs.
>
>
Sorry, thread got derailed a bit - the main point was that Greg wants
support for generating DW_FORM_string to test the parsing support for it
(for use in the debugger with debug info generated by other sources that do
use that form), even if LLVM never generates it.
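
For reference, here is a minimal sketch of what such a round-trip test has to exercise. DW_FORM_string stores the bytes inline, terminated by a NUL, so the reader has to scan for the terminator to find the next attribute, whereas DW_FORM_strp is a fixed-size offset into .debug_str. The helper names (appendInlineString, readInlineString) are made up for illustration and are not LLVM APIs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): append a DW_FORM_string value,
// i.e. the string bytes followed by a NUL terminator, to a DIE byte buffer.
void appendInlineString(std::vector<uint8_t> &Out, const std::string &S) {
  Out.insert(Out.end(), S.begin(), S.end());
  Out.push_back(0);
}

// Hypothetical helper (not an LLVM API): read a NUL-terminated string that
// starts at Offset and advance Offset past the terminator. Scanning for the
// terminator is the extra per-attribute work an inline string costs a parser.
std::string readInlineString(const std::vector<uint8_t> &Data, size_t &Offset) {
  assert(Offset <= Data.size() && "offset out of range");
  std::string Result;
  while (Offset < Data.size() && Data[Offset] != 0)
    Result.push_back(static_cast<char>(Data[Offset++]));
  if (Offset < Data.size())
    ++Offset; // step over the NUL terminator
  return Result;
}
```

A test would write a couple of strings back to back, read them out again, and check that the offset lands exactly on the start of the next value each time.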

Greg was hesitant to add support for generating it to the existing LLVM
DWARF code (lib/CodeGen/AsmPrinter) because of the risk that someone might
mistakenly write code in LLVM that emits that form in its object output. As
I mentioned earlier in the thread, I'm pretty comfortable that this risk is
quite low, given the way the code works there: we have one common API for
generating string forms that picks between strp and str_index, for example,
so I don't think _string would sneak in by accident. If it were added, it
would be done explicitly and intentionally in that code, with the tradeoffs
appropriately considered.

I think it's reasonable to add the support in the common APIs so we can use
it to generate sample DWARF for testing the DWARF parsing APIs.
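
Roughly the shape I have in mind, as a sketch only. The names here (StringRequest, pickStringForm, the Form* constants) are invented for illustration and are not the actual AsmPrinter interface; the numeric values are the standard DWARF encodings for DW_FORM_string, DW_FORM_strp and the GNU split-DWARF DW_FORM_GNU_str_index:

```cpp
#include <cassert>
#include <cstdint>

// DWARF form encodings (local names to avoid implying these are LLVM enums).
constexpr uint16_t FormString = 0x08;        // DW_FORM_string
constexpr uint16_t FormStrp = 0x0e;          // DW_FORM_strp
constexpr uint16_t FormGNUStrIndex = 0x1f02; // DW_FORM_GNU_str_index

// Hypothetical request kind: callers get the pooled forms unless they
// explicitly ask for an inline string.
enum class StringRequest { Default, ExplicitInline };

// Hypothetical chooser: the default path only ever picks between strp and
// str_index, so an inline string cannot be produced by accident.
uint16_t pickStringForm(bool SplitDwarf,
                        StringRequest Req = StringRequest::Default) {
  switch (Req) {
  case StringRequest::ExplicitInline:
    return FormString;
  case StringRequest::Default:
    return SplitDwarf ? FormGNUStrIndex : FormStrp;
  }
  assert(false && "unhandled StringRequest");
  return FormStrp;
}
```

Existing callers would keep using the default and see no change; only a test generator that passes ExplicitInline would ever get DW_FORM_string.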


>
> >
> > I'd favor an API that passed the string down and let the DIE generator
> > (as opposed to the DWARF generator) pick the form.
>
> I have currently added a DIEInlinedString class that can be used for
> DW_FORM_string attributes.
>
> >
> >> - Asserts, asserts, asserts. As we tried to emit DWARF, we got an assert
> >> in bool AsmPrinter::doInitialization(Module &M), on the first line:
> >>
> >>  MMI = getAnalysisIfAvailable<MachineModuleInfo>();
> >>
> >> This asserts if you use the AsmPrinter the way the DwarfLinker and the
> >> AsmPrinter based DwarfGen does if you call this. You must call this to
> >> generate the DebugDwarf. If you get past this by installing a Pass then
> we
> >> assert at:
> >>
> >>  GCModuleInfo *MI = getAnalysisIfAvailable<GCModuleInfo>();
> >>  assert(MI && "AsmPrinter didn't require GCModuleInfo?");
> >>
> >> If we don't have this, we don't get a DwarfDebug.
> >>
> >>>
> >>> Even if we end up adding another set of code to generate DWARF (which
> >> I'd really like to avoid) we'd want to, at some point, coalesce them
> back
> >> together. Given the goal is to try to coalesce the DWARF parsing code in
> >> LLDB and LLVM, it'd seem unfortunate if that effort just created another
> >> similar (or larger) amount of work for DWARF generation.
> >>
> >> This DWARF generator could just live in the unittests/DebugInfo/DWARF
> >> directory so it wouldn't pollute anything in LLVM if we do choose to use
> >> it.
> >>>
> >>> I wanted to pass this patch along in case someone wants to take a look
> >> at how we can possibly fix the lib/CodeGen/DwarfGenerator.cpp and
> >> lib/CodeGen/DwarfGenerator.h. The code that sets up all the required
> >> classes for the AsmPrinter method is in the DwarfGen class from
> >> lib/CodeGen/DwarfGenerator.cpp in the following function:
> >>>
> >>> bool DwarfGen::init(Triple TheTriple, StringRef OutputFilename);
> >>>
> >>> The code in this function was looted from existing DwarfLinker.cpp
> code.
> >> This function requires a valid triple and that triple is used to
> create a
> >> lot of the classes required to make the AsmPrinter. I am not sure if any
> >> other code uses the AsmPrinter like this besides the DwarfLinker.cpp
> code
> >> and that code uses its own magic to actually link the DWARF. It does
> reuse
> >> some of the functions as I did, but the DwarfLinker doesn't use any of
> the
> >> DwarfDebug, DwarfCompileUnit or any of the classes that the
> >> compiler/assembler uses when making DWARF.
> >>>
> >>> What's the DwarfLinker code missing that you need? If that code is
> >> generating essentially arbitrary DWARF, what's blocking using the same
> >> technique for generating DWARF for parsing tests?
> >>
> >> They don't use any of the DwarfDebug, DwarfCompileUnit classes. They
> also
> >> don't use any of the DI classes when making up the debug info. So both
> the
> >> DWARF linker and the generator have similar needs: make DWARF that isn't
> >> tied too closely to the clang internal classes and DI classes.
> >>>
> >>> The amount of work required for refactoring the AsmPrinter exceeds the
> >> time I am going to have, but I would still like to have DWARF API
> testing
> >> in the unit tests.
> >>>
> >>> So my question is if anyone would have objections to using the
> >> stand-alone DWARF generator in unittests/DebugInfo/DWARF until we can later
> get
> >> the YAML tools to be able to produce DWARF and we can switch to testing
> >> the DWARF data that way? Chris Bieneman has expressed interest in
> getting
> >> a DWARF/YAML layer going.
> >>>
> >>> Those tools would still want to use pretty similar (conceptually)
> >> abstractions to LLVM's codegen and llvm-dsymutil. I'd still strongly
> >> prefer to generalize/keep common APIs here - or better understand why
> it's
> >> not practical now (& what it will take/how we make sure we have a plan
> and
> >> resources to get there eventually).
> >>>
> >>> My reasoning is:
> >>> - I want to be able to test DWARF APIs we have to ensure they work
> >> correctly as there are no Dwarf API tests right now. I will be adding
> code
> >> that changes many things in the DWARF parser and it will be essential to
> >> verify that there are no regressions in the DWARF APIs.
> >>> - Not sure which targets would be built into LLVM so it might be hard
> to
> >> write tests that cover 32/64 bit addresses and all the variants if we
> have
> >> to do things legally via AsmPrinter and valid targets
> >>>
> >>> Seems like it might be plausible to refactor out whatever features of
> >> the AsmPrinter these APIs require (so we just harvest that data out of
> >> AsmPrinter and pass it down in a struct, say - so that other users can
> >> pass their own struct without needing an AsmPrinter). Though, again,
> >> interested to know how dsymutil is working in these situations.
> >>
> >> I can try that method if indeed the only places that use the DwarfDebug
> >> are the DW_FORM_ref_addr and location lists. I'll let you know how that
> >> goes.
> >>>
> >>> - Not enough time to modify AsmPrinter to not require the full
> DebugInfo
> >> stack and the classes that it uses (llvm::DwarfCompileUnit which must
> use
> >> llvm::DICompileUnit, llvm::DIE class which uses many local classes that
> >> all depend on  the full DwarfDebug stack).
> >>>
> >>> Will you have time at some later date to come back and revisit this?
> >> It's understandable that we may choose to incur short term technical
> debt
> >> with an understanding that it will be paid off in some timely manner.
> It'd
> >> be less desirable if there's no such plan/possibility and we incur a
> >> fairly clear case of technical debt (redundant DWARF generation
> libraries
> >> - especially when this effort is to remove a redundant DWARF parser).
> >>
> >> Not sure anyone else will need to generate DWARF manually. The two
> clients
> >> currently are the DWARF unittests and the DwarfLinker. The DwarfLinker
> >> worked around these issues. If the AsmPrinter wasn't such an integral
> part
> >> of the entire compiler stack, I could take a stab at refactoring it,
> but I
> >> don't believe I am the right person to do this at this point as I have
> no
> >> experience or knowledge of the various ways that this class is used, or
> >> how it interacts with other support classes (DwarfDebug, and many many
> >> other classes).
> >>
> >> Things that still worry me:
> >> - not being able to generate DWARF for 32/64 if targets are missing
> >
> > You mean DWARF-32 and DWARF-64 formats?  LLVM doesn't do DWARF-64.
> > If you mean 64-bit target-machine addresses, I guess I don't understand
> > the problem.  If you have target-dependent tests, then they only work
> > when the right targets are there.  This is extremely common and I'm
> > not clear why it would be a problem for the DWARF tests.
>
> I wasn't aware that there were target-dependent tests. Do you know of one
> in the unittest directory you can point me to? I did mean 32 bit address
> target, versus 64 bit address targets. I am not sure how I can test 4 and 8
> byte addresses reliably. What triple do I use in the unittest? I can't
> assume x86_64 as we may have been built on a 32 bit ARM system with only
> the 32 bit ARM targets.
> >
> >> - DIEString not supporting DW_FORM_string. I can add support, but I
> don't
> >> know if we want it as if we add it people might start using it.
> >
> > See above. If the API picked the form this would not be a concern.
>
> For DWARF parsing speed I still like the DW_FORM_strp.
>
>
> FWIW I'm with Greg here. I don't find that the "inlined small strings"
> optimization is really worth it for size, but could be convinced to add it.
>
> -eric
>
>
>
> >
> >> - hacking around asserts by constructing classes and copying code from
> >> places that properly use the AsmPrinter the way it is supposed to be
> used
> >> so that we can use it in a way that it wasn't designed to be used.
> >>
> >>>
> >>> I made a large effort to try and get things working with the
> AsmPrinter,
> >> so I wanted everyone to know that I tried to get that solution working.
> >> Let me know what anyone thinks.
> >>>
> >>> Greg Clayton
> >
> > --paulr
> >
> >>>
>
>