[PATCH] Have clang list the imported modules in the debug info

David Blaikie dblaikie at gmail.com
Fri May 1 16:55:31 PDT 2015


On Fri, May 1, 2015 at 4:39 PM, Adrian Prantl <aprantl at apple.com> wrote:

>
> > On May 1, 2015, at 10:01 AM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Fri, May 1, 2015 at 9:52 AM, Adrian Prantl <aprantl at apple.com> wrote:
> >>
> >>> On May 1, 2015, at 9:23 AM, David Blaikie <dblaikie at gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Thu, Apr 30, 2015 at 5:21 PM, Adrian Prantl <aprantl at apple.com>
> wrote:
> >>>
> >>> > On Apr 30, 2015, at 4:55 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>> >
> >>> >
> >>> >
> >>> > On Thu, Apr 30, 2015 at 4:31 PM, Adrian Prantl <aprantl at apple.com>
> wrote:
> >>> >>
> >>> >> > On Mar 19, 2015, at 5:37 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > On Thu, Mar 19, 2015 at 5:24 PM, Adrian Prantl <aprantl at apple.com>
> wrote:
> >>> >> >>
> >>> >> >> > On Mar 16, 2015, at 2:55 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >> On Mon, Mar 16, 2015 at 2:45 PM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> >>> >> >> > Beyond the above (that using a new tag would mean this would
> go from 'free' to 'not free' for GDB) having a new top level tag is pretty
> substantial (we only have two at the moment, and with our talk of modules
> being a "bag of dwarf" might go back to having one top level tag? (it's not
> clear to me from DWARF4 whether DW_TAG_module is currently a top-level tag,
> I don't think it is?)
> >>> >> >> >
> >>> >> >> >> The .debug_info section contains one or more compilation
> units, partial units, or in DWARF 5, type units.  DW_TAG_module isn't a
> unit, if you want it to be handled independently then it would need to be
> wrapped in a DW_TAG_partial_unit.  You would probably then use
> DW_TAG_imported_unit to refer to it, rather than DW_TAG_imported_module.
> >>> >> >> >>
> >>> >> >> >
> >>> >> >> > This makes a fair bit of sense - though the terminology's
> never going to quite line up with modules, I suspect, and this would still
> require modifying existing consumers (well, GDB) that can handle
> split-dwarf today, I suspect (not sure how it'd handle partial_unit - maybe
> that does work? - and still don't know how existing consumers would handle
> imported_unit either - could be worth some testing, as it sounds sort of
> right out of several less right options).
> >>> >> >>
> >>> >> >> Thanks for all the input so far!
> >>> >> >> To make this end of the discussion concrete, let’s sketch some
> DWARF showing how this could look in practice.
> >>> >> >>
> >>> >> >> ELF (no imports)
> >>> >> >> ----------------
> >>> >> >>
> >>> >> >> On ELF or COFF a foo.c referencing types from the module
> Foundation looks like this:
> >>> >> >>
> >>> >> >> .debug_info:
> >>> >> >>   DW_TAG_compile_unit
> >>> >> >>     DW_AT_name(“foo.c”)
> >>> >> >>
> >>> >> >> .debug_info.dwo (on ELF: group 0x1234ABCDE, comdat)
> >>> >> >>   DW_TAG_partial_unit
> >>> >> >
> >>> >> > For now I'd suggest we use compile_unit - that way it'll just
> work with existing split-dwarf consumers. We can see about standardizing a
> top-level DW_TAG_module or using DW_TAG_partial_unit here later, perhaps?
> I'm not sure.
> >>> >> >
> >>> >> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/Foundation.pcm”)
> >>> >> >>     DW_AT_dwo_id(“0x1234ABCDE”)
> >>> >> >>
> >>> >> >>
> >>> >> >> Side question: Is .debug_info.dwo the right section to put the
> module skeleton in, or should it be a .debug_info section like normal
> fission skeletons?
> >>> >> >
> >>> >> > Skeletons go in .debug_info, the dwo sections are just for the
> .dwo file (or the module file, in our new case - the extension isn't
> actually important).
> >>> >> >
> >>> >> > It might be worth you compiling an example or two of split-dwarf
> to see how this all works hands-on.
> >>> >> >
> >>> >> >> Mach-O (no comdat, no imports)
> >>> >> >> ------------------------------
> >>> >> >>
> >>> >> >> Mach-O doesn’t do comdat, so with -split-dwarf=Disable (not sure
> if that option is the best discriminator) this could look like:
> >>> >> >>
> >>> >> >> .debug_info:
> >>> >> >>   DW_TAG_compile_unit
> >>> >> >>     DW_AT_name(“foo.c”)
> >>> >> >>   DW_TAG_partial_unit
> >>> >> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/Foundation.pcm”)
> >>> >> >>     DW_AT_dwo_id(“0x1234ABCDE”)
> >>> >> >>
> >>> >> >>
> >>> >> >> Mach-O (no comdat, with imports)
> >>> >> >> ------------------------------
> >>> >> >>
> >>> >> >> If we add the module import information to this, we get:
> >>> >> >>
> >>> >> >> .debug_info:
> >>> >> >>   DW_TAG_compile_unit
> >>> >> >>     DW_AT_name(“foo.c”)
> >>> >> >>     DW_TAG_imported_module
> >>> >> >>       DW_AT_import(DW_FORM_ref_addr 0x10)
> >>> >> >
> >>> >> > Since we went down the tangent of explaining split-dwarf many
> emails ago, I've forgotten (& can't readily find) what we were discussing
> about what ways the imported_module could work.
> >>> >> >
> >>> >> > The simplest representation I can think of would be to have it
> reference, by signature, the module unit (whatever tag it uses) -
> DW_FORM_ref_sig8, seems the simplest thing to do.
> >>> >> >
> >>> >> >>
> >>> >> >>   DW_TAG_partial_unit
> >>> >> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/Foundation.pcm”)
> >>> >> >>     DW_AT_dwo_id(“0x1234ABCDE”)
> >>> >> >>
> >>> >> >> 0x10:
> >>> >> >
> >>> >> > This is inside the partial unit? I figured we'd just put these
> attributes on the top level (compile_unit, or whatever it might be later) -
> potentially conditionalized on platform, sure.
> >>> >> >
> >>> >> >>     DW_TAG_module
> >>> >> >>       DW_AT_name(“Foundation”)
> >>> >> >>       DW_AT_LLVM_sysroot(“/“)
> >>> >> >>       DW_AT_LLVM_include_dir(“”)
> >>> >> >>       DW_AT_LLVM_macros(“-DNDEBUG”)
> >>> >> >>       ...
> >>> >> >>
> >>> >> >>
> >>> >> >> ELF (comdat, with imports)
> >>> >> >> --------------------------
> >>> >> >>
> >>> >> >> But now let’s go back to ELF. Since the skeleton with the
> partial unit is comdat'd, I assume that this breaks the FORM_ref_addr used
> in the DW_AT_import. We could reuse the module hash as a signature for the
> module:
> >>> >> >>
> >>> >> >> .debug_info:
> >>> >> >>   DW_TAG_compile_unit
> >>> >> >>     DW_AT_name(“foo.c”)
> >>> >> >>     DW_TAG_imported_module
> >>> >> >>       DW_AT_import(DW_FORM_ref_addr 0x1234ABCDE)
> >>> >> >
> >>> >> > Still only really need these imported_modules for lldb, right?
> I'd consider having them off-by-default for non-darwin, but I'm not
> strictly wedded to that notion. Wouldn't mind seeing size impact numbers of
> some kind - if it's really fractional % increase & GDB doesn't fall over
> when it sees them (in whatever FORM/tag/etc we decide on) then that's not
> the end of the world.
> >>> >> >
> >>> >> > Just seems nice if the default mode is the nice, standard,
> split-dwarf output. Doesn't need anything fancy.
> >>> >> >
> >>> >> >
> >>> >> >> .debug_info.dwo (group 0x1234ABCDE, comdat)
> >>> >> >>   DW_TAG_partial_unit
> >>> >> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/Foundation.pcm”)
> >>> >> >>     DW_AT_dwo_id(“0x1234ABCDE”)
> >>> >> >>
> >>> >> >>     DW_TAG_module
> >>> >> >>       DW_AT_signature(“0x1234ABCDE”)
> >>> >> >>       DW_AT_name(“Foundation”)
> >>> >> >
> >>> >> >
> >>> >> > The thing you haven't covered is the actual .dwo sections
> (.debug_info.dwo (we'll probably need a simple stub compile_unit to make
> this correct split-dwarf) and .debug_types.dwo being important - but all
> the supporting .dwo sections will be necessary) that go in the module file.
> >>> >> >
> >>> >> >> This is bending the definition of DW_AT_signature, but I guess
> it could be made to work. Or we could say that for now, users have to
> choose between the comdat optimization and having the module imports
> recorded in Dwarf, since GDB wouldn’t know what to do with that information
> anyway.
> >>> >>
> >>> >> Sorry for the long delay. Here’s a more complete example that
> should include all the suggestions made so far. For context I also included
> external type references in the example although admittedly this is a bit
> out of scope for this thread:
> >>> >>
> >>> >> ELF (typeunits, comdats, with imports)
> >>> >> --------------------------------------
> >>> >>
> >>> >> On ELF or COFF a bar.c referencing type Foo from the module FooLib
> looks like this:
> >>> >>
> >>> >> bar.o
> >>> >> ~~~~~
> >>> >>
> >>> >> // To keep this example focussed/readable, I'm assuming that bar.o
> itself was not compiled with fission.
> >>> >> .debug_info:
> >>> >>   DW_TAG_compile_unit
> >>> >>     DW_AT_name(“bar.c”)
> >>> >>     ...
> >>> >>
> >>> >>     DW_TAG_imported_module // <- This could be optional on ELF.
> >>> >>       DW_AT_import [DW_FORM_ref_sig8] (0xABCD1234)
> >>> >>
> >>> >>     DW_TAG_variable
> >>> >>       DW_AT_name(“MyFoo”)
> >>> >>       DW_AT_type [DW_FORM_ref4] 0x20
> >>> >> 0x20:
> >>> >>     DW_TAG_structure_type
> >>> >>       DW_AT_declaration (true)
> >>> >>       DW_AT_signature [DW_FORM_ref_sig8] (0xF00)
> >>> >>
> >>> >>
> >>> >> // Split DWARF skeleton CU for the module Foo.
> >>> >>   DW_TAG_compile_unit
> >>> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
> >>> >>     DW_AT_dwo_id(“0xFEDB9876”)
> >>> >>     ...
> >>> >>
> >>> >> // Comdat’d partial unit containing the optional module descriptor.
> >>> >> .debug_info, group 0xABCD1234, comdat
> >>> >>   DW_TAG_partial_unit
> >>> >>     DW_TAG_module
> >>> >>       DW_AT_name(“FooLib”)
> >>> >>       DW_AT_LLVM_sysroot(“/“)
> >>> >>       DW_AT_LLVM_include_dirs(“-I/path”)
> >>> >>       DW_AT_LLVM_macros(“-DNDEBUG”)
> >>> >>       ...
> >>> >>
> >>> >> FooLib-XYZ.pcm
> >>> >> ~~~~~~~~~~~~~~
> >>> >>
> >>> >> .debug_info.dwo
> >>> >>   DW_TAG_compile_unit
> >>> >>     DW_AT_dwo_id(“0xFEDB9876”)
> >>> >>     ...
> >>> >>
> >>> >> // Type unit for the type Foo.
> >>> >> .debug_types.dwo, group 0xF00, comdat
> >>> >>   DW_TAG_type_unit
> >>> >>     DW_TAG_structure_type
> >>> >>       DW_AT_name (“Foo”)
> >>> >>       ...
> >>> >>
> >>> >>
> >>> >> I think it's awkward to have both the skeleton compile_unit in
> .debug_info and the partial_unit containing the TAG_module. Personally I’d
> prefer putting the TAG_module into the skeleton CU and then just refer to
> it via a FORM_ref_addr; but if we want to put the TAG_module into a comdat
> section, it looks like that’s what’s necessary.
> >>> >
> >>> > It's been a while & I've probably lost all the context, but I think
> my original theory was to have the skeleton compile_unit be comdat'd so
> they'd deduplicate on linking (so we'd only have one reference to the
> module.dwo in the linked binary). I don't recall there being a need for a
> separate partial_unit - I imagine we'd just put the LLDB/LLVM extension
> attributes on the skeleton compile_unit and expect debuggers that didn't
> understand them, to ignore them.
> >>> >
> >>> > Was there some reason this didn't work/make sense? Because you need
> a DW_TAG_module to import with DW_TAG_imported_module?
> >>> Using DW_TAG_module was the best practice that was recommended on
> dwarf-discuss.
> >>>
> >>> Did they have any ideas on how to reference it without duplicating it
> in every CU?
> >>
> >> We didn’t touch the deduplication issue.
> >>
> >>> Once we've got the "Bag O Dwarf" stuff (rather than the narrower type
> units) this would be easier - (I suppose we could do a partial
> solution/abuse of type units - use a type unit header (perhaps with Eric's
> merged type/compile unit work) and a DW_FORM_ref_sig8 value for the
> DW_AT_module in the DW_TAG_imported_module.
> >>>
> >>> Though I suppose if we're going to have DW_TAG_imported_module in
> every CU that references a module, it might not be that big of a deal to
> include the DW_TAG_module itself there too... while I don't care about this
> scheme immediately, Google's LLDB investment in various platforms is growing,
> so I am vaguely concerned about getting this right & it's not immediately
> obvious to me what that right answer is.
> >>
> >> Maybe the best path forward is to stage this by initially putting the
> DW_TAG_module into the main CU and leave the deduplication as an
> optimization to be implemented once the bag’o dwarf is more fleshed out.
> This way we won’t do anything that would confuse consumers (assuming they
> ignore unknown tags) and the extra overhead is likely not even going to be
> noticeable, since all the string attributes inside the TAG_module can
> already be deduplicated by traditional means.
> >
> > Perhaps. I'd still like to think through/document what this looks like a
> bit more. Where the data ends up, what it's used for, etc. Sorry to draw
> this out.
> >
> > :/ *ponders*
>
>
> Let’s construct this:
>
> The most straightforward representation is to not unique the TAG_module
> and place it into the main CU.
>
> bar.o
> ~~~~~
>
> .debug_info:
>   DW_TAG_compile_unit
>     ...
>     DW_TAG_imported_module
>       DW_AT_import [DW_FORM_ref4] (0x20)
> 0x20:
>     DW_TAG_module
>       DW_AT_name(“FooLib”)
>       DW_AT_LLVM_sysroot(“/“)
>       DW_AT_LLVM_include_dirs(“-I/path”)
>       DW_AT_LLVM_macros(“-DNDEBUG”)
>

Might as well put all these LLVM attributes on the skeleton CU, though - so
they can be deduplicated (& just put the dwo_id in this module somewhere,
perhaps just using the DW_AT_dwo_id attribute - possibly that's the only
attribute the DW_TAG_module would need, ideally). Unless we need to
consider the submodule issue (in which case the skeleton unit would
reference the whole module but the submodules would reference/describe the
respective submodules?)?
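
To make the deduplication point concrete, here's a rough sketch of how a
linker keeps one copy per COMDAT group. Everything in it is hypothetical
(the (group_id, payload) tuples, the link() helper and the second object
file baz.o are made up for illustration, not any real linker's API):

```python
# Hypothetical model: each object file is a list of (group_id, payload)
# sections. A group_id of None means the section is not in a COMDAT group.

def link(object_files):
    """Keep every ungrouped section, but only the first section seen
    for each COMDAT group id."""
    seen_groups = set()
    output = []
    for sections in object_files:
        for group_id, payload in sections:
            if group_id is None:
                output.append(payload)
            elif group_id not in seen_groups:
                seen_groups.add(group_id)
                output.append(payload)
    return output


# Two objects that both reference the FooLib module via a comdat'd skeleton CU.
bar_o = [(None, "CU bar.c"), (0xFEDB9876, "skeleton CU FooLib")]
baz_o = [(None, "CU baz.c"), (0xFEDB9876, "skeleton CU FooLib")]
linked = link([bar_o, baz_o])
```

Any attribute hung off the skeleton CU rides along with that single
surviving copy, which is the appeal of putting the LLVM attributes there.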


>       ...
>
> // Split DWARF skeleton, comdat'd.
> .debug_info, group 0xFEDB9876, comdat
>   DW_TAG_compile_unit
>
> DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
>     DW_AT_dwo_id(“0xFEDB9876”)
>     ...
>
> On Mach-O the split DWARF skeleton would not be comdat’d, but
> llvm-dsymutil can just ignore it.
>
>
> If we want to dedup the TAG_module we need to refer to it via signature.
> This means we need to wrap it in a type_unit or a DWARF5 TAG_type_unit. We
> might as well throw it in with the skeleton CU.
>
> .debug_info:
>   DW_TAG_compile_unit
>     ...
>     DW_TAG_imported_module
>       DW_AT_import [DW_FORM_ref_sig8] (0xABCD1234)
>
> // Split DWARF skeleton, comdat'd.
> .debug_info, group 0xFEDB9876, comdat
>   DW_TAG_compile_unit
>
> DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
>     DW_AT_dwo_id(“0xFEDB9876”)
>     ...
>     DW_TAG_type_unit (signature: 0xABCD1234)
>

Can't really put a type_unit inside a compile_unit - it'd need to be
top-level with an appropriate type unit header, etc. & then we'd need two
different units/headers, could still comdat them, but it's a weird abuse of
type units & would probably confuse consumers. I don't know whether that's
worth the effort.
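
For reference, this is roughly what "an appropriate type unit header"
means in DWARF 4 (32-bit DWARF format). The sketch below assumes
little-endian encoding; the helper names and the example length/offset
values are made up for illustration:

```python
import struct

# DWARF 4 type unit header, 32-bit DWARF format, little-endian (assumed):
#   unit_length (4), version (2), debug_abbrev_offset (4),
#   address_size (1), type_signature (8), type_offset (4)
TYPE_UNIT_HEADER = struct.Struct("<IHIBQI")


def pack_type_unit_header(signature, die_bytes_len, type_offset,
                          abbrev_offset=0, address_size=8):
    """Build a header for a unit whose DIEs occupy die_bytes_len bytes."""
    # unit_length counts everything after the length field itself.
    unit_length = (TYPE_UNIT_HEADER.size - 4) + die_bytes_len
    return TYPE_UNIT_HEADER.pack(unit_length, 4, abbrev_offset,
                                 address_size, signature, type_offset)


def read_signature(header_bytes):
    """Extract the 8-byte type_signature, which is what a DWARF-aware
    linker would compare when deduplicating type units."""
    fields = TYPE_UNIT_HEADER.unpack(header_bytes[:TYPE_UNIT_HEADER.size])
    return fields[4]


# Hypothetical unit carrying the module descriptor under signature 0xABCD1234.
header = pack_type_unit_header(0xABCD1234, die_bytes_len=16, type_offset=25)
```

The signature lives in the unit header, not on a DIE, which is why the
module would need its own top-level unit rather than nesting inside the
skeleton compile_unit.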


>       DW_TAG_module
>         DW_AT_name(“FooLib”)
>         DW_AT_LLVM_sysroot(“/“)
>         DW_AT_LLVM_include_dirs(“-I/path”)
>         DW_AT_LLVM_macros(“-DNDEBUG”)
>         ...
>
> Now that raises the question about what happens with multiple modules
> within one PCM.


Is the right term "submodule"? It's sort of confusing to talk about
multiple modules within a pcm.


> Assuming that the ELF linker is linking and deduping all the non-.dwo
> sections, we may lose some of the TAG_modules (if not every CU imports all
> submodules) in the binary, but that wouldn’t matter because the consumer
> would find all TAG_modules by signature in the .pcm


Is there any reason we need to reference the submodules individually,
rather than just reference the whole module (& have just a single, whole
module in the pcm)?


> file referred to by whichever skeleton CU makes it into the binary:
>
> FooLib-XYZ.pcm
> ~~~~~~~~~~~~~~
>
> .debug_info.dwo
>  DW_TAG_compile_unit
>    DW_AT_dwo_id(“0xFEDB9876”)
>    ...
>
>  DW_TAG_type_unit (signature: 0xABCD1234)
>    DW_TAG_module
>      DW_AT_name(“FooLib”)
>      ...
>  DW_TAG_type_unit (signature: 0xCDEF3456)
>    DW_TAG_module
>      DW_AT_name(“FooLib”)
>      DW_TAG_module
>        DW_AT_name(“SubFoo”)
>        ...
>
> So.. this should work as long as nobody points out that a module isn’t
> really a type.
>

Yeah, probably worth waiting for "Bag O DWARF".

For now, as you mentioned earlier, maybe just put the imported_module
and the module into the compile_unit when tuning for LLDB (so Darwin by
default, and anywhere else where someone tunes for LLDB in the future) &
leave them out otherwise.
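
As a rough sketch of that gating (the DebuggerTuning enum and both
functions are hypothetical names for illustration, not an existing clang
or LLVM interface):

```python
from enum import Enum


class DebuggerTuning(Enum):
    """Hypothetical stand-in for a compiler's debugger tuning setting."""
    GDB = "gdb"
    LLDB = "lldb"


def default_tuning(target_os):
    # Assumption for this sketch: Darwin defaults to LLDB, all else to GDB.
    return DebuggerTuning.LLDB if target_os == "darwin" else DebuggerTuning.GDB


def emit_module_imports(target_os, tuning=None):
    """Emit DW_TAG_imported_module / DW_TAG_module only when tuning for LLDB."""
    if tuning is None:
        tuning = default_tuning(target_os)
    return tuning is DebuggerTuning.LLDB
```

So emit_module_imports("linux") stays off unless someone explicitly
passes DebuggerTuning.LLDB.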

Could you remind me why LLDB wants to know which modules are referenced
from a CU? (rather than just all the modules used by a program overall?)


>
>
>
> On Mach-O, in the absence of comdats, we have:
>
> bar.o
> ~~~~~
>
> .debug_info:
>   DW_TAG_compile_unit
>     ...
>     DW_TAG_imported_module
>       DW_AT_import [DW_FORM_ref4] (0x20)
>
>     DW_TAG_module           // uniqued by dsymutil.
>       DW_AT_name(“FooLib”)
>       DW_AT_LLVM_sysroot(“/“)
>       DW_AT_LLVM_include_dirs(“-I/path”)
>       DW_AT_LLVM_macros(“-DNDEBUG”)
>       ...
>
> // Split DWARF skeleton, thrown out by dsymutil.
>

Thrown out? Because it's going to read everything in from the module and
merge it in to a single linked debug info blob, I take it?


> .debug_info, group 0xFEDB9876, comdat
>   DW_TAG_compile_unit
>
> DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
>     DW_AT_dwo_id(“0xFEDB9876”)
>     ...
>
> FooLib-XYZ.pcm
> ~~~~~~~~~~~~~~
>
> .debug_info:
>   DW_TAG_compile_unit
>     DW_AT_dwo_id(“0xFEDB9876”)
>     ...
>
>     DW_TAG_module
>       DW_AT_name(“FooLib”)
>       DW_TAG_module
>         DW_AT_name(“SubFoo”)
>         ...
>
> -- adrian
>
> >
> >>
> >>>
> >>> > If it turns out that's the right way to get a target for the
> imported_module, we could put both the skeleton CU and the partial unit in
> the same comdat and dedup them both together.
> >>>
> >>> I think this works as long as we only have one TAG_module per .pcm
> file (because we need to refer to it via signature).
> >>>
> >>> Not quite following here - why would we have more than one module per
> pcm - a pcm is a module, right?
> >>
> >> Clang modules may have submodules and a compile unit could import two
> submodules that live in the same .pcm file. For example on Darwin there is
> a module Darwin.pcm that contains a submodule “C" that contains the
> submodule “stdio".
> >
> > OK, so this bit's relevant to your use case in LLDB of loading the right
> things for the right context, but not relevant to the context-less
> debuggers like GDB that will just treat everything as one big namespace
> (except for file-local things, etc). So it's important for your imported
> modules but not for the basic Fission style debug reference.
> >
> > Well, maybe - I'm not sure what you're picturing in terms of the DWARF
> in the module for submodules? If you want that granularity we'll have to
> talk about how to split the DWARF in the module into chunks per submodule?
> >
> >>
> >>>
> >>> But if we don’t mind having duplicate dwo_* references in the same .o
> file this would also work with more than one TAG_module (or submodules).
> >>>
> >>>
> >>> .debug_info:
> >>>  DW_TAG_compile_unit
> >>>    DW_AT_name(“bar.c”)
> >>>    ...
> >>>
> >>>    DW_TAG_imported_module // <- This could be optional on ELF.
> >>>      DW_AT_import [DW_FORM_ref_sig8] (0xFEDB9876)
> >>>
> >>>    ...
> >>>
> >>> // Comdat’d split DWARF skeleton CU for the module Foo.
> >>> .debug_info, group 0xFEDB9876, comdat
> >>>  DW_TAG_compile_unit
> >>>
> DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
> >>>    DW_AT_dwo_id(“0xFEDB9876”)
> >>>    ...
> >>>
> >>>    DW_TAG_module
> >>>      DW_AT_name(“FooLib”)
> >>>      DW_AT_LLVM_sysroot(“/“)
> >>>      DW_AT_LLVM_include_dirs(“-I/path”)
> >>>      DW_AT_LLVM_macros(“-DNDEBUG”)
> >>>      ...
> >>>
> >>>
> >>> >
> >>> > But this gets into complicated territory when the original binary is
> built with fission... which will be relevant for modules on ELF with LLDB.
> Hmm, maybe it's not too complicated - the partial_unit would end up in the
> .dwo file (maybe we'd have to teach the .dwo file to deduplicate these too
> - the same way it does for type units... - might require a new header to
> include the hash, etc :/)... would be tricky to have the dwp tool resolve
> the relocations to these things. Cross-unit references as you've got there
> aren't something that every DWARF consumer is totally cool with, I don't
> think?
> >>>
> >>> Ah. I thought the deduplication happens because all ELF sections
> sharing the same group are uniqued based on the group id.
> >>>
> >>> COMDAT groups deduplicate for a normal non-fission build, but fission
> linking doesn't require the .dwo file to use/contain COMDATs as it uses a
> DWARF-aware tool (so you don't bother putting the type units in COMDAT
> groups, for example - the fission linker knows how to parse debug_types,
> find the type unit headers and their hashes and deduplicates them that way).
> >>
> >> Ok that makes sense.
> >>
> >> -- adrian
> >>
> >>>
> >>> It certainly would be nice if we could avoid introducing a new
> .debug_info header...
> >>>
> >>> >
> >>> > Sort of inclined to have the imported module stuff just for LLDB,
> but I've lost some of the context for that in the ensuing weeks.
> >>>
> >>> -- adrian
> >>>
> >>> >
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> MachO (no typeunits, no comdats, with imports)
> >>> >> ----------------------------------------------
> >>> >>
> >>> >> Since we don’t have comdat sections in Mach-O and we don’t have the
> tool support for type units, the way that external types can be referenced
> necessarily needs to be a bit different. The design that Greg and I came up
> with for Mach-O relies on llvm-dsymutil to fix up the DWARF for
> non-module-aware consumers. Just as ELF DWARF consumers need not be able to
> tell the difference between module debugging and split DWARF, on Mach-O the
> .dSYM bundle generated by llvm-dsymutil looks like traditional DWARF.
> >>> >>
> >>> >> There are three differences in the DWARF output that make this
> possible:
> >>> >>   - Refer to external types by UID rather than by type signature.
> >>> >>     (This doubles as the key that allows a debugger to import
> the type
> >>> >>      directly from the AST and protects us against hash collisions)
> >>> >>   - Add an index to the .o file that maps UID -> module file.
> >>> >>     (Fast lookup + UIDs for C and ObjC are only unique within a
> module)
> >>> >>   - Add an entry for each type’s UID to the types accelerator table.
> >>> >>     (Fast lookup)
> >>> >>
> >>> >> bar.o
> >>> >> ~~~~~
> >>> >>
> >>> >> .debug_info:
> >>> >>   DW_TAG_compile_unit
> >>> >>     DW_AT_name(“bar.c”)
> >>> >>     DW_TAG_imported_module
> >>> >>       DW_AT_import(DW_FORM_ref_addr 0x40)
> >>> >>
> >>> >>     DW_TAG_variable
> >>> >>       DW_AT_name(“MyFoo”)
> >>> >>       DW_AT_type [DW_FORM_strp] (“_ZTS3Foo”)  // We could use a
> custom FORM here
> >>> >>
> >>> >>   // Skeleton unit.
> >>> >>   DW_TAG_compile_unit
> >>> >>
>  DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm”)
> >>> >>     DW_AT_dwo_id(“0xFEDB9876”)
> >>> >>     ...
> >>> >> 0x40:
> >>> >>     DW_TAG_module
> >>> >>       DW_AT_name(“FooLib”)
> >>> >>       DW_AT_LLVM_sysroot(“/“)
> >>> >>       DW_AT_LLVM_include_dirs(“-I/path”)
> >>> >>       DW_AT_LLVM_macros(“-DNDEBUG”)
> >>> >>
> >>> >> // This index uses the usual accelerator table format.
> >>> >> .apple_exttypes:
> >>> >> { “_ZTS3Foo” => debug_str offset of
> ”/tmp/org.llvm.clang/ModuleCache/1234ABCDE/FooLib-XYZ.pcm” }
> >>> >>
> >>> >> FooLib-XYZ.pcm
> >>> >> ~~~~~~~~~~~~~~
> >>> >>
> >>> >> .debug_info
> >>> >>   DW_TAG_compile_unit
> >>> >>     DW_AT_dwo_id(“0xFEDB9876”)
> >>> >>
> >>> >> 0x80:
> >>> >>   DW_TAG_structure_type
> >>> >>     DW_AT_name (“Foo”)
> >>> >>     DW_AT_signature
> >>> >>     ...
> >>> >>
> >>> >> // In addition to the entry for “Foo”, there is also an entry for
> the type’s UID “_ZTS3Foo” pointing to the type definition DIE.
> >>> >> .apple_types
> >>> >> { “Foo” => 0x80 }
> >>> >> { “_ZTS3Foo” => 0x80 }
> >>> >>
> >>> >>
> >>> >>
> >>> >> When the debug info linker (llvm-dsymutil) is run, it first pulls
> in the .debug_info section from the clang module and fixes up all the
> DW_FORM_strp external type references by turning them into a
> DW_FORM_ref_addr that references the type in the DW_TAG_compile_unit pulled
> in from the module. To find the correct type DIE it looks up the UID in the
> .apple_exttypes index, finds the module, looks up the UID in the regular
> .apple_types accelerator table and replaces the temporary DW_FORM_strp with
> a DW_FORM_ref_addr (which incidentally takes up the same amount of space in
> the DIE).
> >>> >>
> >>> >>
> >>> >> Thoughts?
> >>> >> --
> >>> >> adrian
> >>> >>
> >>> >
> >>>
> >>
> >>
> >
>
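
For reference, the llvm-dsymutil fix-up described at the end of the
quoted Mach-O design could be sketched roughly as below. The dictionaries
stand in for .apple_exttypes and .apple_types, the module path is
shortened, and the 0x1000 base offset is a made-up value for where the
module's CU lands in the linked .debug_info; none of this is dsymutil's
actual code:

```python
def fix_up_external_types(type_refs, exttypes, modules, module_bases):
    """Turn UID-based type references into absolute DIE offsets.

    type_refs:    {referring DIE name: type UID}  (the DW_FORM_strp refs)
    exttypes:     {type UID: module path}         (models .apple_exttypes)
    modules:      {module path: {name: offset}}   (models .apple_types)
    module_bases: {module path: offset of the pulled-in module CU}
    """
    resolved = {}
    for die_name, uid in type_refs.items():
        module_path = exttypes[uid]                # UID -> module file
        local_offset = modules[module_path][uid]   # UID -> DIE in the module
        # The DW_FORM_ref_addr value is the module-local offset rebased
        # to where the module's CU was placed in the linked output.
        resolved[die_name] = module_bases[module_path] + local_offset
    return resolved


PCM = "FooLib-XYZ.pcm"  # shortened; the real key would be the full path
refs = fix_up_external_types(
    {"MyFoo": "_ZTS3Foo"},
    exttypes={"_ZTS3Foo": PCM},
    modules={PCM: {"Foo": 0x80, "_ZTS3Foo": 0x80}},
    module_bases={PCM: 0x1000},  # hypothetical placement
)
```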