[PATCH] Have clang list the imported modules in the debug info

David Blaikie dblaikie at gmail.com
Wed Mar 18 16:02:49 PDT 2015


On Wed, Mar 18, 2015 at 3:50 PM, Adrian Prantl <aprantl at apple.com> wrote:

>
> On Mar 17, 2015, at 6:44 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Tue, Mar 17, 2015 at 3:47 PM, Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> > On Mar 17, 2015, at 10:03 AM, Greg Clayton <gclayton at apple.com> wrote:
>> >
>> >
>> >> On Mar 17, 2015, at 9:46 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> >>
>> >>
>> >>
>> >> On Tue, Mar 17, 2015 at 9:42 AM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> >>
>> >>> On Mar 16, 2015, at 6:47 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Mar 16, 2015 at 5:14 PM, Adrian Prantl <aprantl at apple.com>
>> wrote:
>> >>>
>> >>> Thanks for the explanation David, I missed that it is entirely the
>> linker's (or some dwarf post-processor's) responsibility to find the module
>> files and link in the debug info from the .pcm files, so the debugger doesn’t
>> notice a difference.
>> >>>
>> >>> I think there's still some confusion here. Sorry if I'm rehashing
>> something, but I'll try to explain how this all works.
>> >>>
>> >>> Normal split DWARF:
>> >>>
>> >>> Compiler generates two files: .o and .dwo.
>> >>> .dwo has static, non-relocatable debug info.
>> >>> .o has a skeleton compile_unit that has the name of the .dwo file and
>> a hash to verify that the .dwo file isn't stale when the debugger reads it.
>> >>> The .o files are all linked together, the .dwo files stay where they
>> are.
>> >>> The debugger reads the linked executable, finds the skeleton
>> compile_units contained therein, and finds and loads the .dwo files.
>> >>>
>> >>> The scenario I have in mind for module debug info is this:
>> >>> Module is compiled as an object file with debug info (this file is
>> actually a .dwo file, even if it has some other extension - it has the
>> non-relocatable debug info in it)
>> >>> .o file has a comdat'd skeleton compile_unit describing the
>> .dwo/module file
>> >>> <from here on no extra work is required, the linker and debugger just
>> act as normal>
>> >>> The .o files are linked together, the skeleton compile_units get
>> deduplicated by the linker (comdat sections)
>> >>
>> >> One issue I can think of is we will need to figure out a way to make
>> COMDAT work with mach-o. COMDAT requires a large number of sections, and
>> mach-o can only have 255.
>> >>
>> >> Ah, fair enough - how does MachO handle inline functions (the most
>> common use of comdat) currently, then?
>> >
>> > Currently mach-o relies on symbols in the symbol table being marked as
>> weak and I believe the data for these symbols are in special sections that
>> are marked as containing items that can be coalesced.
>> >
>> That’s not necessarily an issue that needs to be solved on Darwin, or am
>> I maybe missing something? The linker leaves all debug info in the .o (as
>> it currently does) and llvm-dsymutil is resolving all the external module
>> type references while creating the .dSYM bundle.
>>
>
> Yeah, with a debug aware linker (or in the case of dsymutil, a debug-only
> linker) you would just know that since you're looking at object files,
> module references will be redundant across objects and should be
> deduplicated (by the dwo hash, most likely).
>
> If you're not teaching your debugger to read modules, and want to link the
> debug info in from the .dwos, you can probably drop the skeleton units
> entirely and put the contents of the module debug info straight into the
> dsym. You'd still need to teach your debugger about .dwo sections and some
> of the esoteric things there, like str_index and the special line table
> used only for file names (decl_file, etc. use this). It'd be a bit weird,
> but doable without too much work, I'd imagine. You could move the contents
> back into the original sections if you wanted to avoid mixing .dwo and
> non-.dwo sections... *shrug* not sure what exactly you'd want there.
>
>
> My plan was to have -gmodules behave like the latter variant
> unless -gsplit-dwarf is also present; this way there wouldn't be any weird
> Darwin-specific code paths.
>

Not sure I quite follow (mostly my fault given the rambling paragraph up
there) - given the lack of a dsymutil-like tool on other platforms as part
of the common tool path for debug info, I'm not sure module debug info
without split dwarf is viable in that world. There's no tool to read these
extra files at any point.

I suppose we could be creating one giant comdat for the module's debug info
(no skeleton unit, no distinct type unit comdats, just one big comdat). But
we'd probably want/need a tool to do the merging at compile time (like the
objcopy feature for split-dwarf, but in reverse - we'd compile, then run a
tool to smoosh all the comdats from the modules onto the object we just
generated). It wouldn't provide much in the way of space savings, just a little
less stress on the linker (fewer comdats to handle). Not sure if
there's a default mode of objcopy that would cope with this straight out,
or whether we'd need a new feature there (which wouldn't be a priority for
Google to implement, since we use fission, nor a priority for you to
implement since you have dsymutil, etc - so I'm not sure anyone would
bother)

Long story short: maybe just error on -gmodules unless -gsplit-dwarf is
specified or the platform is darwin? (& if it's darwin, dsymutil could
read the module skeletons to find which modules to link into the .dSYM?)





More information about the cfe-commits mailing list