[cfe-dev] [PATCH] Wrap clang modules inside Mach-O/ELF/COFF containers

Tue Jan 27 09:29:32 PST 2015

> On Jan 26, 2015, at 5:34 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> 
> On Mon, Jan 26, 2015 at 4:05 PM, Adrian Prantl <aprantl at apple.com> wrote:
>> 
>> On Jan 12, 2015, at 7:40 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> 
>> On Mon, Jan 12, 2015 at 2:11 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>> 
>>> On Mon, Jan 12, 2015 at 1:56 PM, Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>>> 
>>>> On Fri, Jan 9, 2015 at 8:26 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>> 
>>>>> On Fri, Jan 9, 2015 at 5:02 PM, Richard Smith <richard at metafoo.co.uk>
>>>>> wrote:
>>>>>> 
>>>>>> On Fri, Jan 9, 2015 at 4:03 PM, Adrian Prantl <aprantl at apple.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> On Jan 9, 2015, at 3:57 PM, Richard Smith <richard at metafoo.co.uk>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> On Tue, Jan 6, 2015 at 10:07 AM, Adrian Prantl <aprantl at apple.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Dec 12, 2014, at 8:47 PM, Adrian Prantl <aprantl at apple.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Dec 12, 2014, at 5:37 PM, Argyrios Kyrtzidis
>>>>>>>>>> <kyrtzidis at apple.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Dec 12, 2014, at 4:33 PM, Eric Christopher
>>>>>>>>>>> <echristo at gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Debug info for types isn't inherently a code generation concept.
>>>>>>>>>>> If you think about it, debug info for types is a stable (if lossy)
>>>>>>>>>>> serialization method for a module file. The line number etc for when there's
>>>>>>>>>>> code generated is a separate issue.
>>>>>>>>>> 
>>>>>>>>>> I see what you mean, but it is a traditionally codegen product
>>>>>>>>>> with a particular use-case, and it’s not reasonable to force it on every
>>>>>>>>>> clang client that only wants to parse code, like libclang, static analyzers,
>>>>>>>>>> migrators, refactoring tools, etc., or builds that didn’t ask for it.
>>>>>>>>> 
>>>>>>>>> Good point, I tend to forget about non-compiler users of clang
>>>>>>>>> modules.
>>>>>>>>> 
>>>>>>>>> If we do decide that having clang modules without debug info is
>>>>>>>>> desirable, and we want debug info to be generated lazily (only when needed)
>>>>>>>>> then putting it into a separate file is preferable, because it then can be
>>>>>>>>> captured as a dependency by build systems.
>>>>>>>>> 
>>>>>>>>> It looks like at this point everyone’s argument is really depending
>>>>>>>>> on an assumption that emitting debug info is expensive (or really cheap!,
>>>>>>>>> respectively), so my suggestion is to revisit this thread once I actually
>>>>>>>>> have some numbers on how long it takes to emit debug info and how much space
>>>>>>>>> it takes up. I’ll try to get that done soon.
>>>>>>>> 
>>>>>>>> Hi Argyrios,
>>>>>>>> 
>>>>>>>> back from the break, here are the promised numbers to make our
>>>>>>>> decision easier:
>>>>>>>> 
>>>>>>>> I did an experiment where I patched clang to emit debug type info for
>>>>>>>> each type (patch attached for the curious), and compiled an empty program
>>>>>>>> that imports the Cocoa.h header. To compare the sizes I emitted the DWARF to
>>>>>>>> a separate file:
>>>>>>>> 
>>>>>>>> -rw-r--r--  1 adrian  staff  2151068 Dec 19 16:30 Foundation-3QM1BFEPXW18W.pcm
>>>>>>>> -rw-r--r--  1 adrian  staff   110772 Dec 19 16:30 Foundation-3QM1BFEPXW18W.pcm.o
>>>>>>>> 
>>>>>>>> here’s AppKit:
>>>>>>>> 
>>>>>>>> -rw-r--r--  1 adrian  staff  3302744 Dec 19 16:40 AppKit-5HXLHEH4UB4M.pcm
>>>>>>>> -rw-r--r--  1 adrian  staff   279080 Dec 19 16:40 AppKit-5HXLHEH4UB4M.pcm.o
>>>>>>>> 
>>>>>>>> The median of the size of the DWARF compared to the size of the pcm
>>>>>>>> over all the modules pulled in by Cocoa.h is 5%; i.e., the DWARF would take
>>>>>>>> up roughly 5% of the size of the individual modules.
>>>>>>>> 
>>>>>>>> From these numbers I would argue that DWARF emission is comparatively
>>>>>>>> cheap. To keep the implementation simple, I’d prefer to have everything in
>>>>>>>> one file; this way we won’t have to introduce another layer of locking for
>>>>>>>> creating the pcm.o files lazily, but if someone wants to point out that this
>>>>>>>> is a lame excuse, be my guest ;-)
>>>>>>>> [Another reason to argue for separate .pcm.o files is if we ever want
>>>>>>>> to put something target-specific in there, such as code. Currently this is
>>>>>>>> not the case,
>>>>>>> 
>>>>>>> 
>>>>>>> I certainly have plans to do this, as mentioned previously on this
>>>>>>> thread.
>>>>>>> 
>>>>>>>> 
>>>>>>>> and even if we did this, we would still benefit from having the DWARF
>>>>>>>> type information shared between the several .pcm.o files]
>>>>>>> 
>>>>>>> 
>>>>>>> Is there any disadvantage to having the debug information for a module
>>>>>>> split over two .o files (one for the types and another for the inline
>>>>>>> functions / template instantiations)?
>>>>>>> 
>>>>>>> 
>>>>>>> I think that having it split is actually an advantage. By split I mean
>>>>>>> having the .pcm which contains AST and the DWARF for the types and then
>>>>>>> several .pcm.o’s for each target that contains e.g., IR for inline
>>>>>>> functions+debug info and the debug info in the various targets refers to the
>>>>>>> shared DWARF type info in the .pcm. As far as the debug info is concerned,
>>>>>>> we would use the same mechanisms for the .pcm.o files as we would for any
>>>>>>> other object that imports the module.
>>>>>> 
>>>>>> 
>>>>>> OK, I'm fine with that (though in our case I think we'll want to turn
>>>>>> this feature off and put all the DWARF output into the same file as the
>>>>>> inline functions etc). Do you have a plan for supporting debug fission with
>>>>>> this mode?
>>>>> 
>>>>> 
>>>>> The way I was thinking is that this is, in some sense, fission already.
>>>>> 
>>>>> We would put a simple module skeleton compile unit that represents the
>>>>> module in each object file compiled using that module - comdat it so it's
>>>>> dedup'd by the linker, and that would reference the pcm.o file just like we
>>>>> reference .dwo files today - and in there we'd have all the usual
>>>>> debug_types.dwo, etc.
>>>>> 
>>>>> So this /is/ fission.
>>>>> 
>>>>> If we wanted to split the debug info out from the module, I don't think
>>>>> this would really change - we'd just point at that other file instead.
>>>>> 
>>>>> (& when we eventually have inline functions and their debug info in the
>>>>> module, we could drop the comdat and just put the skeleton CU in that object
>>>>> file to be linked in directly (and to contain the debug info for those
>>>>> inline functions, etc))
>>>>> 
>>>>> Does that sound reasonable/make sense - I can flesh out some of the
>>>>> DWARF terminology I've used if it's unclear.
>>>> 
>>>> 
>>>> This is the answer I was hoping for / expecting, I just wanted to make
>>>> sure that this had been considered. To my mind, this means that it's neither
>>>> relevant nor necessary that the .pcm file is an ELF / MachO / COFF / etc.
>>>> object file, all that matters is that it's a file that DWARF readers are
>>>> able to read DWARF from (and a format that we can read Clang's PCM
>>>> information from). Does that give us any additional flexibility regarding
>>>> the format?
>>> 
>>> 
>>> My guess would be that this doesn't give us any additional flexibility
>>> today - I think GDB is the only implementation of Fission today and, while I
>>> don't know for sure, I don't have any reason to believe it can handle .dwo
>>> files in any format other than ELF (or perhaps generalized to any object
>>> file GDB can cope with on each platform it supports).
>>> 
>>>> 
>>>> One other change that I would like to be made with this one: fix
>>>> llvm-bcanalyzer so that it can read whatever file format we end up using for
>>>> .pcm files. We get several fringe benefits such as this from using bitcode,
>>>> and it would be unfortunate to lose them.
>>> 
>>> 
>>> Would it be sufficient to teach llvm-readelf or something to have options
>>> (if it doesn't have them already) to dump a specific section to stdout and
>>> you'd just pipe that to bcanalyzer?
>> 
>> 
>> That seems reasonable to me.
>> 
>>>> Adrian: have you looked at the file size increase for an empty module
>>>> from adding this wrapper format and skeleton/empty DWARF information? That'd
>>>> be an interesting data point (mostly just to assuage my concern here -- some
>>>> builds will have thousands of these files loaded, and a few dozen KiB per
>>>> PCM file adds up to a lot of address space).
>> 
>> 
>> 
>> That depends on the kind of container format that is used: In my experiment
>> a Mach-O wrapper for an empty module adds 280 bytes, and that is without the
>> debug information. With all the debug info sections, the overhead for an
>> empty Objective-C module is 1,556 bytes out of a total 131,788 bytes
>> (surprised me, but it turns out there are quite a bunch of implicit
>> definitions even in an empty module).
> 
> That's PR21397; we can get it down to 2736 bytes if we strip out all
> the unnecessary bits. 1.5KiB per pcm file seems fine to me.
> 
>> Here is the final version of the patch that should address all the points
>> raised so far in this thread. Any final comments/objections/suggestions
>> before this can go into trunk?
> 
> Using a fixed string literal as a key for the PCH data seems a bit
> weird to me. I'm also not sure a module flag is the best way to model
> this: can we instead emit this as a normal global, and keep the
> knowledge of its special section name within Clang?
That sounds like a good idea — I wasn’t aware that globals could be annotated with a section name.
I’ll give it a shot.
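
For illustration only, here is a source-level analogue of "a normal global with a special section name". It is a minimal sketch, not the patch itself: it assumes GCC or Clang (for the `section` and `used` attributes), and the section name `__clgast`, the variable `clang_ast_blob`, and the helper `blobHasPCHMagic` are hypothetical names picked for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical section name: the thread had not settled on one yet.
#if defined(__APPLE__)
#define AST_SECTION "__CLANG,__clgast"  // Mach-O expects "segment,section"
#else
#define AST_SECTION "__clgast"          // ELF and COFF take a bare section name
#endif

// Stand-in for the serialized AST. Real .pcm data starts with the magic
// "CPCH"; the trailing zero bytes are just filler for this sketch.
// `used` keeps the compiler from discarding the otherwise private global.
__attribute__((section(AST_SECTION), used))
static const unsigned char clang_ast_blob[] = {'C', 'P', 'C', 'H', 0, 0, 0, 0};

// Returns true if the blob begins with the CPCH magic number.
bool blobHasPCHMagic() {
  assert(sizeof(clang_ast_blob) >= 4);
  return std::memcmp(clang_ast_blob, "CPCH", 4) == 0;
}
```

Compiling this to an object file and listing its section headers (for example with `objdump -h`) should show the blob in its own section, so only the code that emits the global needs to know the name.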
> 
> I would imagine that you'll want to generate debug information via
> Clang's normal ASTConsumer -> CodeGen pipeline, which suggests that
> building a custom LLVM code generator and running it from within
> PCHGenerator::HandleTranslationUnit is the wrong choice. This should
> probably be handled by a FrontendAction that passes the bitcode
> generated by the PCHGenerator to CodeGen. That'd also avoid you
> needing to plumb a CompilerInstance into Serialization, which seems
> like a layering violation (Frontend depends on Serialization, not vice
> versa).

Thanks for the pointer. In this case both the code generator and the output buffer for the serialized AST are owned by a FrontendAction, which then wraps the buffer in a global and passes it on to CodeGen.
In order to wire up the debug information, since I can only have one ASTConsumer per FrontendAction, I will probably need to have a second FrontendAction for the debug info generator?
I’ll check out how clang works and how to communicate between FrontendActions.
I might be back with more questions :-)

> 
> For the section name (repeating a comment from before): I don't like
> "cfe"; we should indicate which compiler is involved here. I don't
> like "pch"; this is used for all kinds of AST files, not just PCH
> files. Can you pick a better name for the section?

That particular name was a suggestion by David Majnemer, who pointed out that COFF section names are restricted to 8 characters. I chose PCH because the magic number in .pcm files is CPCH.
The full list of constraints is:
- 8 characters (COFF)
- starts with __ (MachO)
- should indicate clang (for COFF; on ELF and MachO the section is inside a __CLANG segment)
- should indicate that this is an AST

If we want to use the same section name on all platforms, this leaves any combination of
__{cfe|clg|cln}{ast|pch|pcm|mod}
or maybe
__ast
__cpch
__llvmod
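
As a quick sanity check, the two mechanical constraints (at most 8 characters, leading `__`) can be tested against every candidate above. This is a rough sketch; the function names `fitsAllFormats` and `candidateSectionNames` are made up for this example, and the two "should indicate ..." constraints are a matter of taste that code cannot check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Limit on COFF section names, as stated in the constraint list above.
constexpr std::size_t kMaxCoffSectionName = 8;

// True if the name fits in 8 characters and starts with "__".
bool fitsAllFormats(const std::string &name) {
  return name.size() <= kMaxCoffSectionName && name.compare(0, 2, "__") == 0;
}

// Builds every __{cfe|clg|cln}{ast|pch|pcm|mod} combination, followed by
// the three standalone candidates from the list.
std::vector<std::string> candidateSectionNames() {
  const std::vector<std::string> compilers = {"cfe", "clg", "cln"};
  const std::vector<std::string> kinds = {"ast", "pch", "pcm", "mod"};
  std::vector<std::string> names;
  for (const auto &c : compilers)
    for (const auto &k : kinds)
      names.push_back("__" + c + k);
  for (const char *extra : {"__ast", "__cpch", "__llvmod"})
    names.push_back(extra);
  return names;
}
```

All fifteen candidates pass, while something more descriptive such as `__clangast` (10 characters) does not, which is why the abbreviations are so terse.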

other suggestions?
-- adrian