r241620 - Wrap clang modules and pch files in an object file container.

Mon Jul 13 19:43:10 PDT 2015

On Mon, Jul 13, 2015 at 7:27 PM Richard Smith <richard at metafoo.co.uk> wrote:

> On Mon, Jul 13, 2015 at 6:02 PM, Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> On Jul 13, 2015, at 5:47 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Mon, Jul 13, 2015 at 3:06 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>
>>> > On Jul 13, 2015, at 2:00 PM, Eric Christopher <echristo at gmail.com>
>>> wrote:
>>> >
>>> > Hi Adrian,
>>> >
>>> > Finally getting around to looking at some of this and I think it's
>>> going in slightly the wrong direction. In general I think being -able- to
>>> put modules in object files to simplify wrapping, use, etc is a good thing.
>>> I think being required to do so is somewhat problematic.
>>> >
>>>
>>> Let me start by noting that the current infrastructure already allows
>>> selecting whether you want wrapped modules or not by passing the
>>> appropriate PCHContainerOperations object to CompilerInstance. Clang
>>> currently unconditionally uses an object file wrapper; all of
>>> clang-tools-extra doesn’t. We could easily control the behavior of clang
>>> based on a (new) command line option.
>>>
>>> But.. on a platform with a shared module cache you always have to assume
>>> that a module once built will eventually be used by a client that wants to
>>> read the debug info. Think llvm-dsymutil — it does not know and does not
>>> want to know how to build clang modules, but does want to read all the
>>> debug info from a clang module.
>>>
>>> > Imagine, for example, you have a giant distributed build system...
>>> >
>>> > You'd want to create a pile of modules (that may reference/include/etc
>>> other modules) that don't or may not have debug information as part
>>> of them (because you might want to build without it or have the debug info
>>> alongside it as a separate compilation). Waiting on the full build of the
>>> module including debug is going to adversely affect your overall build time
>>> and so shouldn't be necessary - especially if you want to be able to have
>>> information separate ultimately.
>>> >
>>> > Make sense?
>>>
>>> Not sure if you would be saving much by having the debug info
>>> separately; from what I’ve measured so far, the debug info for a module
>>> makes up less than 10% of the total size. Admittedly, build-time-wise going
>>> through the backend to emit the object file is a lot more expensive than
>>> just dumping the raw PCH. [1]
>>>
>>> Yeah, I think wanting to be able to control the behavior is reasonable,
>>> we just need to be careful what the implications for consumers are. If we
>>> add, e.g., an “-fraw-modules” [2] switch to clang to turn off the
>>> object file wrapping, I’d strongly suggest that we add the value of this
>>> switch to the module hash (or add an optional “-g” to the module file
>>> name after the hash or something like that) to avoid ugly race conditions
>>> between debug info and non-debug-info builds of the same module. This way
>>> we’d have essentially two separate module caches, with and without debug
>>> info.
>>>
>>
>> That's fine, I think (we don't use a module cache at all in our build
>> system; it doesn't really make much sense for a distributed build) and most
>> command-line flag changes already have this effect.
>>
>>
>> Great!
>>
>>
>>
>>> would that work for you?
>>> -- adrian
>>>
>>> [1] If you want to be serious about building the module debug info in
>>> parallel to the rest of the build, you could even have a clang-based tool
>>> import the just-built raw clang module and emit the debug info without
>>> having to parse the headers again :-)
>>>
>>
>> That is what we intend to do :) (Assuming this turns out to actually be
>> faster than re-parsing; faulting in the entire contents of a module has
>> much worse locality than parsing.)
>>
>> [2] -fraw-modules, -fmodule-format-raw, -fmodule-debug-info, ...?
>>>     I would imagine that the driver enables module debug info when
>>> “-gmodules” is present and by default on Darwin.
>>
>>
>> That seems reasonable to me. For the frontend flag, I think a flag to
>> turn this on or to select the module format makes more sense than a flag to
>> switch to the raw format.
>>
>>
>> Okay then let’s narrow this down. Other possibilities in that direction
>> include (sorted from subjectively best to worst)
>>
>> -fmodule-format=obj
>> -fmodule-debug-info
>> -ffat-modules
>> -fmodule-container
>> -fmodule-container-object
>>
>
> It's a -cc1 flag, so it doesn't really matter much.
>

But eventually we need a reasonable way to support non-implicit-cache usage
of modules without passing -cc1 flags.

For some build systems and environments, I actually suspect that
non-implicit-cache builds will be the default.
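
For anyone skimming the thread, the suggestion upthread (fold the module
format into the module hash so that wrapped and raw builds never share a
cache entry) could look roughly like the sketch below. This is only an
illustration under stated assumptions: ModuleOptions, ModuleFormat, fnv1a and
moduleCachePath are hypothetical names, the FNV-1a hash is a stand-in, and
none of this is Clang's actual module hashing code.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the container format selected on the command line.
enum class ModuleFormat { Raw, Obj };

// Hypothetical subset of the options that would feed the module hash.
struct ModuleOptions {
  std::string Triple;
  std::string LangStd;
  ModuleFormat Format;
};

// FNV-1a, used here only because it is short and deterministic.
inline std::uint64_t fnv1a(const std::string &S,
                           std::uint64_t H = 14695981039346656037ULL) {
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// Build <CacheDir>/<hash>/<ModuleName>.pcm, where the hash covers the format.
// Two builds that differ only in Format land in different subdirectories,
// which gives the "two separate module caches" behavior discussed above.
inline std::string moduleCachePath(const std::string &CacheDir,
                                   const std::string &ModuleName,
                                   const ModuleOptions &Opts) {
  std::uint64_t H = fnv1a(Opts.Triple);
  H = fnv1a(Opts.LangStd, H);
  H = fnv1a(Opts.Format == ModuleFormat::Obj ? "obj" : "raw", H);
  std::ostringstream OS;
  OS << CacheDir << '/' << std::hex << H << '/' << ModuleName << ".pcm";
  return OS.str();
}
```

With this shape, a debug-info consumer only ever looks in the "obj" half of
the cache and cannot race with a build that wrote a raw module of the same
name.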