[cfe-dev] Controlling instantiation of templates from PCH

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Sun May 26 15:45:28 PDT 2019


On Sun, May 26, 2019 at 2:36 PM Richard Smith <richard at metafoo.co.uk> wrote:

> On Sun, 26 May 2019 at 13:26, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Thanks Richard - yeah sounds pretty similar though I'm a bit confused
>> about what's happening in this case, in part because I know next to nothing
>> about how Clang's PCH works (& especially how it differs from PCM/modules).
>>
>> Richard: why would any module or PCH cause a subsequent compilation to
>> perform more pending instantiations? (I would've thought/my understanding
>> was that nothing in the module would be used if it wasn't referenced from
>> the source file, so why would a pch cause more pending instantiations?)
>>
>
> Our design philosophy for modules and preamble precompilation is for a
> compilation using a precompiled header / preamble to behave identically to
> a compilation that parsed the header rather than using a precompiled form.
> So we don't perform end-of-translation-unit template instantiation at the
> end of a precompiled header, and instead perform the instantiation (and
> emit all the instantiated definitions and likewise all definitions of all
> used inline functions in the PCH) in all consumers of the PCH.
>

OK - thanks. That makes sense.

Though do you know if/how any of this could account for /more/ time spent
with pending instantiations with a PCH than without? (assuming the same
headers are included - and that's perhaps where the assumption is
incorrect/flawed, perhaps in Lubos's case the PCH is being added in
addition to the headers used in the non-PCH build, rather than instead of)
- and this shouldn't ever result in more/different bits in the object file
(assuming there's nothing with external linkage* in the PCH), right?

* no doubt more nuanced than that, but at least rough idea
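
To make the "instantiate in every consumer" behaviour concrete, here is a minimal sketch. It assumes a hypothetical header whose contents would be precompiled; the names `square`, `area_of_tile` and `consumer_value` are invented for illustration and come from neither LibreOffice nor Clang. Going by the description above, the inline function's use of `square<int>` is recorded when the header is processed, and the instantiation itself is carried out at the end of each translation unit that consumes the header, whether the header was parsed or loaded from a PCH.

```cpp
#include <cassert>

// --- hypothetical contents of the precompiled header ---

// A function template defined in the header.
template <typename T>
T square(T x) { return x * x; }

// An inline function defined in the header. Its body uses square<int>,
// so square<int> needs an implicit instantiation in any TU that sees it.
inline int area_of_tile(int side) { return square(side); }

// --- hypothetical contents of one consuming source file ---

// This TU uses the template directly as well; the instantiation of
// square<int> is shared between both uses within the TU.
int consumer_value() { return square(4); }
```

Every other source file built against the same header would repeat the instantiation of `square<int>`, and the linker would later keep a single copy.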


>
>
>> Lubos: Could you provide a small standalone example of this increase in
>> pending instantiations so it's a bit easier for me to understand the kind
>> of code & what's happening?
>> You mentioned in the blog post that the use of a PCH causes more
>> functions to be emitted into the final object file (than if a PCH had not
>> been used, and the source remained the same). Especially the possibility of
>> functions being emitted into the object file that are totally unused by the
>> object file. (again, I'm especially interested in comparing the non-PCH
>> with the PCH case here, rather than the Clang PCH with the VS PCH
>> situation) - those are situations that would be very surprising to me.
>>
>> On Sat, May 25, 2019 at 6:38 PM Richard Smith <richard at metafoo.co.uk>
>> wrote:
>>
>>> This seems like a nice idea, and has a lot in common with our existing
>>> "modular codegen" mode, which does largely the same thing but for PCMs
>>> rather than PCHs. I'd hope we could share a lot of the implementation
>>> between the two features.
>>>
>>> +David Blaikie, who implemented modular codegen and might be able to
>>> advise as to the best way to integrate similar functionality into our PCH
>>> support.
>>>
>>> On Sat, 25 May 2019 at 12:32, Lubos Lunak via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>>
>>>>  Hello,
>>>>
>>>>  I'm working on a Clang patch that can make C++ builds noticeably faster
>>>> in some setups by allowing control over how templates are instantiated,
>>>> but I have some problems finishing it and need advice.
>>>>
>>>>  Background: I am a LibreOffice developer. When enabling precompiled
>>>> headers, e.g. for LO Calc, precompiled headers save ~2/3 of build time
>>>> when MSVC is used, but with Clang they save only ~10%. Moreover, the
>>>> larger the PCH, the more time is saved with MSVC, but this is not so with
>>>> Clang; in fact, larger PCHs often make things slower.
>>>>
>>>>  The recent -ftime-trace feature allowed me to investigate this, and it
>>>> turns out that the time saved by having to parse less is outweighed by
>>>> having to instantiate (many) more templates. You can see -ftime-trace
>>>> graphs at
>>>> http://llunak.blogspot.com/2019/05/why-precompiled-headers-do-not-improve.html
>>>> (1st row - no PCH, 2nd row - small PCH, 3rd row - large PCH); the .json
>>>> files are at http://ge.tt/7RHeLHw2 if somebody wants to see them.
>>>>
>>>>  Specifically, the time is spent in Sema::PerformPendingInstantiations()
>>>> and Sema::InstantiateFunctionDefinition(). The vast majority of the
>>>> instantiations come from the PCH itself. This means that the work is
>>>> repeated for every TU using the PCH, and it also means that it is wasted
>>>> work, as the linker will discard all but one copy of each instantiation.
>>>>
>>>>  My WIP patch implements a new option to avoid that. The idea is that all
>>>> sources using the PCH will be built with
>>>> -fpch-template-instantiation=skip, which will prevent
>>>> Sema::InstantiateFunctionDefinition() from actually instantiating
>>>> templates coming from the PCH if they would be unneeded duplicates (note
>>>> that this means almost all PCH template instantiations in the case of a
>>>> developer build with -O0 -g, which is my primary use case). Then one
>>>> extra source file is built with -fpch-template-instantiation=force, which
>>>> will provide one copy of the instantiations. I assume that this is
>>>> similar to how MSVC manages to have much better gains with PCH: the .obj
>>>> created during PCH creation presumably contains single instantiations.
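
The skip/force split has a rough standard-language analogue in explicit instantiation declarations and definitions. The sketch below illustrates that analogue only; it does not show how the proposed -fpch-template-instantiation option works internally, and the names `twice` and `use_twice` are hypothetical. For brevity both roles are shown in one file; in a real build the last line would live in a single separate source file.

```cpp
#include <cassert>

// A function template, as it might appear in a shared header.
template <typename T>
T twice(T x) { return x + x; }

// Role comparable to "skip": an explicit instantiation declaration tells
// this TU that twice<int> is instantiated somewhere else, so it need not
// emit its own copy.
extern template int twice<int>(int);

// Ordinary code that uses the specialization.
int use_twice(int v) { return twice(v); }

// Role comparable to "force": an explicit instantiation definition that
// provides the single copy. It must follow the declaration when both
// appear in the same TU.
template int twice<int>(int);
```

The difference from the proposal is that here every specialization has to be listed by hand, whereas the option is meant to cover whatever the PCH happens to instantiate.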
>>>>
>>>>  In the -ftime-trace graphs linked above, the 4th row is the large PCH
>>>> with my patch. The compilation time saved by this is 50% and 60% for the
>>>> two examples (and I think moving some templates into the PCH might get it
>>>> to 70-75% for the second file).
>>>>
>>>>  As I said, I have some problems that prevent the patch from being fully
>>>> usable, so in order to finish it, could somebody help me with the
>>>> following:
>>>>
>>>> - I don't understand how it is controlled which kind of ctor/dtor is
>>>> emitted (complete ctor vs base ctor, i.e. C1 vs C2 type in the Itanium
>>>> ABI). I get many undefined references because the TU built with
>>>> instances does not have both types, yet other TUs refer to them. How can
>>>> I force both of them to be emitted?
>>>>
>>>> - I have an undefined reference to one template function that should be
>>>> included in the TU with instances, but it isn't. The Sema part
>>>> instantiates it and I could track it as far as getting generated in
>>>> Codegen, but then I'm lost. I assume that it gets discarded because
>>>> something in codegen or llvm considers it unused. Is there a place like
>>>> that and where is it? Are there other places in codegen/llvm where I
>>>> could check to see why this function doesn't get generated in the object
>>>> file?
>>>>
>>>> - In Sema::InstantiateFunctionDefinition() the code for extern templates
>>>> still instantiates a function if it has getContainedAutoType(), so my
>>>> code should probably also check that. But I'm not sure what that is (is
>>>> that 'auto foo() { return 1; }' ?) or why that would need an instance in
>>>> every TU.
>>>>
>>>> - I used BENIGN_ENUM_LANGOPT because Clang otherwise complains that the
>>>> PCH is used with a different option than what it was generated with,
>>>> which is necessary in this case, but I'm not sure if this is the correct
>>>> handling of the option.
>>>>
>>>> - Is there a simple rule for what decides that a template needs to be
>>>> instantiated? As far as I can tell, even using a template as a class
>>>> member or having an inline member function manipulating it doesn't. When
>>>> I mentioned moving some templates into the PCH in order to get possible
>>>> 70% savings, I actually don't know how to cause an instantiation from the
>>>> PCH, the templates and their uses are included there.
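
On the last question, here is a minimal sketch of the rule as the C++ language defines it. It does not describe what the patch or the PCH machinery does, and `Box` and `Holder` are hypothetical names. Using a class template specialization as a data member instantiates the class itself, but the definition of each member function is instantiated only when that function is used in a way that needs its definition. `Box<int>::reset()` would fail to compile if it were instantiated, yet the code below builds because nothing calls it.

```cpp
#include <cassert>

template <typename T>
struct Box {
    T value;

    // Would be ill-formed for T = int (int has no clear()), but the body
    // is only checked when this member is actually instantiated.
    void reset() { value.clear(); }

    // Instantiated for T = int because Holder::read() calls it.
    T get() const { return value; }
};

struct Holder {
    // Requires Box<int> to be a complete type, so the class (its member
    // declarations) is instantiated here, but no member function bodies.
    Box<int> box;

    // Calling get() is what triggers instantiation of Box<int>::get().
    int read() const { return box.get(); }
};
```

So a template that only appears as a member type contributes no function instantiations by itself; it is the calls inside function bodies that do.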
>>>>
>>>>  Thank you.
>>>>
>>>> --
>>>>  Lubos Lunak
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>
>

