[cfe-dev] Controlling instantiation of templates from PCH

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Tue May 28 09:14:04 PDT 2019


That's a cool observation!

A question independent of the other discussions happening on this thread:
Since you're comparing build times between MSVC and clang, do you use
clang-cl in Windows builds? The clang-cl / cl.exe PCH flags (/Yc, /Yu)
would allow implementing your suggested optimization without a need for any
new driver flags.

This seems similar to applying the approach from
http://blog.llvm.org/2018/11/30-faster-windows-builds-with-clang-cl_14.html
to all inlines, not just to dllexported ones.

On Sat, May 25, 2019 at 3:32 PM Lubos Lunak via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

>
>  Hello,
>
>  I'm working on a Clang patch that can make C++ builds noticeably faster in
> some setups by allowing control over how templates are instantiated, but I
> have some problems finishing it and need advice.
>
>  Background: I am a LibreOffice developer. With precompiled headers enabled,
> e.g. for LO Calc, they save ~2/3 of build time when MSVC is used, but only
> ~10% with Clang. Moreover, the larger the PCH, the more time is saved with
> MSVC, but this is not so with Clang; in fact, larger PCHs often make things
> slower.
>
>  The recent -ftime-trace feature allowed me to investigate this, and it
> turns out that the time saved by having to parse less is outweighed by
> having to instantiate (many) more templates. You can see -ftime-trace graphs
> at
> http://llunak.blogspot.com/2019/05/why-precompiled-headers-do-not-improve.html
> (1st row - no PCH, 2nd row - small PCH, 3rd row - large PCH); the .json
> files are at http://ge.tt/7RHeLHw2 if somebody wants to see them.
>
>  Specifically, the time is spent in Sema::PerformPendingInstantiations()
> and Sema::InstantiateFunctionDefinition(). The vast majority of the
> instantiations come from the PCH itself. This means that the work is
> repeated for every TU using the PCH, and it also means that it's useless
> work, as the linker will discard all but one copy of each instantiation.
>
>  My WIP patch implements a new option to avoid that. The idea is that all
> sources using the PCH will be built with -fpch-template-instantiation=skip,
> which will prevent Sema::InstantiateFunctionDefinition() from actually
> instantiating templates coming from the PCH if they would be unneeded
> duplicates (note that this means almost all PCH template instantiations in
> the case of a developer build with -O0 -g, which is my primary use case).
> Then one extra source file is built with -fpch-template-instantiation=force,
> which will provide one copy of the instantiations. I assume that this is
> similar to how MSVC manages to have much better gains with PCH: the .obj
> created during PCH creation presumably contains single instantiations.
>
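For comparison, standard C++ already offers a manual, per-template form of
the same skip/force split: an explicit instantiation declaration
(`extern template`) tells a TU not to instantiate implicitly, and a single
explicit instantiation definition provides the one copy. The sketch below is
only a language-level analogy to the proposed option, not the patch itself.
`clamp_add` and `use_in_some_tu` are hypothetical names, and both halves are
shown in one file for brevity; in a real build the declaration would sit in
the shared header and the definition in one source file.

```cpp
// Hypothetical template standing in for something that lives in the PCH.
template <typename T>
T clamp_add(T a, T b, T limit) {
    T sum = a + b;
    return sum > limit ? limit : sum;
}

// "skip" side: promise that clamp_add<int> is instantiated elsewhere, so
// uses in this TU do not need to emit their own copy.
extern template int clamp_add<int>(int, int, int);

// An ordinary use; it refers to the single shared instantiation.
int use_in_some_tu() {
    return clamp_add(3, 4, 5);
}

// "force" side: the one explicit instantiation definition in the program.
// (It must come after the extern declaration when both are in one TU.)
template int clamp_add<int>(int, int, int);
```

The difference from the proposed option is that this has to be written out
for every specialization by hand, whereas the option would apply to whatever
the PCH happens to instantiate.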
>  In the -ftime-trace graphs linked above, the 4th row is large PCH with my
> patch. The compilation time saved by this is 50% and 60% for the two
> examples (and I think moving some templates into the PCH might get it to
> 70-75% for the second file).
>
>  As I said, I have some problems that prevent the patch from being fully
> usable, so in order to finish it, could somebody help me with the following:
>
> - I don't understand how it is controlled which kind of ctor/dtor is
> emitted (complete ctor vs base ctor, i.e. C1 vs C2 type in the Itanium ABI).
> I get many undefined references because the TU built with instances does
> not have both types, yet other TUs refer to them. How can I force both of
> them to be emitted?
>
> - I have an undefined reference to one template function that should be
> included in the TU with instances, but it isn't. The Sema part instantiates
> it and I could track it as far as getting generated in Codegen, but then
> I'm lost. I assume that it gets discarded because something in codegen or
> llvm considers it unused. Is there a place like that and where is it? Are
> there other places in codegen/llvm where I could check to see why this
> function doesn't get generated in the object file?
>
> - In Sema::InstantiateFunctionDefinition() the code for extern templates
> still instantiates a function if it has getContainedAutoType(), so my code
> should probably also check that. But I'm not sure what that is (is that
> 'auto foo() { return 1; }' ?) or why that would need an instance in every
> TU.
>
> - I used BENIGN_ENUM_LANGOPT because Clang otherwise complains that the PCH
> is used with a different option than what it was generated with (and using
> a different option is necessary in this case), but I'm not sure if this is
> the correct handling of the option.
>
> - Is there a simple rule for what decides that a template needs to be
> instantiated? As far as I can tell, even using a template as a class member
> or having an inline member function manipulating it doesn't. When I
> mentioned moving some templates into the PCH in order to get possible 70%
> savings, I actually don't know how to cause an instantiation from the PCH;
> the templates and their uses are included there.
>
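As a language-level reference point for the last question: the standard rule
is that a member function of a class template is implicitly instantiated only
when it is used in a way that requires its definition, while merely naming
the specialization as a class member instantiates just the class definition.
The sketch below illustrates that rule with hypothetical names (`Box`,
`Holder`); how it interacts with what Clang records as pending
instantiations while building a PCH is a separate question it does not
answer.

```cpp
#include <string>

template <typename T>
struct Box {
    T value;
    // Only well-formed when T supports unary minus. This body is not
    // instantiated for a given T unless negated() is actually used.
    T negated() const { return -value; }
    const T& get() const { return value; }
};

// Having Box<std::string> as a member instantiates the class definition
// (its member declarations), but none of the member function bodies.
struct Holder {
    Box<std::string> name;
    // Calling get() instantiates Box<std::string>::get() only.
    const std::string& read() const { return name.get(); }
};

// Uses Box<int>::negated(), so that member function is instantiated.
int negate_int() {
    Box<int> b{21};
    return b.negated();
}

// Never touches Box<std::string>::negated(), so the invalid "-value"
// for std::string is never instantiated and this compiles.
std::string read_name() {
    Holder h{{"pch"}};
    return h.read();
}
```

If `Box<std::string>::negated()` were called anywhere, the program would stop
compiling, which is an easy way to check whether a given use triggers an
instantiation.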
>  Thank you.
>
> --
>  Lubos Lunak
>
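On the C1 vs C2 question above, some background on why both variants can be
referenced from different TUs: in the Itanium ABI the complete object
constructor (C1) also constructs virtual bases, while the base object
constructor (C2) leaves them to the most derived class. The two differ only
for classes with virtual bases; otherwise compilers commonly emit one as an
alias of the other. The sketch below uses hypothetical types to show the two
situations. It does not answer how Clang decides which variant to emit.

```cpp
struct VBase {
    static int constructed;
    VBase() { ++constructed; }
};
int VBase::constructed = 0;

struct Middle : virtual VBase {
    Middle() {}
};

struct Most : Middle {
    Most() {}
};

// Middle is the most derived type here, so its complete object
// constructor (C1) runs and constructs VBase itself.
int count_for_middle() {
    VBase::constructed = 0;
    Middle m;
    (void)m;
    return VBase::constructed;
}

// Most is the most derived type here: its complete constructor builds
// VBase, then calls Middle's base object constructor (C2), which must
// not construct VBase a second time.
int count_for_most() {
    VBase::constructed = 0;
    Most d;
    (void)d;
    return VBase::constructed;
}
```

A TU that only creates `Middle` objects needs Middle's C1, while a TU that
only derives from `Middle` needs its C2, so a single TU meant to provide all
instances has to emit both.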