[cfe-dev] Controlling instantiation of templates from PCH

Lubos Lunak via cfe-dev cfe-dev at lists.llvm.org
Tue May 28 12:38:37 PDT 2019


On Tuesday 28 of May 2019, Nico Weber wrote:
> That's a cool observation!
>
> A question independent of the other discussions happening on this thread:
> Since you're comparing build times between MSVC and clang, do you use
> clang-cl in Windows builds?

 We do have support for clang-cl AFAIK, but I don't know how much it's used. 
The release binaries are built with MSVC and I think most developers are on 
Unix-likes anyway. I've never used clang-cl myself.

> The clang-cl / cl.exe PCH flags (/Yc, /Yu) 
> would allow implementing your suggested optimization without a need for any
> new driver flags.

 /Yc /Yu could act that way without extra flags, but I think they don't. I've 
just skimmed over the sources, so I may be mistaken, but it seems to me 
the /Yc mode is not different there. Unless /Yc somehow already instantiates 
everything in the PCH and avoids such instantiations in TUs using the PCH, 
there is still going to be the cost of Sema::PerformPendingInstantiations() 
doing something that's not needed. Remember that this is actually about 
improving the build time, not necessarily the build result.
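
 As a rough illustration of the "instantiate once, skip everywhere else" 
idea, here is a minimal sketch in standard C++ using explicit instantiation 
declarations and definitions. This is only an analogy and not what the patch 
or /Yc does internally; all names (twice, useInt, useString) are hypothetical, 
and in a real build the "extern template" lines would sit in the shared 
header while the explicit instantiation definitions would live in exactly 
one source file.

```cpp
// Hypothetical names; an analogy in standard C++, not the proposed option.
#include <string>

// Stands in for a template that is defined in the precompiled header.
template <typename T>
T twice(const T& value) {
    return value + value;
}

// "skip" analogue: explicit instantiation declarations ask every TU that
// sees them not to implicitly instantiate these specializations itself.
extern template int twice<int>(const int&);
extern template std::string twice<std::string>(const std::string&);

// "force" analogue: explicit instantiation definitions. In a real project
// these two lines would appear in one source file only, which then provides
// the single copy that all other TUs link against.
template int twice<int>(const int&);
template std::string twice<std::string>(const std::string&);

// Ordinary users of the template; they rely on the definitions above.
int useInt() { return twice(21); }
std::string useString() { return twice(std::string("ab")); }
```

 The difference is that here the specializations have to be listed by hand, 
whereas the point of the option is to do that automatically for whatever the 
PCH ends up needing.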

> This seems similar to doing
> http://blog.llvm.org/2018/11/30-faster-windows-builds-with-clang-cl_14.html
> for all inlines, not just for dllexported ones.

 I think that's different. That one is like -fvisibility-inlines-hidden, which 
only causes inlines not to be exported. But they will still be processed.


> On Sat, May 25, 2019 at 3:32 PM Lubos Lunak via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> >  Hello,
> >
> >  I'm working on a Clang patch that can make C++ builds noticeably faster
> > in some setups by allowing control over how templates are instantiated,
> > but I have some problems finishing it and need advice.
> >
> >  Background: I am a LibreOffice developer. With precompiled headers
> > enabled, e.g. for LO Calc, they save ~2/3 of build time when MSVC is
> > used, but with Clang they save only ~10%. Moreover, the larger the PCH,
> > the more time is saved with MSVC, but this is not so with Clang; in fact
> > larger PCHs often make things slower.
> >
> >  The recent -ftime-trace feature allowed me to investigate this, and it
> > turns out that the time saved by having to parse less is outweighed by
> > having to instantiate (many) more templates. You can see -ftime-trace
> > graphs at
> > http://llunak.blogspot.com/2019/05/why-precompiled-headers-do-not-improve.html
> > (1st row - no PCH, 2nd row - small PCH, 3rd row - large PCH); the .json
> > files are at http://ge.tt/7RHeLHw2 if somebody wants to see them.
> >
> >  Specifically, the time is spent in Sema::PerformPendingInstantiations()
> > and Sema::InstantiateFunctionDefinition(). The vast majority of the
> > instantiations come from the PCH itself. This means that the work is
> > repeated for every TU using the PCH, and it also means that it's useless
> > work, as the linker will discard all but one copy of each instantiation.
> >
> >  My WIP patch implements a new option to avoid that. The idea is that all
> > sources using the PCH will be built with
> > -fpch-template-instantiation=skip, which will prevent
> > Sema::InstantiateFunctionDefinition() from actually instantiating
> > templates coming from the PCH if they would be unneeded duplicates (note
> > that this means almost all PCH template instantiations in the case of a
> > developer build with -O0 -g, which is my primary use case). Then one
> > extra source file is built with -fpch-template-instantiation=force, which
> > will provide one copy of the instantiations. I assume that this is
> > similar to how MSVC manages to have much better gains with PCH: the .obj
> > created during PCH creation presumably contains single instantiations.
> >
> >  In the -ftime-trace graphs linked above, the 4th row is large PCH with
> > my patch. The compilation time saved by this is 50% and 60% for the two
> > examples (and I think moving some templates into the PCH might get it to
> > 70-75% for the second file).
> >
> >  As I said, I have some problems that prevent the patch from being fully
> > usable, so in order to finish it, could somebody help me with the
> > following:
> >
> > - I don't understand how it is controlled which kind of ctor/dtor is
> > emitted (complete ctor vs base ctor, i.e. C1 vs C2 type in the Itanium
> > ABI). I get many undefined references because the TU built with instances
> > does not have both types, yet other TUs refer to them. How can I force
> > both of them to be emitted?
> >
> > - I have an undefined reference to one template function that should be
> > included in the TU with instances, but it isn't. The Sema part
> > instantiates it and I could track it as far as getting generated in
> > CodeGen, but then I'm lost. I assume that it gets discarded because
> > something in CodeGen or LLVM considers it unused. Is there a place like
> > that, and where is it? Are there other places in CodeGen/LLVM where I
> > could check to see why this function doesn't get generated in the object
> > file?
> >
> > - In Sema::InstantiateFunctionDefinition() the code for extern templates
> > still instantiates a function if it has getContainedAutoType(), so my
> > code should probably also check that. But I'm not sure what that is (is
> > that 'auto foo() { return 1; }'?) or why that would need an instance in
> > every TU.
> >
> > - I used BENIGN_ENUM_LANGOPT because Clang otherwise complains that the
> > PCH is used with a different option than what it was generated with,
> > which is necessary in this case, but I'm not sure if this is the correct
> > handling of the option.
> >
> > - Is there a simple rule for what decides that a template needs to be
> > instantiated? As far as I can tell, even using a template as a class
> > member or having an inline member function manipulating it doesn't. When
> > I mentioned moving some templates into the PCH in order to get possible
> > 70% savings, I actually don't know how to cause an instantiation from the
> > PCH; the templates and their uses are included there.
> >
> >  Thank you.
> >
> > --
> >  Lubos Lunak



-- 
 Lubos Lunak


