[cfe-dev] Controlling instantiation of templates from PCH
Lubos Lunak via cfe-dev
cfe-dev at lists.llvm.org
Tue May 28 02:31:19 PDT 2019
On Tuesday 28 of May 2019, David Blaikie wrote:
> So I'm not sure I understand this comment:
>
> "And, if you look carefully, 4 seconds more to generate code, most of it
> for those templates. And after the compiler spends all this time on
> templates in all the source files, it gets all passed to the linker, which
> will shrug and then throw most of it away (and that will too take a load of
> time, if you still happen to use the BFD linker instead of gold/lld
> <https://lists.freedesktop.org/archives/libreoffice/2018-July/080484.html>
> with -gsplit-dwarf -Wl,--gdb-index
> <https://lists.freedesktop.org/archives/libreoffice/2018-June/080437.html>).
> What a marvel."
>
> What extra code generation occurred with the PCH? Any change in generated
> code with a PCH would surprise me.
If I understand it correctly, my small testcase shows that adding a PCH
generally does not change the resulting object file; it only makes Clang
spend more time processing things that it later throws away as unused while
creating the object file. So there's no extra code generation caused by the
PCH. To make clearer what I meant there, it's more like saying that there's
a missed opportunity:
- Let's say that I have a library built from a.cpp and b.cpp, and both those
sources use std::vector< int >. As in, they really use it, so both a.o and
b.o end up with weak copies of std::vector< int > code.
- That seems to be basically inevitable with the normal non-PCH code, as the
Clang instance compiling a.cpp cannot know that std::vector< int > code will
also be present in b.o, and so compiling a.cpp and compiling b.cpp both
result in generating std::vector< int >, even though we can clearly see it's
unnecessary.
- I say it's basically inevitable in the non-PCH case, because I don't know a
reasonable way to avoid that in practice. There is extern template, which
would work in this minimal testcase, but for a real-world large codebase I
find that impractical, tedious and what not (please correct me if I'm wrong
and there is a reasonable way, but beware that I've already tried that and
decided that writing a compiler patch was an easier way of going about it).
- However, in the PCH case, both Clang instances do know that they share all
the template instantiations from the PCH. And that's where my patch steps in:
-fpch-template-instantiation=force tells one instance "take care of it all"
and -fpch-template-instantiation=skip tells all the other instances "don't
bother with those, somebody else will take care of that". So all but one of
the Clang instances can skip all those numerous
Sema::InstantiateFunctionDefinition() calls and also the code generation for
all of those instantiations that actually are used in that TU.
- To put it differently, you can also view -fpch-template-instantiation=skip
as automatic extern template for whatever is used by the PCH,
and -fpch-template-instantiation=force as explicit instantiation for it,
where all the hassle of extern template is replaced by just putting all the
template stuff in the PCH. (To be precise, it's not exactly like explicit
instantiation, because it involves only what is instantiated by the PCH, but
if desired that can be handled by actually explicitly instantiating in the
PCH, without having to bother with the extern template stuff.)
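
For reference, the manual extern template mechanism that the two options are
compared to above looks roughly like the sketch below. The names sum() and
use_sum() are made up for illustration and are not from the patch or the
testcase. Everything is squeezed into one file only so that it compiles on its
own; in a real codebase the template and the extern template line would sit in
a shared header, and the explicit instantiation definition would sit in exactly
one .cpp file.

```cpp
#include <cassert>
#include <vector>

// Hypothetical shared header content: a template that every TU can see.
template <typename T>
T sum(const std::vector<T>& values) {
    T total{};
    for (const T& v : values)
        total += v;
    return total;
}

// Explicit instantiation declaration ("extern template"): tells the compiler
// not to implicitly instantiate sum<int> in this TU, because some TU provides
// it. This is roughly the role the post assigns to the "skip" mode.
extern template int sum<int>(const std::vector<int>&);

// Explicit instantiation definition: the one place where the code for
// sum<int> is emitted. This is roughly the role the post assigns to the
// "force" mode. It must come after the declaration when both are in one TU.
template int sum<int>(const std::vector<int>&);

// Ordinary use; it resolves to the single explicitly instantiated sum<int>.
int use_sum() {
    const std::vector<int> values{1, 2, 3};
    const int total = sum(values);
    assert(total == 6);
    return total;
}
```

The tedious part in a large codebase is that every template specialization
needs its own pair of such lines, kept in sync by hand.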
--
Lubos Lunak