[cfe-dev] Controlling instantiation of templates from PCH

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Wed May 29 09:20:06 PDT 2019


On Tue, May 28, 2019 at 2:31 AM Lubos Lunak <l.lunak at centrum.cz> wrote:

> On Tuesday 28 of May 2019, David Blaikie wrote:
> > So I'm not sure I understand this comment:
> >
> > "And, if you look carefully, 4 seconds more to generate code, most of it
> > for those templates. And after the compiler spends all this time on
> > templates in all the source files, it gets all passed to the linker, which
> > will shrug and then throw most of it away (and that will too take a load of
> > time, if you still happen to use the BFD linker instead of gold/lld
> > <https://lists.freedesktop.org/archives/libreoffice/2018-July/080484.html>
> > with -gsplit-dwarf -Wl,--gdb-index
> > <https://lists.freedesktop.org/archives/libreoffice/2018-June/080437.html>).
> > What a marvel."
> >
> > What extra code generation occurred with the PCH? Any change in generated
> > code with a PCH would surprise me.
>
>
> If I understand it correctly, the small testcase from me means that adding a
> PCH generally does not change the resulting object file; it only makes Clang
> spend more time processing something it throws away as unused at some point
> in the later stages of creating the object file, so there's no extra code
> generation caused by the PCH. So, to make it clearer what I meant there, it's
> more like saying that there's a missed opportunity:
>
> - Let's say that I have a library built from a.cpp and b.cpp, and both those
> sources use std::vector< int >. As in, they really use it, so both a.o and
> b.o end up with weak copies of std::vector< int > code.
> - That seems to be basically inevitable with the normal non-PCH code, as the
> Clang instance compiling a.cpp cannot know that std::vector< int > code will
> also be present in b.o, and so compiling both a.cpp and b.cpp results in
> generating std::vector< int >, even though we can clearly see it's
> unnecessary.
> - I say it's basically inevitable in the non-PCH case, because I don't know a
> reasonable way to avoid that in practice. There is extern template, which
> would work in this minimal testcase, but for a real-world large codebase I
> find that impractical, tedious and what not (please correct me if I'm wrong
> and there is a reasonable way, but beware that I've already tried that and
> decided that writing a compiler patch was an easier way of going about it).
> - However, in the PCH case, both Clang instances do know that they share all
> the template instantiations from the PCH. And that's where my patch steps in:
> -fpch-template-instantiation=force tells one instance "take care of it
> all" and -fpch-template-instantiation=skip tells all the other
> instances "don't bother with those, somebody else will take care of that". So
> all but one of the Clang instances can skip all those numerous
> Sema::InstantiateFunctionDefinition() calls and also code generation for all
> of those instantiations that are actually used in that TU.
> - To put it differently, you can also view -fpch-template-instantiation=skip
> as automatic extern template for whatever is used by the PCH,
> and -fpch-template-instantiation=force as explicit instantiation for it,
> where all the hassle of extern template is replaced by just putting all the
> template stuff in the PCH. (To be precise, it's not exactly like explicit
> instantiation, because it involves only what is instantiated by the PCH, but
> if wanted, that can be handled by actually explicitly instantiating in the
> PCH, without having to bother with the extern template stuff.)
>
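
For readers unfamiliar with the "extern template" mechanism referred to in
the quoted text, here is a minimal sketch of it. The Accumulator template and
the total_of function are hypothetical names invented for illustration, and
everything is collapsed into one file for brevity; in a real project the
template and the extern declaration would sit in a shared header and the
explicit instantiation definition in exactly one source file.

```cpp
#include <vector>

// Hypothetical header-defined template, standing in for something like
// std::vector<int> that several source files would otherwise instantiate.
template <typename T>
struct Accumulator {
    std::vector<T> items;

    void add(const T& value);
    T total() const;
};

// Members are defined outside the class so that they are not implicitly
// inline; extern template is only guaranteed to suppress implicit
// instantiation of non-inline entities.
template <typename T>
void Accumulator<T>::add(const T& value) {
    items.push_back(value);
}

template <typename T>
T Accumulator<T>::total() const {
    T sum{};
    for (const T& item : items) sum += item;
    return sum;
}

// Would live in the shared header: explicit instantiation declaration.
// Translation units that see this rely on a definition provided elsewhere.
extern template struct Accumulator<int>;

// Would live in exactly one source file: explicit instantiation definition.
// This is the single place where the member code is generated.
template struct Accumulator<int>;

// Ordinary user code: uses Accumulator<int> without instantiating it again.
int total_of(const std::vector<int>& values) {
    Accumulator<int> acc;
    for (int v : values) acc.add(v);
    return acc.total();
}
```

Calling total_of({1, 2, 3}) returns 6. The tedium described in the quoted
text comes from having to maintain one such declaration/definition pair by
hand for every template specialization a codebase uses.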

Ah, OK. Was this an indirect benefit of the feature/patch you created, or
did you specifically code for it in addition to moving the pending
instantiations out to the separate PCH processing stage (rather than handling
them in every compilation that uses the PCH)? Or did it come out as a happy
coincidence?

In any case, after talking to Richard Smith about all this, here are some things:

* Clang header modules don't have the pending instantiation performance
problem described here - because they handle the pending instantiations at
the end of building the module, rather than in every consumer.
* It's possible moving PCH to the modules semantics might be valid in
general, or good enough to put behind a flag. (doing this in general would
of course be easier, code-wise - fewer supported code paths, etc)
* Moving the pending instantiation processing to the end of the PCH would
make PCH generation a little slower, but given that a project would only have
one PCH, that might not be a huge problem.
* In addition to that, we could support -fmodules-codegen/debuginfo - which
would implement the "building an object from the PCH" step you've described
using existing infrastructure in Clang (& that would then include other
non-template inline functions, so it'd be a bit broader)
* Then we could potentially do something more like what you're proposing
here - if modules-codegen is used, defer pending instantiations from the
initial module/PCH creation step, to the module/PCH-to-object step, to
speed up module/PCH generation & unblock the downstream compilations that
use it

- Dave