[cfe-dev] JumboSupport: making unity builds easier in Clang

via cfe-dev cfe-dev at lists.llvm.org
Wed Apr 11 10:08:42 PDT 2018

If you want to share ASTs (an ephemeral structure), clang would need to do the distributing.  If you want to share the IR of instantiated templates, you can use a shared database, where clang is much less involved in managing the distribution.  Perhaps the database key could be a hash of the token stream of the template definition, plus the template parameters?  Then you can pull precompiled IR out of the database (if you want to do optimizations) or make a reference to it (if you're doing LTO).
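
As a rough illustration of the key idea, here is a minimal sketch, assuming the template definition is already available as a list of token spellings and the template parameters as strings. The names `mixByte`, `mixString` and `instantiationKey` are made up for this example and are not clang APIs. FNV-1a is used only to keep the sketch short; a real shared database would probably want a stronger hash.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Standard 64-bit FNV-1a constants.
constexpr std::uint64_t kFnvOffset = 14695981039346656037ULL;
constexpr std::uint64_t kFnvPrime = 1099511628211ULL;

// One FNV-1a step: fold a single byte into the running hash.
inline std::uint64_t mixByte(std::uint64_t h, unsigned char b) {
    return (h ^ b) * kFnvPrime;
}

// Fold a whole string, followed by a 0 terminator so that the token
// sequences {"a", "b"} and {"ab"} feed different byte streams.
inline std::uint64_t mixString(std::uint64_t h, const std::string& s) {
    for (unsigned char c : s) h = mixByte(h, c);
    return mixByte(h, 0);
}

// Hypothetical database key: hash of the definition's token stream,
// a boundary marker, then the template parameters of the instantiation.
inline std::uint64_t instantiationKey(
    const std::vector<std::string>& definitionTokens,
    const std::vector<std::string>& templateArgs) {
    std::uint64_t h = kFnvOffset;
    for (const auto& t : definitionTokens) h = mixString(h, t);
    h = mixByte(h, 1);  // separates the definition from the arguments
    for (const auto& a : templateArgs) h = mixString(h, a);
    return h;
}
```

Because the input is a token stream rather than raw text, whitespace and comment changes in the header would not change the key, while any change to the definition or to the parameters would (up to hash collisions).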

From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of David Blaikie via cfe-dev
Sent: Wednesday, April 11, 2018 11:09 AM
To: David Chisnall
Cc: Bruce Dawson; Daniel Cheng; richard at metafoo.co.uk; cfe-dev at lists.llvm.org; Daniel Bratell; Jens Widell
Subject: Re: [cfe-dev] JumboSupport: making unity builds easier in Clang

This would have issues with distributed builds, though, right? Unless clang then took on the burden of doing the distribution too, which might be a bit much.
On Wed, Apr 11, 2018 at 12:43 AM David Chisnall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
On 10 Apr 2018, at 21:28, Daniel Bratell via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> I've heard (hearsay, I admit) from profiling that it seems the single largest time consumer in clang is template instantiation, something I assume can't easily be prepared in advance.
> One example is chromium's chrome/browser/browser target, which is 732 files that normally need 6220 CPU seconds to compile, an average of 8.5 seconds per file. Combining them all gives a single translation unit that takes 400 seconds to compile, a mere 0.54 seconds on average per file. That indicates that about 8 seconds per compiled file is related to the processing of headers.

It sounds as if there are two things here:

1. The time taken to parse the headers
2. The time taken to repeatedly instantiate templates that the linker will then discard

Assuming a command line where all of the relevant source files are provided to the compiler invocation:

Solving the first one is relatively easy if the files have a common prefix (which can be determined by simple string comparison).  Find the common prefix in the source files, build the clang AST, and then do a clone for each compilation unit.  Hopefully, the clone is a lot cheaper than re-parsing (and can ideally share source locations).
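
The "simple string comparison" step could look roughly like the following sketch, which assumes the source files have already been read into strings. `commonPrefixLength` and `commonLinePrefix` are hypothetical names for illustration, not anything in clang. The sketch only finds the textual prefix; it says nothing about whether the preprocessor state entering that prefix is the same for every file, which a real implementation would also have to check.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Number of leading characters shared by every source string.
inline std::size_t commonPrefixLength(const std::vector<std::string>& sources) {
    if (sources.empty()) return 0;
    const std::string& first = sources.front();
    std::size_t len = first.size();
    for (const auto& s : sources) {
        len = std::min(len, s.size());
        // Compare only the first `len` characters, which both strings have.
        auto mm = std::mismatch(first.begin(), first.begin() + len, s.begin());
        len = static_cast<std::size_t>(mm.first - first.begin());
    }
    return len;
}

// The shared prefix, cut back to the last complete line so that the
// common part never ends in the middle of a directive or declaration.
inline std::string commonLinePrefix(const std::vector<std::string>& sources) {
    const std::size_t n = commonPrefixLength(sources);
    if (n == 0) return std::string();
    const std::string& first = sources.front();
    const std::size_t lastNewline = first.rfind('\n', n - 1);
    if (lastNewline == std::string::npos) return std::string();
    return first.substr(0, lastNewline + 1);
}
```

For two files that start with the same two `#include` lines and then diverge, `commonLinePrefix` returns exactly those two lines, which is the part that would be parsed once and cloned.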

The second is slightly more difficult, because it relies on sharing parts of the AST across notional compilation units.

To make this work well with incremental builds, ideally you’d spit out all of the common template instantiations into a separate IR file, which could then be used with ThinLTO.

Personally, I would prefer to have an interface where a build system can invoke clang with all of the files that need building and the degree of parallelism to use and let it share as much state as it wants across builds.  In an ideal world, clang would record which templates have been instantiated in a prior build (or a previous build step in the current build) and avoid any IRGen for them, at the very least.

Old C++ compilers, predating linker support for COMDATs, emitted templates lazily, simply emitting references to them, then parsing the linker errors and generating missing implementations until the linker errors went away.  Modern C++ compilers generate many instantiations of the same templates and then discard most of them.  It would be nice to find an intermediate point, which worked well with ThinLTO, where templates could be emitted once and be available for inlining everywhere.
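
For comparison, standard C++ already offers a manual, per-specialization way to emit a template once: an explicit instantiation declaration (`extern template`) in the header and a single explicit instantiation definition in one source file. The sketch below puts both in one file so that it compiles on its own; `Accumulator` and `sumOfThree` are invented for the example. The members are defined outside the class body because the suppression does not apply to inline functions.

```cpp
template <typename T>
class Accumulator {
public:
    void add(const T& v);
    T total() const;

private:
    T total_{};
};

// Out-of-class definitions: these are not implicitly inline.
template <typename T>
void Accumulator<T>::add(const T& v) { total_ += v; }

template <typename T>
T Accumulator<T>::total() const { return total_; }

// Would live in the shared header: asks each including translation unit
// not to implicitly instantiate the members of Accumulator<int>.
extern template class Accumulator<int>;

// Would live in exactly one .cpp file: the one explicit instantiation
// definition. It must follow the declaration when both appear in one file.
template class Accumulator<int>;

// An ordinary user of the specialization.
inline int sumOfThree() {
    Accumulator<int> a;
    a.add(1);
    a.add(2);
    a.add(3);
    return a.total();
}
```

This has to be written by hand for every specialization, so it is not the automatic intermediate point described above, only the closest thing the language provides today.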

