[cfe-dev] JumboSupport: making unity builds easier in Clang

Tue Apr 10 13:28:38 PDT 2018

As a data point: Inside Chromium the time to process headers is typically  
80-95% of the total time processing a cc file. Maybe not surprising when  
the headers are around 240k lines, and the cc files themselves 50-500  
lines. Most of the compile time remained even with precompiled headers on  
Windows.

I've heard (hearsay, I admit) from profiling that it seems the single  
largest time consumer in clang is template instantiation, something I  
assume can't easily be prepared in advance.

One example is chromium's chrome/browser/browser target which is 732 files  
that normally need 6220 CPU seconds to compile, average 8.5 seconds per  
file. All combined together gives a single translation unit that takes 400  
seconds to compile, a mere 0.54 seconds on average per file. That  
indicates that about 8 seconds per compiled file is related to the  
processing of headers.
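
As a rough cross-check of the arithmetic above, here is a small C++ sketch. The three input figures are the ones quoted for chrome/browser/browser; the function and constant names are made up for illustration, and attributing the whole difference to header processing is the same simplification the paragraph above makes.

```cpp
// Figures quoted above for chrome/browser/browser.
constexpr double kFileCount = 732.0;
constexpr double kSeparateCpuSeconds = 6220.0;  // one translation unit per file
constexpr double kCombinedCpuSeconds = 400.0;   // all files in one translation unit

// Average CPU seconds per source file for a given total.
constexpr double SecondsPerFile(double total_seconds) {
  return total_seconds / kFileCount;
}

// Per-file time that disappears when the headers are processed only once.
constexpr double HeaderSecondsPerFile() {
  return SecondsPerFile(kSeparateCpuSeconds) -
         SecondsPerFile(kCombinedCpuSeconds);
}

// Share of the normal per-file time attributed to header processing.
constexpr double HeaderShare() {
  return HeaderSecondsPerFile() / SecondsPerFile(kSeparateCpuSeconds);
}
```

With these inputs the per-file averages come out at about 8.50 and 0.546 seconds, so roughly 7.95 seconds per file (about 94% of the normal per-file time) goes away, which sits at the upper end of the 80-95% range mentioned at the top.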

Our default jumbo configuration makes groups of 8 files (when having  
access to Google's internal distributed compilation system) or 50 files  
(for single computer compilation). That gives up half or more of the  
potential speedup in exchange for a much faster single-file turnaround  
and better use of parallel hardware.
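
To make the grouping concrete, here is a minimal C++ sketch of splitting a target's source list into jumbo files. It is an illustration only, not the generator Chromium actually uses; `MakeJumboFiles` and its `merge_limit` parameter are hypothetical names (the latter loosely modelled on the "jumbo_file_merge_limit" setting mentioned further down in the thread).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits `sources` into consecutive groups of at most `merge_limit` files and
// returns the text of one jumbo source file per group. Each jumbo file simply
// #includes the real source files of its group.
std::vector<std::string> MakeJumboFiles(const std::vector<std::string>& sources,
                                        std::size_t merge_limit) {
  assert(merge_limit > 0);
  std::vector<std::string> jumbo_files;
  for (std::size_t begin = 0; begin < sources.size(); begin += merge_limit) {
    const std::size_t end = std::min(sources.size(), begin + merge_limit);
    std::string body;
    for (std::size_t i = begin; i < end; ++i) {
      body += "#include \"" + sources[i] + "\"\n";
    }
    jumbo_files.push_back(body);
  }
  return jumbo_files;
}
```

For the 732-file target above, a limit of 50 yields 15 jumbo files and a limit of 8 yields 92, so the smaller limit leaves far more independent compile jobs to spread over a distributed build.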

To comment on some earlier things mentioned: The value of jumbo is in the  
results, the massive compile time speedup. It can also be used for "cheap"  
full program/module optimization (I measured a 1-2% speedup on Speedometer  
with a jumbo build, along with a 2% increase of the binary size, all  
compared to a normal non-PGO/LTO/FPO build) and it reduces disk usage and  
makes linking faster, but the main point for us is that it makes  
compilations so much faster.

The main downside is that you have to adjust the source slightly, and  
"slightly" in a code base of 10-20 million lines can be noticeable. That  
is where this proposed clang feature comes in. It would both reduce the  
number of initial changes needed and remove the distraction of a  
developer having to consider code in other files when writing new code.
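
To illustrate the kind of adjustment meant here, a minimal sketch of two source files as they look once they share a jumbo translation unit. The file names and identifiers are hypothetical. Within one translation unit, all unnamed namespaces in the same enclosing scope are the same namespace, so two files that each define a helper with the same name in `namespace { }` collide once they are #included together; without compiler support, the fix is a manual rename like the one shown.

```cpp
// Hypothetical a.cc. Before the jumbo conversion its constant was named
// kGroupSize, the same name that b.cc used.
namespace {
constexpr int kDistributedGroupSize = 8;
}  // namespace

int DistributedGroupSize() { return kDistributedGroupSize; }

// Hypothetical b.cc, #included into the same jumbo translation unit.
// This unnamed namespace is the same namespace as the one above, so keeping
// the original name kGroupSize in both files would be a redefinition error.
namespace {
constexpr int kLocalGroupSize = 50;
}  // namespace

int LocalGroupSize() { return kLocalGroupSize; }
```

With the proposed feature, each file's anonymous namespace would be kept distinct by the compiler, so both files could presumably keep the original name and the rename would not be needed.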

/Daniel

On Tue, 10 Apr 2018 21:40:30 +0200, Bruce Dawson  
<brucedawson at chromium.org> wrote:

>> you'd still repeatedly lex and preprocess the files #included into both  
>> source files
>
> That is where the high cost of translation units comes from, so I don't
> think the 'ability to parse one file, then make it "invisible"' will
> help build performance. To be clear, the per-translation unit cost is
> not from firing up the compiler, it's from parsing/lexing/preprocessing
> millions of lines of header files, and associated code generation.
>
>> With a unity build, you now instead need to rebuild the concatenation  
>> of that .cc file and a bunch of others.
>
> True. But a pragmatic unity/jumbo build system understands and manages
> this risk, by keeping the number of source files that are #included down
> to a reasonable level. Even when jumbo concatenates 50 source files
> together the compilation cost for that blob is far less than 50 times
> the cost of compiling one file. It's an issue, to be sure, but not a
> fatal flaw.
>
> On Tue, Apr 10, 2018 at 12:13 PM Richard Smith <richard at metafoo.co.uk>  
> wrote:
>> On 10 April 2018 at 10:05, Nico Weber via cfe-dev  
>> <cfe-dev at lists.llvm.org> wrote:
>>> On Tue, Apr 10, 2018 at 1:01 PM, David Blaikie <dblaikie at gmail.com>  
>>> wrote:
>>>>
>>>>
>>>> On Tue, Apr 10, 2018 at 9:58 AM Nico Weber <thakis at chromium.org>
>>>> wrote:
>>>>> On Tue, Apr 10, 2018 at 11:56 AM, David Blaikie <dblaikie at gmail.com>  
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 10, 2018 at 8:52 AM Mostyn Bramley-Moore
>>>>>> <mostynb at vewd.com> wrote:
>>>>>>> On Tue, Apr 10, 2018 at 4:27 PM, David Blaikie  
>>>>>>> <dblaikie at gmail.com> wrote:
>>>>>>>> I haven't looked at the patches in detail - but generally a jumbo
>>>>>>>> build feels like a bit of a workaround & maybe there are better
>>>>>>>> long-term solutions that might fit into the compiler. A few sort
>>>>>>>> of background questions:
>>>>>>>>
>>>>>>>> * Have you tried Clang header modules (
>>>>>>>> https://clang.llvm.org/docs/Modules.html )? (explicit (granted,
>>>>>>>> explicit might only be practical at the moment using Google's
>>>>>>>> internal version of Bazel - but you /might/ get some comparison
>>>>>>>> numbers from a Google Chrome developer) and implicit)
>>>>>>>> * The doc talks about maybe disabling jumbo builds for a single
>>>>>>>> target for developer efficiency, with the risk that a header edit
>>>>>>>> would maybe be worse for the developer than the jumbo build - this
>>>>>>>> is where modules would help as well, since it doesn't have this
>>>>>>>> tradeoff property of two different dimensions of "more work" you
>>>>>>>> have to choose from.
>>>>>>>
>>>>>>> There are ways to minimise this- an earlier proprietary jumbo
>>>>>>> build system used at Opera would detect when you're modifying and
>>>>>>> rebuilding files, and compile these in "normal" mode.  This gave
>>>>>>> fast full/clean build times but also short modify+rebuild times.
>>>>>>> We have not attempted to implement this in the Chromium Jumbo
>>>>>>> build configuration.
>>>>>>
>>>>>> Building that kind of infrastructure seems like a pretty big hammer  
>>>>>> compared to modularizing the codebase...
>>>>>
>>>>> Modularizing the codebase doesn't give you the same build time  
>>>>> impact, linearizes your build more,
>>>>
>>>> Not sure I follow - it partially linearizes (as you say, due to the
>>>> module dependency rather than header dependency issue), as does the
>>>> jumbo build.
>>>
>>> The jumbo build just needs to append a bunch of files, that's fast.  
>>> Compiling a module isn't.
>>
>> Well, compiling a module is just appending a bunch of headers and  
>> compiling them. It's just at a different layer of the graph.
>>
>>>>> and slows down incremental builds.
>>>>
>>>> Compared to a traditional build? I wouldn't think so (I mean, yes,
>>>> reading/writing modules has some overhead - but also some gains) on
>>>> average. I'd expect slower builds if you modify a header at the
>>>> very base of the dependency (the STL), but beyond that I would've
>>>> thought the reading/writing modules overhead would be saved by
>>>> reusing modules for infrequently modified files (like the STL).
>>>
>>> Say you touch some header foo.h. Previously, you needed to rebuild all
>>> cc files including it. Now you need to instead rebuild the module, and
>>> since the module has changed you now need to rebuild all cc files
>>> using any header in the module, not just the users of foo.h. That's
>>> potentially way more cc files.
>>
>> But say you touch some source file foo.cc. Previously, and with
>> modules, you just need to rebuild that cc file. With a unity build, you
>> now instead need to rebuild the concatenation of that .cc file and a
>> bunch of others. That's also potentially way more cc files. :)
>>
>> But measurements beat speculation here.
>>
>>>> (wonder what the combination would be like - modularizing headers,
>>>> and also jumbo-ifying .cpp files together... - whether there's much
>>>> to be saved in the reading modules part of the work, reading them
>>>> in fewer times - that gets into some of the ideas of compiler as a
>>>> service I guess)
>>>>
>>>>> Even if it wasn't a lot more work to get modules going, it's not
>>>>> completely clear to me that that would address the use case that the
>>>>> people working on the jumbo build have.
>>>>>
>>>>>> (maybe still less work - but a lot of work to workaround things &
>>>>>> produce some rather quirky behavior (in terms of how the build
>>>>>> functions based on looking at exactly how the source files
>>>>>> have changed & changing the build action graph depending on that) -
>>>>>> but enough that I'd be inclined to reconsider going in the
>>>>>> modular direction again)
>>>>>>
>>>>>>>>
>>>>>>>> * I was going to ask about the lack of parallelism in a jumbo
>>>>>>>> build - but reading the doc I see it's not a 'full' jumbo build,
>>>>>>>> but chunkifying the build - so there's still some/enough
>>>>>>>> parallelism. Cool :)
>>>>>>>
>>>>>>> I have heard rumours of some codebases in the games industry using
>>>>>>> a single jumbo source file for the entire build, but this is
>>>>>>> generally considered to be taking things too far and not
>>>>>>> our intended use case.
>>>>>>
>>>>>> Ah, my understanding was that jumbo builds were often/mainly used
>>>>>> for optimized builds to get cross-module optimizations (LTO-esque)
>>>>>> & so it'd be likely to be the whole program.
>>>>>>
>>>>>>> The size of Chromium's jumbo compilation units is tunable- you can
>>>>>>> simply #include fewer real source files per jumbo source file- the
>>>>>>> bigger your build farm is, the smaller you want this number
>>>>>>> to be.  The optimal setup depends on things like the shape of the
>>>>>>> dependency graph and the relative costs of the original
>>>>>>> source files.  IIRC we currently only have build-wide
>>>>>>> "jumbo_file_merge_limit" setting, though that might have
>>>>>>> changed since I last looked (V8 would benefit from this, since its
>>>>>>> source files compile more slowly than most Chromium source
>>>>>>> files).
>>>>>>>
>>>>>>> -Mostyn.
>>>>>>>
>>>>>>>> On Tue, Apr 10, 2018 at 5:12 AM Mostyn Bramley-Moore via cfe-dev  
>>>>>>>> <cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am a member of a small group of Chromium developers who are
>>>>>>>>> working on adding a unity build[1] setup to Chromium[2], in
>>>>>>>>> order to reduce the project's long and ever-increasing compile
>>>>>>>>> times.  We're calling these "jumbo" builds, because this term is
>>>>>>>>> not as overloaded as "unity".
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We're slowly making progress, but find that a lot of our time is
>>>>>>>>> spent renaming things in anonymous namespaces- it would be much
>>>>>>>>> simpler if it was possible to automatically treat these as if
>>>>>>>>> they were file-local.   Jens Widell has put together a
>>>>>>>>> proof-of-concept which appears to work reasonably well, it
>>>>>>>>> consists of a clang plugin and a small clang patch:
>>>>>>>>>
>>>>>>>>> https://github.com/jensl/llvm-project-20170507/tree/wip/jumbo-support/v1
>>>>>>>>>
>>>>>>>>> https://github.com/jensl/llvm-project-20170507/commit/a00d5ce3f20bf1c7a41145be8b7a3a478df9935f
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> After building clang and the plugin, you generate jumbo source  
>>>>>>>>> files that look like:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> jumbo_source_1.cc:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> #pragma jumbo
>>>>>>>>>
>>>>>>>>> #include "real_source_file_1.cc"
>>>>>>>>>
>>>>>>>>> #include "real_source_file_2.cc"
>>>>>>>>>
>>>>>>>>> #include "real_source_file_3.cc"
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Then, you compile something like this:
>>>>>>>>>
>>>>>>>>> clang++ -c jumbo_source_1.cc -Xclang -load -Xclang lib/JumboSupport.so -Xclang -add-plugin -Xclang jumbo-support
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The plugin gives unique names[3] to the anonymous namespaces
>>>>>>>>> without otherwise changing their semantics, and also #undef's
>>>>>>>>> macros defined in each top-level source file before processing
>>>>>>>>> the next top-level source file.  That way header files can still
>>>>>>>>> define macros that are used in multiple source files in the
>>>>>>>>> jumbo translation unit.  Collisions between macros defined in
>>>>>>>>> header files and names used in other headers and other source
>>>>>>>>> files are still possible, but less likely.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> To show how much these two changes help, here's a patch to make
>>>>>>>>> Chromium's network code build in jumbo mode:
>>>>>>>>>
>>>>>>>>> https://chromium-review.googlesource.com/c/chromium/src/+/966523  
>>>>>>>>> (+352/-377 lines)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And here's the corresponding patch using the proof-of-concept  
>>>>>>>>> JumboSupport plugin:
>>>>>>>>>
>>>>>>>>> https://chromium-review.googlesource.com/c/chromium/src/+/962062  
>>>>>>>>> (+53/-52 lines)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It seems clear that the version using the JumboSupport plugin
>>>>>>>>> would require less effort to create, review and merge into the
>>>>>>>>> codebase.  We have a few other feature ideas, but these two
>>>>>>>>> changes seem to do most of the work for us.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So now we're trying to figure out the best way forward- would a
>>>>>>>>> feature like this be welcome to the Clang project?  And if so,
>>>>>>>>> how would you recommend that we go about it?  We would prefer to
>>>>>>>>> do this in a way that does not require a locally patched Clang
>>>>>>>>> and could live with building a custom plugin, although
>>>>>>>>> implementing this entirely in Clang would be even better.
>>
>> I've been thinking about ways to get the benefits of unity builds
>> without the semantic changes. With the functionality we introduced for
>> -fmodules-local-submodule-visibility, we have the ability to parse
>> one file, then make it "invisible" and parse another file, skipping all
>> the repeated parts from the two parses, which would give us some
>> (maybe most) of the performance benefit of unity builds without the
>> semantic changes. (This is not quite as good as a unity build: you'd
>> still repeatedly lex and preprocess the files #included into both
>> source files. We could implicitly treat header files with include
>> guards as being "modular" to get the performance back, but then you
>> also get back some of the semantic changes.)
>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Mostyn.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] If you're not familiar with unity builds, the idea is to
>>>>>>>>> compile multiple source files per compiler invocation, reducing
>>>>>>>>> the overhead of processing header files (which can be
>>>>>>>>> surprisingly high).  We do this by taking a list of the source
>>>>>>>>> files in a target and generating "jumbo" source files that
>>>>>>>>> #include multiple "real" source files, and then we feed these
>>>>>>>>> jumbo files to the compiler one at a time.  This way, we don't
>>>>>>>>> prevent the usage of valuable build tools like ccache and icecc
>>>>>>>>> that only support a single source file on the command line.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [2] Daniel Bratell has a summary of our progress jumbo-ifying  
>>>>>>>>> the Chromium codebase here:
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/19jGsZxh7DX8jkAKbL1nYBa5rcByUL2EeidnYsoXfsYQ/edit#
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [3] The JumboSupport plugin assigns names to the anonymous
>>>>>>>>> namespaces in a given file:  foo::(anonymous namespace)::bar is
>>>>>>>>> replaced with a symbol name of the form
>>>>>>>>> foo::__anonymous_<number>::bar where <number> is unique to the
>>>>>>>>> file within the jumbo translation unit.  Due to the internal
>>>>>>>>> linkage of these symbols, <number> does not need to be unique
>>>>>>>>> across multiple object files/jumbo source files.
>>>>>>>>>
>>>>>>>>> --Mostyn Bramley-Moore
>>>>>>>>> Vewd Software
>>>>>>>>> mostynb at vewd.com
>>>>>>>>> _______________________________________________
>>>>>>>>> cfe-dev mailing list
>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --Mostyn Bramley-Moore
>>>>>>> Vewd Software
>>>>>>> mostynb at vewd.com

-- 
/* Opera Software, Linköping, Sweden: CEST (UTC+2) */