[cfe-dev] JumboSupport: making unity builds easier in Clang

Tue Apr 10 16:41:47 PDT 2018

On Tue, Apr 10, 2018 at 9:13 PM, Richard Smith via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On 10 April 2018 at 10:05, Nico Weber via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
>> On Tue, Apr 10, 2018 at 1:01 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:58 AM Nico Weber <thakis at chromium.org> wrote:
>>>
>>>> On Tue, Apr 10, 2018 at 11:56 AM, David Blaikie <dblaikie at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 8:52 AM Mostyn Bramley-Moore <mostynb at vewd.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Apr 10, 2018 at 4:27 PM, David Blaikie <dblaikie at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I haven't looked at the patches in detail - but generally a jumbo
>>>>>>> build feels like a bit of a workaround & maybe there are better long-term
>>>>>>> solutions that might fit into the compiler. A few sort of background
>>>>>>> questions:
>>>>>>>
>>>>>>> * Have you tried Clang header modules (
>>>>>>> https://clang.llvm.org/docs/Modules.html ), both explicit and implicit?
>>>>>>> (Granted, explicit modules might only be practical at the moment using
>>>>>>> Google's internal version of Bazel - but you /might/ get some comparison
>>>>>>> numbers from a Google Chrome developer.)
>>>>>>>   * The doc talks about maybe disabling jumbo builds for a single
>>>>>>> target for developer efficiency, with the risk that a header edit would
>>>>>>> maybe be worse for the developer than the jumbo build - this is where
>>>>>>> modules would help as well, since they don't have this tradeoff between
>>>>>>> two different dimensions of "more work" that you have to choose from.
>>>>>>>
>>>>>>
>>>>>> There are ways to minimise this- an earlier proprietary jumbo build
>>>>>> system used at Opera would detect when you're modifying and rebuilding
>>>>>> files, and compile these in "normal" mode.  This gave fast full/clean build
>>>>>> times but also short modify+rebuild times.  We have not attempted to
>>>>>> implement this in the Chromium Jumbo build configuration.
>>>>>>
>>>>>
>>>>> Building that kind of infrastructure seems like a pretty big hammer
>>>>> compared to modularizing the codebase...
>>>>>
>>>>
>>>> Modularizing the codebase doesn't give you the same build time impact,
>>>> linearizes your build more,
>>>>
>>>
>>> Not sure I follow - it partially linearizes (as you say, due to the
>>> module dependency rather than header dependency issue), as does the jumbo
>>> build.
>>>
>>
>> The jumbo build just needs to append a bunch of files, that's fast.
>> Compiling a module isn't.
>>
>
> Well, compiling a module is just appending a bunch of headers and
> compiling them. It's just at a different layer of the graph.
>
>
>>>> and slows down incremental builds.
>>>>
>>>
>>> Compared to a traditional build? I wouldn't think so (I mean, yes,
>>> reading/writing modules has some overhead - but also some gains) on
>>> average. I'd expect slower builds if you modify a header at the very base
>>> of the dependency (the STL), but beyond that I would've thought the
>>> reading/writing modules overhead would be saved by reusing modules for
>>> infrequently modified files (like the STL).
>>>
>>
>> Say you touch some header foo.h. Previously, you needed to rebuild all cc
>> files including it. Now you need to instead rebuild the module, and since
>> the module has changed you now need to rebuild all cc files using any
>> header in the module, not just the users of foo.h. That's potentially way
>> more cc files.
>>
>
> But say you touch some source file foo.cc. Previously, and with modules,
> you just need to rebuild that cc file. With a unity build, you now instead
> need to rebuild the concatenation of that .cc file and a bunch of others.
> That's also potentially way more cc files. :)
>
> But measurements beat speculation here.
>

Here's one data point: on a non-ccache, non-distributed build on a fairly
high end machine (20 CPU cores, 40 threads), I built a subset of Chromium
(content_shell) in both jumbo and non-jumbo mode.  Then I picked a single
source file that is in part of the tree that we have previously made
jumbo-capable (content/public/renderer/browser_plugin_delegate.cc), touched
it and timed how long the rebuilds would take in both jumbo and non-jumbo
mode.  The target that this source file is part of has 16 source files in
total, which is smaller than the default jumbo_file_merge_limit value of
50, so to rebuild this one source file in jumbo mode requires that we also
rebuild the other 15 source files in this target, which will not be done in
parallel since they're all in a single jumbo compilation unit- in other
words this is a moderately bad scenario for jumbo.

The non-jumbo rebuild + relink time on this machine was between 9 and 10
seconds, and the jumbo rebuild + relink time was 23-24 seconds- a little
more than double, but still nowhere near "time to grab a coffee while I
wait" territory.  This time is easily won back in jumbo mode if you need to
rebase on master, or build another target or configuration.

If you find yourself in a modify/rebuild/retest loop in this code, you can
try a workflow optimisation mentioned in Daniel Bratell's doc (and earlier
in this thread): turn jumbo off for just this target but on for all others,
and you only have a one-time overhead of regenerating ninja files (which is
quick) plus rebuilding 15 source files once in parallel.  Then you only
need to rebuild a single source file each time around the loop.

I am currently running the same benchmark on a lower-specced machine, one
which is more realistic for many developers: a 4 core / 8 thread CPU
workstation, but the test setup is excruciatingly slow to prepare so I will
have to report back tomorrow with the numbers.  I expect the rebuild times
to be comparable, since this test cannot make use of multiple CPU cores
simultaneously (other than maybe parallel linking).  But the clean-build
time speedup for this configuration is known to be a big net win in terms
of absolute time saved (jumbo builds are roughly 3x faster than non-jumbo
builds, which take several hours).

Jumbo builds are not a solution that you should use blindly without
confirming that they work for your codebase and workflow, but in some cases
they clearly have enormous benefits.

-Mostyn.

>>> (wonder what the combination would be like - modularizing headers, and
>>> also jumbo-ifying .cpp files together... - whether there's much to be saved
>>> in the reading modules part of the work, reading them in fewer times - that
>>> gets into some of the ideas of compiler as a service I guess)
>>>
>>>
>>>> Even if it wasn't a lot more work to get modules going, it's not
>>>> completely clear to me that that would address the use case that the people
>>>> working on the jumbo build have.
>>>>
>>>>
>>>>> (maybe still less work - but a lot of work to workaround things &
>>>>> produce some rather quirky behavior (in terms of how the build functions
>>>>> based on looking at exactly how the source files have changed & changing
>>>>> the build action graph depending on that) - but enough that I'd be inclined
>>>>> to reconsider going in the modular direction again)
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> * I was going to ask about the lack of parallelism in a jumbo build
>>>>>>> - but reading the doc I see it's not a 'full' jumbo build, but chunkifying
>>>>>>> the build - so there's still some/enough parallelism. Cool :)
>>>>>>>
>>>>>>
>>>>>> I have heard rumours of some codebases in the games industry using a
>>>>>> single jumbo source file for the entire build, but this is generally
>>>>>> considered to be taking things too far and not our intended use case.
>>>>>>
>>>>>
>>>>> Ah, my understanding was that jumbo builds were often/mainly used for
>>>>> optimized builds to get cross-module optimizations (LTO-esque) & so it'd be
>>>>> likely to be the whole program.
>>>>>
>>>>>
>>>>>> The size of Chromium's jumbo compilation units is tunable- you can
>>>>>> simply #include fewer real source files per jumbo source file- the bigger
>>>>>> your build farm is, the smaller you want this number to be.  The optimal
>>>>>> setup depends on things like the shape of the dependency graph and the
>>>>>> relative costs of the original source files.  IIRC we currently only have
>>>>>> a build-wide "jumbo_file_merge_limit" setting, though that might have changed
>>>>>> since I last looked (V8 would benefit from this, since its source files
>>>>>> compile more slowly than most Chromium source files).
>>>>>>
>>>>>>
>>>>>> -Mostyn.
>>>>>>
>>>>>>
>>>>>>> On Tue, Apr 10, 2018 at 5:12 AM Mostyn Bramley-Moore via cfe-dev <
>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am a member of a small group of Chromium developers who are
>>>>>>>> working on adding a unity build[1] setup to Chromium[2], in order to reduce
>>>>>>>> the project's long and ever-increasing compile times.  We're calling these
>>>>>>>> "jumbo" builds, because this term is not as overloaded as "unity".
>>>>>>>>
>>>>>>>> We're slowly making progress, but find that a lot of our time is spent
>>>>>>>> renaming things in anonymous namespaces- it would be much simpler if it was
>>>>>>>> possible to automatically treat these as if they were file-local.  Jens
>>>>>>>> Widell has put together a proof-of-concept which appears to work reasonably
>>>>>>>> well, it consists of a clang plugin and a small clang patch:
>>>>>>>>
>>>>>>>> https://github.com/jensl/llvm-project-20170507/tree/wip/jumbo-support/v1
>>>>>>>> https://github.com/jensl/llvm-project-20170507/commit/a00d5ce3f20bf1c7a41145be8b7a3a478df9935f
>>>>>>>>
>>>>>>>> After building clang and the plugin, you generate jumbo source files that
>>>>>>>> look like:
>>>>>>>>
>>>>>>>> jumbo_source_1.cc:
>>>>>>>>
>>>>>>>> #pragma jumbo
>>>>>>>> #include "real_source_file_1.cc"
>>>>>>>> #include "real_source_file_2.cc"
>>>>>>>> #include "real_source_file_3.cc"
>>>>>>>>
>>>>>>>> Then, you compile something like this:
>>>>>>>>
>>>>>>>> clang++ -c jumbo_source_1.cc -Xclang -load -Xclang lib/JumboSupport.so -Xclang -add-plugin -Xclang jumbo-support
>>>>>>>>
>>>>>>>> The plugin gives unique names[3] to the anonymous namespaces without
>>>>>>>> otherwise changing their semantics, and also #undef's macros defined in each
>>>>>>>> top-level source file before processing the next top-level source file.
>>>>>>>> That way header files can still define macros that are used in multiple
>>>>>>>> source files in the jumbo translation unit.  Collisions between macros
>>>>>>>> defined in header files and names used in other headers and other source
>>>>>>>> files are still possible, but less likely.
>>>>>>>>
>>>>>>>> To show how much these two changes help, here's a patch to make Chromium's
>>>>>>>> network code build in jumbo mode:
>>>>>>>> https://chromium-review.googlesource.com/c/chromium/src/+/966523
>>>>>>>> (+352/-377 lines)
>>>>>>>>
>>>>>>>> And here's the corresponding patch using the proof-of-concept JumboSupport
>>>>>>>> plugin:
>>>>>>>> https://chromium-review.googlesource.com/c/chromium/src/+/962062
>>>>>>>> (+53/-52 lines)
>>>>>>>>
>>>>>>>> It seems clear that the version using the JumboSupport plugin would
>>>>>>>> require less effort to create, review and merge into the codebase.  We have
>>>>>>>> a few other feature ideas, but these two changes seem to do most of the
>>>>>>>> work for us.
>>>>>>>>
>>>>>>>> So now we're trying to figure out the best way forward- would a feature
>>>>>>>> like this be welcome to the Clang project?  And if so, how would you
>>>>>>>> recommend that we go about it?  We would prefer to do this in a way that
>>>>>>>> does not require a locally patched Clang and could live with building a
>>>>>>>> custom plugin, although implementing this entirely in Clang would be even
>>>>>>>> better.
>>>>>>>>
>>>>>>>
> I've been thinking about ways to get the benefits of unity builds without
> the semantic changes. With the functionality we introduced for
> -fmodules-local-submodule-visibility, we have the ability to parse one
> file, then make it "invisible" and parse another file, skipping all the
> repeated parts from the two parses, which would give us some (maybe most)
> of the performance benefit of unity builds without the semantic changes.
> (This is not quite as good as a unity build: you'd still repeatedly lex and
> preprocess the files #included into both source files. We could implicitly
> treat header files with include guards as being "modular" to get the
> performance back, but then you also get back some of the semantic changes.)
>
>
>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Mostyn.
>>>>>>>>
>>>>>>>> [1] If you're not familiar with unity builds, the idea is to compile
>>>>>>>> multiple source files per compiler invocation, reducing the overhead of
>>>>>>>> processing header files (which can be surprisingly high).  We do this by
>>>>>>>> taking a list of the source files in a target and generating "jumbo" source
>>>>>>>> files that #include multiple "real" source files, and then we feed these
>>>>>>>> jumbo files to the compiler one at a time.  This way, we don't prevent the
>>>>>>>> usage of valuable build tools like ccache and icecc that only support a
>>>>>>>> single source file on the command line.
>>>>>>>>
>>>>>>>> [2] Daniel Bratell has a summary of our progress jumbo-ifying the Chromium
>>>>>>>> codebase here:
>>>>>>>> https://docs.google.com/document/d/19jGsZxh7DX8jkAKbL1nYBa5rcByUL2EeidnYsoXfsYQ/edit#
>>>>>>>>
>>>>>>>> [3] The JumboSupport plugin assigns names to the anonymous namespaces in a
>>>>>>>> given file:  foo::(anonymous namespace)::bar is replaced with a symbol name
>>>>>>>> of the form foo::__anonymous_<number>::bar where <number> is unique to the
>>>>>>>> file within the jumbo translation unit.  Due to the internal linkage of
>>>>>>>> these symbols, <number> does not need to be unique across multiple object
>>>>>>>> files/jumbo source files.
>>>>>>>> --
>>>>>>>> Mostyn Bramley-Moore
>>>>>>>> Vewd Software
>>>>>>>> mostynb at vewd.com <mostynb at opera.com>
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing list
>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Mostyn Bramley-Moore
>>>>>> Vewd Software
>>>>>> mostynb at vewd.com <mostynb at opera.com>
>>>>>>
>>>>>
>>
>>
>
>
>

-- 
Mostyn Bramley-Moore
Vewd Software
mostynb at vewd.com <mostynb at opera.com>