[cfe-dev] JumboSupport: making unity builds easier in Clang

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 10 09:58:13 PDT 2018


On Tue, Apr 10, 2018 at 11:56 AM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Tue, Apr 10, 2018 at 8:52 AM Mostyn Bramley-Moore <mostynb at vewd.com>
> wrote:
>
>> On Tue, Apr 10, 2018 at 4:27 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>> I haven't looked at the patches in detail - but generally a jumbo build
>>> feels like a bit of a workaround & maybe there are better long-term
>>> solutions that might fit into the compiler. A few sort of background
>>> questions:
>>>
>>> * Have you tried Clang header modules (
>>> https://clang.llvm.org/docs/Modules.html )? (explicit (granted, explicit
>>> might only be practical at the moment using Google's internal version of
>>> Bazel - but you /might/ get some comparison numbers from a Google Chrome
>>> developer) and implicit)
>>>   * The doc talks about maybe disabling jumbo builds for a single target
>>> for developer efficiency, with the risk that a header edit would maybe be
>>> worse for the developer than the jumbo build - this is where modules would
>>> help as well, since it doesn't have this tradeoff property of two different
>>> dimensions of "more work" you have to choose from.
>>>
>>
>> There are ways to minimise this- an earlier proprietary jumbo build
>> system used at Opera would detect when you're modifying and rebuilding
>> files, and compile these in "normal" mode.  This gave fast full/clean build
>> times but also short modify+rebuild times.  We have not attempted to
>> implement this in the Chromium Jumbo build configuration.
>>
>
> Building that kind of infrastructure seems like a pretty big hammer
> compared to modularizing the codebase...
>

Modularizing the codebase doesn't give you the same build time impact,
linearizes your build more, and slows down incremental builds. Even if it
wasn't a lot more work to get modules going, it's not completely clear to
me that that would address the use case that the people working on the
jumbo build have.


> (maybe still less work - but a lot of work to workaround things & produce
> some rather quirky behavior (in terms of how the build functions based on
> looking at exactly how the source files have changed & changing the build
> action graph depending on that) - but enough that I'd be inclined to
> reconsider going in the modular direction again)
>
>
>>
>>
>>> * I was going to ask about the lack of parallelism in a jumbo build -
>>> but reading the doc I see it's not a 'full' jumbo build, but chunkifying
>>> the build - so there's still some/enough parallelism. Cool :)
>>>
>>
>> I have heard rumours of some codebases in the games industry using a
>> single jumbo source file for the entire build, but this is generally
>> considered to be taking things too far and not our intended use case.
>>
>
> Ah, my understanding was that jumbo builds were often/mainly used for
> optimized builds to get cross-module optimizations (LTO-esque) & so it'd be
> likely to be the whole program.
>
>
>> The size of Chromium's jumbo compilation units is tunable- you can simply
>> #include fewer real source files per jumbo source file- the bigger your
>> build farm is, the smaller you want this number to be.  The optimal setup
>> depends on things like the shape of the dependency graph and the relative
>> costs of the original source files.  IIRC we currently only have a
>> build-wide "jumbo_file_merge_limit" setting, though that might have changed since I
>> last looked (V8 would benefit from this, since its source files compile
>> more slowly than most Chromium source files).
>>
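
A minimal sketch of the chunking idea described above, assuming the simplest possible policy: take a target's source list in order and emit one jumbo file per `merge_limit` sources. The function name `GenerateJumboFiles` and the contiguous-chunk policy are hypothetical illustrations; the thread does not show how Chromium's build actually groups files.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not Chromium's real generator): returns the text of
// each jumbo source file, with at most `merge_limit` #include lines per file.
std::vector<std::string> GenerateJumboFiles(
    const std::vector<std::string>& sources, std::size_t merge_limit) {
  std::vector<std::string> jumbo_files;
  if (merge_limit == 0) {
    return jumbo_files;  // Assumption: a limit of 0 means "generate nothing".
  }
  for (std::size_t begin = 0; begin < sources.size(); begin += merge_limit) {
    const std::size_t end = std::min(begin + merge_limit, sources.size());
    std::string contents;
    for (std::size_t i = begin; i < end; ++i) {
      // Each real source file is pulled in textually, as in the example
      // jumbo_source_1.cc from the original message.
      contents += "#include \"" + sources[i] + "\"\n";
    }
    jumbo_files.push_back(contents);
  }
  return jumbo_files;
}
```

With five sources and a limit of 2 this yields three jumbo files, the last holding a single #include. A smaller limit gives more, smaller translation units (more parallelism, more repeated header parsing); a larger limit gives the opposite.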
>>
>> -Mostyn.
>>
>>
>>> On Tue, Apr 10, 2018 at 5:12 AM Mostyn Bramley-Moore via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I am a member of a small group of Chromium developers who are working
>>>> on adding a unity build[1] setup to Chromium[2], in order to reduce the
>>>> project's long and ever-increasing compile times.  We're calling these
>>>> "jumbo" builds, because this term is not as overloaded as "unity".
>>>>
>>>> We're slowly making progress, but find that a lot of our time is spent
>>>> renaming things in anonymous namespaces- it would be much simpler if it
>>>> was possible to automatically treat these as if they were file-local.
>>>>
>>>> Jens Widell has put together a proof-of-concept which appears to work
>>>> reasonably well, it consists of a clang plugin and a small clang patch:
>>>>
>>>> https://github.com/jensl/llvm-project-20170507/tree/wip/jumbo-support/v1
>>>> https://github.com/jensl/llvm-project-20170507/commit/a00d5ce3f20bf1c7a41145be8b7a3a478df9935f
>>>>
>>>> After building clang and the plugin, you generate jumbo source files
>>>> that look like:
>>>>
>>>> jumbo_source_1.cc:
>>>>
>>>> #pragma jumbo
>>>> #include "real_source_file_1.cc"
>>>> #include "real_source_file_2.cc"
>>>> #include "real_source_file_3.cc"
>>>>
>>>> Then, you compile something like this:
>>>>
>>>> clang++ -c jumbo_source_1.cc -Xclang -load -Xclang lib/JumboSupport.so \
>>>>     -Xclang -add-plugin -Xclang jumbo-support
>>>>
>>>> The plugin gives unique names[3] to the anonymous namespaces without
>>>> otherwise changing their semantics, and also #undef's macros defined in
>>>> each top-level source file before processing the next top-level source
>>>> file.  That way header files can still define macros that are used in
>>>> multiple source files in the jumbo translation unit.  Collisions between
>>>> macros defined in header files and names used in other headers and other
>>>> source files are still possible, but less likely.
>>>>
>>>> To show how much these two changes help, here's a patch to make
>>>> Chromium's network code build in jumbo mode:
>>>>
>>>> https://chromium-review.googlesource.com/c/chromium/src/+/966523
>>>> (+352/-377 lines)
>>>>
>>>> And here's the corresponding patch using the proof-of-concept
>>>> JumboSupport plugin:
>>>>
>>>> https://chromium-review.googlesource.com/c/chromium/src/+/962062
>>>> (+53/-52 lines)
>>>>
>>>> It seems clear that the version using the JumboSupport plugin would
>>>> require less effort to create, review and merge into the codebase.  We
>>>> have a few other feature ideas, but these two changes seem to do most of
>>>> the work for us.
>>>>
>>>> So now we're trying to figure out the best way forward- would a feature
>>>> like this be welcome to the Clang project?  And if so, how would you
>>>> recommend that we go about it?  We would prefer to do this in a way that
>>>> does not require a locally patched Clang and could live with building a
>>>> custom plugin, although implementing this entirely in Clang would be
>>>> even better.
>>>>
>>>> Thanks,
>>>> -Mostyn.
>>>>
>>>> [1] If you're not familiar with unity builds, the idea is to compile
>>>> multiple source files per compiler invocation, reducing the overhead of
>>>> processing header files (which can be surprisingly high).  We do this by
>>>> taking a list of the source files in a target and generating "jumbo"
>>>> source files that #include multiple "real" source files, and then we
>>>> feed these jumbo files to the compiler one at a time.  This way, we
>>>> don't prevent the usage of valuable build tools like ccache and icecc
>>>> that only support a single source file on the command line.
>>>>
>>>> [2] Daniel Bratell has a summary of our progress jumbo-ifying the
>>>> Chromium codebase here:
>>>> https://docs.google.com/document/d/19jGsZxh7DX8jkAKbL1nYBa5rcByUL2EeidnYsoXfsYQ/edit#
>>>>
>>>> [3] The JumboSupport plugin assigns names to the anonymous namespaces in
>>>> a given file:  foo::(anonymous namespace)::bar is replaced with a symbol
>>>> name of the form foo::__anonymous_<number>::bar where <number> is unique
>>>> to the file within the jumbo translation unit.  Due to the internal
>>>> linkage of these symbols, <number> does not need to be unique across
>>>> multiple object files/jumbo source files.
>>>>
>>>> --
>>>> Mostyn Bramley-Moore
>>>> Vewd Software
>>>> mostynb at vewd.com <mostynb at opera.com>
>>>>
>>>
>>
>>
>> --
>> Mostyn Bramley-Moore
>> Vewd Software
>> mostynb at vewd.com <mostynb at opera.com>
>>
>
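
To make the naming scheme from footnote [3] of the original message concrete, here is a small sketch that maps a qualified name containing "(anonymous namespace)" to the foo::__anonymous_<number>::bar form. The helper `RenameAnonymous` is hypothetical and operates on plain strings for illustration only; it is not how the JumboSupport plugin is implemented, which works inside Clang rather than on printed names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical illustration of the naming scheme in footnote [3].
// `file_index` stands in for the <number> that is unique to each real
// source file within one jumbo translation unit.
std::string RenameAnonymous(const std::string& qualified_name,
                            unsigned file_index) {
  static const std::string kMarker = "(anonymous namespace)";
  const std::string replacement =
      "__anonymous_" + std::to_string(file_index);

  std::string result = qualified_name;
  std::size_t pos = 0;
  // Replace every occurrence, so nested anonymous namespaces are handled too.
  while ((pos = result.find(kMarker, pos)) != std::string::npos) {
    result.replace(pos, kMarker.size(), replacement);
    pos += replacement.size();
  }
  return result;
}
```

Calling it for the same name with file indices 1 and 2 produces two distinct names, which is why two files in one jumbo translation unit can each define their own foo::(anonymous namespace)::bar without colliding. Because these symbols have internal linkage, the same indices can be reused in every other jumbo file.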