[cfe-dev] JumboSupport: making unity builds easier in Clang

Thu Apr 12 07:39:44 PDT 2018

On Wed, Apr 11, 2018 at 8:52 PM, Mostyn Bramley-Moore <mostynb at vewd.com>
wrote:

> On Wed, Apr 11, 2018 at 7:53 PM, Kim Gräsman via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> See also: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Tenseconds.pdf
>
>
> I CC'ed Andy in my initial post, but the email bounced.
>
>
>> I started experimenting with a unity build of an LLVM/Clang-sized
>> proprietary project at my previous employer, and I found the basics
>> easy to get going. The hard part was massaging the code base to avoid
>> collisions, as indicated by the work by Mostyn & co.
>>
>> I left the job before I had a chance to fully evaluate it, but
>> assuming I'd had something like `#pragma jumbo` to reduce the
>> friction, it might have been easier to get more data for less effort.
>>
>> Mostyn/Daniel, do you have any gut feel/data on how much of the
>> problem a #pragma would solve? I suppose there are still constructs
>> that `#pragma jumbo` can't help with and that require manual
>> intervention?
>>
>
> The best side-by-side comparison that we have at the moment is the two
> Chromium patch sets I mentioned. The numbers there match my gut feeling
> that something like the JumboSupport proof-of-concept could save us about
> 80% of the effort to jumbo-ify Chromium code.
>
> There are a few other constructs that cause trouble less often, which
> could be investigated later for diminishing returns.  Automatically popping
> clang diagnostic warning pragma states is one that came up the other day.
> I think I have seen globally scoped typedefs in top-level source files
> cause trouble (but these are rare).
>
> And there are of course some constructs that I don't think are feasible to
> try to fix automatically, e.g. symbols and macros leaked by library headers
> (which are intentionally leaky). X11 and Windows headers are particularly
> bad.
>
>
>> Also, Chromium is hardly a typical codebase: from the little I've looked
>> at it, it's *extremely* clean and consistent, so it might be interesting
>> to try this on something else. Maybe LLVM itself would be an
>> interesting candidate.
>>
>
> I don't have much experience with CMake, but I see a few references to
> CMake unity build helpers on the web (if anyone has tips, feel free to
> ping me off-list).  If it would be useful I can try to put together a
> small experiment with a subset of LLVM or Clang.
>

I decided to take a look at the clangSema target, and see what kind of
difference the JumboSupport PoC would make.  Instead of digging into CMake,
I just wrote some small shell scripts to build this target in the various
modes.
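
As a rough illustration of what a jumbo build consumes (this is not the
script used for this experiment), a jumbo source file is just a file that
#includes each original source file, so that the compiler sees them as one
translation unit.  The sketch below generates such a file's text; the
function name MakeJumboSource and the file names in the usage note are
hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the text of a jumbo (unity) source file: one #include line per
// original source file, in the order given.
// MakeJumboSource is a hypothetical helper, not part of any real tool.
std::string MakeJumboSource(const std::vector<std::string>& sources) {
  std::string out = "// Generated jumbo file; do not edit.\n";
  for (const std::string& src : sources) {
    assert(!src.empty());  // an empty name would produce a broken #include
    out += "#include \"" + src + "\"\n";
  }
  return out;
}
```

Calling MakeJumboSource({"a.cpp", "b.cpp"}) yields a comment line followed
by one #include line for each of the two files.  Excluding a file from the
jumbo compilation unit then amounts to leaving it out of the list and
compiling it separately.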

Without JumboSupport, I had to rename a couple of static functions
(isGlobalVar and getDepthAndIndex, in a couple of places), and rename a
struct (PartialSpecMatchResult) that was inside an anonymous namespace.
Alternatively, you could refactor the code to share a single
implementation.  I also excluded two source files from the jumbo
compilation unit, due to clashes caused by a file being intentionally
#include'd multiple times (alternatively, you could sprinkle some #undefs
around to make this work).
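
To make the kind of clash concrete, here is a minimal sketch with made-up
names (not the actual clangSema code).  Two source files each define a
static helper with the same name and signature; that is fine when they are
compiled separately, but once both land in one translation unit the second
definition is a redefinition error.  The sketch shows the state after the
rename, with both hypothetical files pasted into one unit.

```cpp
#include <cassert>

// --- contents of hypothetical a.cpp ---
// Originally named `IsValid`; renamed so it no longer clashes with b.cpp.
static bool IsValidLow(int v) { return v >= 0; }
bool CheckLow(int v) { return IsValidLow(v); }

// --- contents of hypothetical b.cpp ---
// Also originally `static bool IsValid(int)`.  With both files in one
// translation unit, keeping that name would redefine the function above.
static bool IsValidHigh(int v) { return v <= 100; }
bool CheckHigh(int v) { return IsValidHigh(v); }
```

The rename works, but every such pair needs a new name (or a shared
implementation), which is exactly the kind of change reviewers can end up
debating.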

With JumboSupport, instead of renaming the static functions I just moved
them into anonymous namespaces, and excluded the same two source files,
which #include some .def files multiple times, for the same reasons as
above.  I did not need to do anything about the PartialSpecMatchResult
structs since they were already inside anonymous namespaces (at least one
of them was; I did not need to check the other).

Of these two patches, the JumboSupport version was easier to produce, and I
believe it would take less effort to review: there would be no debate about
what to rename things to, or whether and how the code should be refactored.
I think that anonymous namespaces should generally be preferred over
static functions, and JumboSupport makes anonymous namespaces even more
useful: it makes them behave the way that many developers (incorrectly)
assume they already behave.
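
For readers unfamiliar with the distinction: in standard C++ an anonymous
namespace is private to the translation unit, not to the source file, so
two files that each define the same name inside `namespace { }` still clash
once they are #include'd into one jumbo file.  The sketch below only
emulates the per-file behaviour by hand, using one distinct named namespace
per hypothetical file; it is not how the JumboSupport proof-of-concept is
implemented, and all names in it are made up.

```cpp
#include <cassert>

// --- contents of hypothetical first.cpp ---
// In the real file this would be `namespace { ... }`.  A distinct named
// namespace stands in for "private to this file" inside a shared unit.
namespace first_cpp_private {
int Limit(int v) { return v < 0 ? 0 : v; }
}  // namespace first_cpp_private
int ClampLow(int v) { return first_cpp_private::Limit(v); }

// --- contents of hypothetical second.cpp ---
// Same helper name, different body.  With two plain anonymous namespaces
// in one translation unit, this would redefine Limit(int).
namespace second_cpp_private {
int Limit(int v) { return v > 100 ? 100 : v; }
}  // namespace second_cpp_private
int ClampHigh(int v) { return second_cpp_private::Limit(v); }
```

With file-scoped anonymous namespaces, neither file would need the
hand-written namespace names or the qualified calls.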

Note that we don't claim that jumbo builds make sense for all codebases,
and I'm not sure if it would make sense for Clang/LLVM.  But JumboSupport
did appear to help in this tiny experiment.

-Mostyn.

>
>
>> - Kim
>>
>> On Wed, Apr 11, 2018 at 7:08 PM, via cfe-dev <cfe-dev at lists.llvm.org>
>> wrote:
>> > If you want to share ASTs (an ephemeral structure), clang would need
>> > to do the distributing.  If you want to share IR of instantiated
>> > templates, you can use a shared database where clang is much less
>> > involved in managing the distribution.  Say the database key is a hash
>> > of the token stream of the template definition, plus the template
>> > parameters; would that work?  Then you can pull precompiled IR out of
>> > the database (if you want to do optimizations) or make a reference to
>> > it (if you're doing LTO).
>> >
>> > --paulr
>> >
>> >
>> >
>> > From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of
>> > David Blaikie via cfe-dev
>> > Sent: Wednesday, April 11, 2018 11:09 AM
>> > To: David Chisnall
>> > Cc: Bruce Dawson; Daniel Cheng; richard at metafoo.co.uk;
>> > cfe-dev at lists.llvm.org; Daniel Bratell; Jens Widell
>> > Subject: Re: [cfe-dev] JumboSupport: making unity builds easier in Clang
>> >
>> >
>> >
>> > This would have issues with distributed builds, though, right? Unless
>> > clang then took on the burden of doing the distribution too, which
>> > might be a bit much.
>> >
>> > On Wed, Apr 11, 2018 at 12:43 AM David Chisnall via cfe-dev
>> > <cfe-dev at lists.llvm.org> wrote:
>> >
>> > On 10 Apr 2018, at 21:28, Daniel Bratell via cfe-dev
>> > <cfe-dev at lists.llvm.org> wrote:
>> >>
>> >> I've heard (hearsay, I admit) from profiling that it seems the single
>> >> largest time consumer in clang is template instantiation, something I
>> >> assume can't easily be prepared in advance.
>> >>
>> >> One example is chromium's chrome/browser/browser target which is 732
>> >> files that normally need 6220 CPU seconds to compile, average 8.5
>> >> seconds per file. All combined together gives a single translation
>> >> unit that takes 400 seconds to compile, a mere 0.54 seconds on average
>> >> per file. That indicates that about 8 seconds per compiled file is
>> >> related to the processing of headers.
>> >
>> > It sounds as if there are two things here:
>> >
>> > 1. The time taken to parse the headers
>> > 2. The time taken to repeatedly instantiate templates that the linker
>> > will then discard
>> >
>> > Assuming a command line where all of the relevant source files are
>> > provided to the compiler invocation:
>> >
>> > Solving the first one is relatively easy if the files have a common
>> > prefix (which can be determined by simple string comparison).  Find the
>> > common prefix in the source files, build the clang AST, and then do a
>> > clone for each compilation unit.  Hopefully, the clone is a lot cheaper
>> > than re-parsing (and can ideally share source locations).
>> >
>> > The second is slightly more difficult, because it relies on sharing
>> > parts of the AST across notional compilation units.
>> >
>> > To make this work well with incremental builds, ideally you’d spit out
>> > all of the common template instantiations into a separate IR file,
>> > which could then be used with ThinLTO.
>> >
>> > Personally, I would prefer to have an interface where a build system can
>> > invoke clang with all of the files that need building and the degree of
>> > parallelism to use and let it share as much state as it wants across
>> > builds.  In an ideal world, clang would record which templates have been
>> > instantiated in a prior build (or a previous build step in the current
>> > build) and avoid any IRGen for them, at the very least.
>> >
>> > Old C++ compilers, predating linker support for COMDATs, emitted
>> > templates lazily, simply emitting references to them, then parsing the
>> > linker errors and generating missing implementations until the linker
>> > errors went away.  Modern C++ compilers generate many instantiations of
>> > the same templates and then discard most of them.  It would be nice to
>> > find an intermediate point, which worked well with ThinLTO, where
>> > templates could be emitted once and be available for inlining
>> > everywhere.
>> >
>> > David
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >
>>
>
>
>
>

-- 
Mostyn Bramley-Moore
Vewd Software
mostynb at vewd.com <mostynb at opera.com>