[cfe-dev] [RFC] Embedding compilation database info in object files.

Manuel Klimek klimek at google.com
Fri Jul 19 01:56:40 PDT 2013


On Fri, Jul 19, 2013 at 10:27 AM, David Chisnall <
David.Chisnall at cl.cam.ac.uk> wrote:

> On 19 Jul 2013, at 09:08, Manuel Klimek <klimek at google.com> wrote:
>
> > Well, for example, we can spawn a process that writes to the compilation
> > db in the background from the driver, and if we assume that C++
> > compilations are long compared to writing a line of text to a file, we get
> > that:
>
> So now we get the extra overhead of another fork / exec.  Lots more TLB
> churn, which is often a limiting factor on scalability on modern CPUs.
>

I think that when we're talking about C++ compilations, everything else is
dwarfed by the CPU cost of re-parsing transitive include closures.


> > a) finishing off the build will be independent of finishing writing to
> > the compilation db, thus, the impact on the build is negligible
>
> Other than having twice as many processes running...
>

The clang driver already spawns a subprocess for every clang invocation. If
that significantly impacted build performance, I'm sure that somebody would
have changed that.


> > b) the compilation db will probably be finished writing before the build
> > finishes (as each step probably finishes writing to the compilation db
> > before the step finishes)
>
> Yes, the 'probably' bit is fun here.  If your build system is doing the
> 'run the tool' step as a dependency of the build step, you end up with this
> sequence (for -j n):
>
> n compilation tasks finish.
> Build system starts tool.
> Tool acquires lock on compilation database and runs.
> n child processes of the compile sit waiting for the lock.
> Tool finishes.
> n child processes sequentially update compilation database
>
> Even better, some of them will probably succeed, so not only do you have
> the wrong data in the compilation database, you have inconsistent and wrong
> data in the compilation database.
>

I'm not sure we're talking about the same implementation idea. The whole
idea would be to not have anything special in the build system, but to
specify an --update-compilation-db=/my/compilation/db/path flag. Then the
driver would take that flag, and launch a background task to update that
file.
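
As a rough sketch of what such a background update task might do (Python
purely for illustration; the one-JSON-object-per-line layout, the name
append_entry, and the use of flock are all assumptions, not anything clang
implements):

```python
import fcntl
import json


def append_entry(db_path, directory, command, source_file):
    """Append one compilation record as a single JSON line.

    Hypothetical helper: takes an exclusive advisory lock (Unix only) so
    that concurrent updaters do not interleave their writes.
    """
    entry = {"directory": directory, "command": command, "file": source_file}
    line = json.dumps(entry) + "\n"
    with open(db_path, "a") as db:
        fcntl.flock(db, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            db.write(line)
            db.flush()  # push the line out before releasing the lock
        finally:
            fcntl.flock(db, fcntl.LOCK_UN)
```

Each call holds the lock only for the duration of one short write, which is
the basis for assuming the update is cheap compared to the compile itself.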

Thus, it would be:
build system launches clang
clang launches:
-> the compilation db update process
-> the clang -cc1 process
With high probability the update process finishes before the clang -cc1
process finishes, as C++ compilation takes a lot longer than writing to a
single file (for example, build disk I/O is also competing for a common
resource, parsing CPU is a common resource, and there are dependencies
inherent in the build, which make it not fully parallelizable).
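
The sequence above could be sketched roughly like this. It is an
illustration only: run_driver and the command lists passed to it are
hypothetical stand-ins, and a real driver would do this in C++ rather than
Python.

```python
import subprocess


def run_driver(update_cmd, compile_cmd):
    """Start the db updater in the background, then run the compile step.

    Returns (compile_rc, updater_rc, updater_done_first). Both command
    lists are placeholders for the update process and the -cc1 process.
    """
    updater = subprocess.Popen(update_cmd)       # does not block
    compile_rc = subprocess.call(compile_cmd)    # blocks until compile ends
    # If the updater has already exited here, it never delayed the build.
    updater_done_first = updater.poll() is not None
    updater_rc = updater.wait()                  # reap it in either case
    return compile_rc, updater_rc, updater_done_first
```

Whether updater_done_first is true in practice is exactly the "with high
probability" assumption; the sketch only makes it observable.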

Cheers,
/Manuel