[llvm-dev] Proposal/patch: simple parallel LTO code generation
Peter Collingbourne via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 12 01:52:49 PDT 2015
The most time-consuming part of LTO at opt level 1 is by far the backend code
generator. (As a reminder, LTO opt level 1 runs a minimal set of passes;
it is most useful where the motivation for using LTO is to deploy
a transformation that requires whole-program visibility, such as control
flow integrity, rather than to optimise the program using whole-program
visibility.) Code generation is in principle embarrassingly parallel, as it
can be partitioned at function granularity. However, there are practical
issues that need to be solved before we can parallelise code generation
for LTO.
The main issue is that the backend currently makes no effort to be thread safe.
This can be overcome by observing that it is unnecessary for the backend to
be thread safe if we arrange for each instance of the backend to operate in
a different LLVMContext. This is the approach that this patch proposes. The
LTO code generator partitions the combined LTO module into sub-modules, each
with its own LLVMContext, and runs the code generator on the sub-modules
in parallel. (Entities in the combined module are partitioned by taking
the modulus of the hash of the name of the entity, or of the name of its
comdat if it has one.) The resulting native object files can be combined by
the linker in the usual way.
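
To make the partitioning scheme concrete, here is a minimal standalone sketch.
It is not the code from the attached patch and uses no LLVM APIs: Entity,
partitionKey, partitionIndex, partition and runInParallel are hypothetical
names chosen for illustration, and std::hash stands in for whatever hash the
real implementation uses. In the real code generator each worker would also
own its own LLVMContext; the sketch only shows the name/comdat keying and the
one-thread-per-partition structure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global value in the combined LTO module.
struct Entity {
  std::string name;
  std::optional<std::string> comdat; // set if the entity belongs to a comdat
};

// Key used for partitioning: the comdat name if there is one, so that all
// members of a comdat land in the same partition; otherwise the entity name.
inline const std::string &partitionKey(const Entity &e) {
  return e.comdat ? *e.comdat : e.name;
}

// Partition index = hash(key) mod number of partitions.
inline std::size_t partitionIndex(const Entity &e, std::size_t numPartitions) {
  assert(numPartitions > 0 && "need at least one partition");
  return std::hash<std::string>{}(partitionKey(e)) % numPartitions;
}

// Distribute the entities over numPartitions buckets.
inline std::vector<std::vector<Entity>>
partition(const std::vector<Entity> &entities, std::size_t numPartitions) {
  std::vector<std::vector<Entity>> parts(numPartitions);
  for (const Entity &e : entities)
    parts[partitionIndex(e, numPartitions)].push_back(e);
  return parts;
}

// Run codegen(partition, index) for every partition, one thread each, and
// wait for all of them. The callback must only touch per-partition state.
template <typename Fn>
void runInParallel(const std::vector<std::vector<Entity>> &parts, Fn codegen) {
  std::vector<std::thread> workers;
  workers.reserve(parts.size());
  for (std::size_t i = 0; i < parts.size(); ++i)
    workers.emplace_back([&parts, &codegen, i] { codegen(parts[i], i); });
  for (std::thread &w : workers)
    w.join();
}
```

Because the key is the comdat name whenever one exists, two entities in the
same comdat always get the same index, whatever the partition count. The
callback passed to runInParallel is where a per-partition backend instance
would be created and run.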
This approach is reasonably effective. In one experiment, an LTO link of
Chromium at LTO opt level 1 on an HP Z620 machine took 15m20s without
parallelism, 8m06s with 4 partitions and 7m27s with 8 partitions.
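
For reference, those timings work out to roughly a 1.9x speedup with 4
partitions and 2.1x with 8. The small sketch below just does that arithmetic;
the helper names are made up for illustration, and the "per-partition" ratio
is only a rough indicator, since the partition count need not match the
number of cores actually used.

```cpp
#include <cstdio>

// Convert a minutes/seconds wall-clock time to seconds.
constexpr double toSeconds(int minutes, int seconds) {
  return 60.0 * minutes + seconds;
}

// Speedup of a parallel run relative to the serial run.
constexpr double speedup(double serialSecs, double parallelSecs) {
  return serialSecs / parallelSecs;
}

// Speedup divided by the partition count (1.0 would be perfect scaling).
constexpr double perPartition(double serialSecs, double parallelSecs,
                              int partitions) {
  return speedup(serialSecs, parallelSecs) / partitions;
}

// Print the figures for the three timings quoted in the message.
inline void printReport() {
  constexpr double serial = toSeconds(15, 20); // 920 s, no parallelism
  constexpr double four = toSeconds(8, 6);     // 486 s, 4 partitions
  constexpr double eight = toSeconds(7, 27);   // 447 s, 8 partitions
  std::printf("4 partitions: %.2fx speedup, %.2f per partition\n",
              speedup(serial, four), perPartition(serial, four, 4));
  std::printf("8 partitions: %.2fx speedup, %.2f per partition\n",
              speedup(serial, eight), perPartition(serial, eight, 8));
}
```

Going from 4 to 8 partitions saves only a further 39 seconds, which suggests
that something other than partitioned code generation accounts for much of
the remaining link time in this experiment.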
I've attached a patch with an initial implementation of this idea for the
gold plugin. If this idea seems reasonable, I'll proceed to clean up the
patch and send it for review on llvm-commits.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 14039 bytes
Desc: not available