[LLVMdev] [Proposal] Parallelize post-IPO stage.

Xinliang David Li xinliangli at gmail.com
Tue Jul 16 13:35:02 PDT 2013

A third approach is to decouple the backend compilation and
parallelism strategy from the partitioning.  The partitioner can
spit out partition BC files and some action records in some standard
format. All of this can be fed into a driver tool that converts
the compilation action file into a make/build file for the underlying
build system of your choice:

1) it can simply be a compiler driver that does thread-level parallelism;
2) or a tool that generates Makefiles which are fed into parallel make
to exploit single-node parallelism;
3) or a tool that generates BUILD files that feed into a distributed
build system (such as Google's blaze:

Another benefit is that it will make compiler debugging easier.
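
A minimal sketch of option 2 above, assuming a hypothetical JSON action
record format (the "input", "output" and "command" fields and the
actions_to_makefile helper are made up for illustration; the proposal
does not define a format). It turns the action records emitted by the
partitioner into Makefile text that parallel make can consume:

```python
import json

# Hypothetical action records, one per partition. The field names are
# assumptions for this sketch, not a format defined in the proposal.
ACTIONS_JSON = """
[
  {"input": "part0.bc", "output": "part0.o",
   "command": ["llc", "-filetype=obj", "part0.bc", "-o", "part0.o"]},
  {"input": "part1.bc", "output": "part1.o",
   "command": ["llc", "-filetype=obj", "part1.bc", "-o", "part1.o"]}
]
"""


def actions_to_makefile(actions):
    """Render a list of action records as Makefile text."""
    outputs = [action["output"] for action in actions]
    # The default target depends on every partition's object file, so
    # "make -jN" can build the partitions concurrently.
    lines = ["all: " + " ".join(outputs), ""]
    for action in actions:
        # One rule per partition: object file depends on its BC file.
        lines.append(f"{action['output']}: {action['input']}")
        # Recipe lines in a Makefile must start with a tab.
        lines.append("\t" + " ".join(action["command"]))
        lines.append("")
    return "\n".join(lines)


makefile_text = actions_to_makefile(json.loads(ACTIONS_JSON))
```

Writing makefile_text to a file and running "make -j8 -f <that file>"
would then give single-node parallelism. A thread-level driver or a
BUILD-file generator could read the same records, which is the point of
keeping the partitioner unaware of the parallelism strategy.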



On Sun, Jul 14, 2013 at 5:56 PM, Andrew Trick <atrick at apple.com> wrote:
> On Jul 12, 2013, at 3:49 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> 3.2 Compile partitions independently
> --------------------------------------
>   There are two camps: one camp advocates compiling partitions via
> multi-process, the other favors multi-thread.
>  Inside Apple compiler teams, I'm the only one belonging to the 1st camp. I
> think while multi-proc sounds a bit red-neck, it has its advantages for this
> purpose, and while multi-thread is certainly more eye-popping, it has its
> advantages as well.
>  The advantages of multi-proc are:
>  1) easier to implement; each process runs in its own address space.
>    We don't need to worry that they can interfere with each other.
>  2) huge, if not unlimited, address space.
>   The disadvantage is that it's expensive. But I guess the cost is
>  almost negligible compared to the overall IPO compilation.
>  The advantages of multi-thread I can imagine are:
>   1) it sounds fancy
>   2) it is light-weight
>   3) inter-thread communication is easier than IPC.
>  Its disadvantages are:
>   1) Oftentimes we will come across a race condition, and it takes an
>      awfully long time to figure it out. While the code is supposed
>      to be multi-thread safe, we might miss some tricky case.
>      Trouble-shooting race conditions is a nightmare.
>   2) Small address space. This is a big problem if the compiler
>      is built 32-bit. In that case, the compiler is not able to bring
>      lots of stuff into memory even if the HW does
>      provide ample mem.
>   3) The thread-safe run-time lib is more expensive.
>      I once linked a compiler using -lpthread (I did not have to) on a
>      UNIX platform, and saw the compiler slow down by about 1/3.
>    I'm not able to convince the folks in the other camp, and neither are they
> able to convince me. I decided to implement both. Fortunately, this
> part is not difficult; it seems to be rather easy to crank out one within
> a short period of time. It would be interesting to compare them side-by-side,
> and see which camp loses :-). On the other hand, if we run into race-condition
> problems, we can choose the multi-proc version as a fall-back.
>
> While I am a self-proclaimed multi-process red-neck, in this case I would
> prefer to see a multi-threaded implementation because I want to verify that
> LLVMContext can be used as advertised. I'm sure some extra care will be
> needed to report failures/diagnostics, but we should start with the
> assumption that this approach is not significantly harder than multi-process
> because that's how we advertise the design.
>
> If any of the multi-threaded disadvantages you point out are real, I would
> like to find out about it.
>
> 1. Race Conditions: We should be able to verify that the thread-parallel vs.
> sequential or multi-process compilation generate the same result. If they
> diverge, we would like to know about the bug so it can be fixed--independent
> of LTO.
>
> 2. Small Address Space with LTO. We don't need to design around this
> hypothetical case.
>
> 3. Expensive thread-safe runtime lib. We should not speculate that platforms
> that we, as the LLVM community, care about have this problem. Let's assume
> that our platforms are well implemented unless we have data to the contrary.
> (Personally, I would even love to use TLS in the compiler to vastly simplify
> API design in the backend, but I am not going to be popular for saying so).
>
> We should be able to decompose each step of compilation for debugging. So
> the multi-process "implementation" should just be a degenerate form of
> threading with a bit of driver magic if you want to automate it.
>
> -Andy

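The check in the quoted point 1 (thread-parallel vs. sequential
compilation should generate the same result) could be automated roughly
as below. This is only a sketch: compile_partition is a hypothetical
stand-in for the real backend (here it just hashes the partition bytes),
and the function names are invented for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def compile_partition(bitcode: bytes) -> str:
    """Hypothetical stand-in for the backend: a pure function of its input."""
    return hashlib.sha256(bitcode).hexdigest()


def compile_sequential(partitions):
    """Compile the partitions one after another."""
    return [compile_partition(p) for p in partitions]


def compile_threaded(partitions, workers=4):
    """Compile the partitions on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the two result
        # lists can be compared position by position.
        return list(pool.map(compile_partition, partitions))


def results_match(partitions, workers=4):
    """True if threaded and sequential compilation agree on every partition."""
    return compile_sequential(partitions) == compile_threaded(partitions, workers)
```

With a real backend in place of compile_partition, a False result from
results_match would point at a thread-safety bug worth fixing
independent of LTO. A multi-process run could be compared against the
same sequential baseline.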