[LLVMdev] [Proposal] Parallelize post-IPO stage.

Mon Jul 15 10:16:24 PDT 2013

On 7/14/13 5:56 PM, Andrew Trick wrote:
>
> On Jul 12, 2013, at 3:49 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> 3.2 Compile partitions independently
>> --------------------------------------
>>
>>   There are two camps: one camp advocates compiling partitions via
>> multi-process, the other favors multi-thread.
>>
>>  Inside Apple compiler teams, I'm the only one who belongs to the 1st
>> camp. I think while multi-proc sounds a bit red-neck, it has its
>> advantages for this purpose, and while multi-thread is certainly more
>> eye-popping, it has its advantages as well.
>>
>>  The advantages of multi-proc are:
>>  1) Easier to implement: each process runs in its own address space,
>>    so we don't need to worry that they interfere with each other.
>>
>>  2) Huge, if not unlimited, address space.
>>
>>   The disadvantage is that it's expensive. But I guess the cost is
>>  almost negligible compared to the overall IPO compilation.
>>
>>  The advantages of multi-thread I can imagine are:
>>   1) it sounds fancy
>>   2) it is light-weight
>>   3) inter-thread communication is easier than IPC.
>>
>>  Its disadvantages are:
>>   1) Oftentimes we will come across a race condition, and it takes
>>      an awfully long time to figure it out. While the code is supposed
>>      to be multi-thread safe, we might miss some tricky case.
>>      Trouble-shooting a race condition is a nightmare.
>>
>>   2) Small address space. This is a big problem if the compiler
>>      is built 32-bit. In that case, the compiler is not able to bring
>>      lots of stuff into memory even if the HW does
>>      provide ample mem.
>>
>>   3) The thread-safe run-time lib is more expensive.
>>      I once linked a compiler using -lpthread (I did not have to) on a
>>      UNIX platform, and saw the compiler slow down by about 1/3.
>>
>>    I'm not able to convince the folks in the other camp, neither are they
>> able to convince me. I decided to implement both. Fortunately, this
>> part is not difficult; it seems rather easy to crank out one within a
>> short period of time. It would be interesting to compare them
>> side-by-side and see which camp loses :-). On the other hand, if we run
>> into race-condition problems, we can choose the multi-proc version as a
>> fall-back.
>
> While I am a self-proclaimed multi-process red-neck, in this case I 
> would prefer to see a multi-threaded implementation because I want to 
> verify that LLVMContext can be used as advertised. I'm sure some extra 
> care will be needed to report failures/diagnostics, but we should 
> start with the assumption that this approach is not significantly 
> harder than multi-process because that's how we advertise the design.
>
> If any of the multi-threaded disadvantages you point out are real, I 
> would like to find out about it.
>
> 1. Race Conditions: We should be able to verify that thread-parallel 
> and sequential or multi-process compilation generate the same 
> result. If they diverge, we would like to know about the bug 
> so it can be fixed--independent of LTO.
>
> 2. Small Address Space with LTO. We don't need to design around this 
> hypothetical case.
>
> 3. Expensive thread-safe runtime lib. We should not speculate that 
> platforms that we, as the LLVM community, care about have this 
> problem. Let's assume that our platforms are well implemented unless 
> we have data to the contrary. (Personally, I would even love to use 
> TLS in the compiler to vastly simplify API design in the backend, but 
> I am not going to be popular for saying so).
>
> We should be able to decompose each step of compilation for debugging.
Yes, of course, we should be able to save the IR before and after each
major step, and when trouble-shooting, we should be able to focus on one
smaller partition. Once the problem in the partition is fixed, we can
manually link all the partitions and other libs into the final
executable/dyn-lib.

This is one important reason for partitioning.
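
To illustrate the "save the IR before and after each major step" idea, here
is a minimal sketch in Python. It is not LLVM code: the function name
run_with_snapshots, the file naming scheme, and the toy strip_comments step
are all assumptions made up for illustration. A real driver would run actual
optimization and codegen steps on one partition.

```python
import tempfile
from pathlib import Path


def run_with_snapshots(partition_name, ir_text, steps, dump_dir):
    """Run each (step_name, step_fn) on ir_text, saving the IR before and after.

    Hypothetical helper: the snapshots let us re-run a single step on a
    single partition when trouble-shooting.
    """
    dump_dir = Path(dump_dir)
    dump_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, (step_name, step_fn) in enumerate(steps):
        before = dump_dir / f"{partition_name}.{index}.{step_name}.before.ll"
        before.write_text(ir_text)
        ir_text = step_fn(ir_text)
        after = dump_dir / f"{partition_name}.{index}.{step_name}.after.ll"
        after.write_text(ir_text)
        saved += [before, after]
    return ir_text, saved


def strip_comments(ir_text):
    """Toy stand-in for a real step: drop lines that start with ';'."""
    kept = [line for line in ir_text.splitlines()
            if not line.lstrip().startswith(";")]
    return "\n".join(kept)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result, files = run_with_snapshots(
            "part0", "; a comment\ndefine void @f()",
            [("strip", strip_comments)], tmp)
        print(result)
        print([f.name for f in files])
```

With the snapshots on disk, a failing partition can be replayed one step at
a time without rebuilding the other partitions.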

> So the multi-process "implementation" should just be a degenerate form 
> of threading with a bit of driver magic if you want to automate it.
>
>
Yes!  That is why I'd like to implement both.

There is one difference though: with the multi-proc implementation, we
need to pass the /path/to/{llc/opt} to the libLTO.{so|dylib}, such that
it can invoke these tools from the right place. In the multi-thread
implementation, we don't need this info.
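
The difference in driver shape can be sketched as follows. This is a rough
Python illustration under loud assumptions, not the libLTO interface: the
"compiler" is a toy function that upper-cases its input, and the external
tool is the Python interpreter itself standing in for llc/opt. The point is
only that the multi-process variant takes a tool_path argument while the
multi-thread variant does not.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def compile_partition(text):
    """Toy in-process 'compile' step (assumption: stands in for real codegen)."""
    return text.upper()


# Toy program run by the external tool; it does the same transformation.
TOOL_SCRIPT = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def compile_in_threads(partitions):
    """Multi-thread variant: no tool path needed, work stays in-process."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compile_partition, partitions))


def compile_in_processes(partitions, tool_path):
    """Multi-process variant: the driver must be told where the tool lives."""
    procs = [
        subprocess.Popen([tool_path, "-c", TOOL_SCRIPT],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         text=True)
        for _ in partitions
    ]
    results = []
    for proc, text in zip(procs, partitions):
        out, _ = proc.communicate(text)
        if proc.returncode != 0:
            raise RuntimeError(f"tool failed with exit code {proc.returncode}")
        results.append(out)
    return results


if __name__ == "__main__":
    parts = ["define @a", "define @b"]
    threaded = compile_in_threads(parts)
    forked = compile_in_processes(parts, sys.executable)
    # Both variants should produce identical output for every partition.
    print(threaded == forked)
```

Comparing the two result lists partition by partition is also a cheap way to
check that the threaded and multi-process builds agree.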
