[LLVMdev] [LLVM Dev] [Discussion] Function-based parallel LLVM backend code generation
Shuxin Yang
shuxin.llvm at gmail.com
Tue Jul 16 11:02:52 PDT 2013
In addition to the concerns Chandler pointed out,
I'm curious about two comparisons:
the execution time of pristine llc vs. the modified llc with -thd=1, and
the execution time of pristine clang vs. clang linked with the modified llc.
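
To make that comparison concrete, here is a minimal sketch of a timing
harness. It is only an illustration: the binary names and the input file in
the usage comment are hypothetical, and -thd=1 is the flag proposed in the
patch, not an option of stock llc.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Return the median wall-clock time (seconds) of running cmd `runs` times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken build is not timed.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def overhead(baseline_cmd, candidate_cmd, runs=5):
    """Relative slowdown of candidate vs. baseline (0.05 means 5% slower)."""
    base = time_command(baseline_cmd, runs)
    cand = time_command(candidate_cmd, runs)
    return (cand - base) / base


# Hypothetical usage (paths and file names are assumptions):
#   overhead(["./llc-pristine", "big.bc", "-o", "out.s"],
#            ["./llc-modified", "-thd=1", "big.bc", "-o", "out.s"])
```

The median is used rather than the mean so that a single noisy run does not
dominate the result.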
Thanks
On 7/16/13 3:46 AM, Chandler Carruth wrote:
> While I think the end goal you're describing is close to the correct
> one, I see the high-level strategy for getting there somewhat
> differently:
>
> 1) The code generators are only one collection of function passes that
> might be parallelized. Many others might also be parallelized
> profitably. The design for parallelism within LLVM's pass management
> infrastructure should be sufficiently generic to express all of these
> use cases.
>
> 2) The idea of having multiple pass managers necessitates (unless I
> misunderstand) duplicating a fair amount of state. For example, the
> caches in immutable analysis passes would no longer be shared, etc. I
> think that is really unfortunate, and would prefer instead to use
> parallelizing pass managers that are in fact responsible for the
> scheduling of passes.
>
> 3) It doesn't provide a strategy for parallelizing the leaves of a
> CGSCC pass manager which is where a significant portion of the
> potential parallelism is available within the middle end.
>
> 4) It doesn't deal with the (numerous) parts of LLVM that are not
> actually thread safe today. They may happen to work with the code
> generators you happen to test, but there is no guarantee.
> Notable things to think about here are computing new types, the
> use-def lists of globals, commandline flags, and static state
> variables. While our intent has been to avoid problems with the last
> two that could preclude parallelism, it seems unlikely that we have
> succeeded without thorough testing to this point. Instead, I fear we
> have leaned heavily on the crutch of one-thread-per-LLVMContext.
>
> 5) It adds more complexity onto the poorly designed pass manager
> infrastructure. Personally, I think that cleanups to the design and
> architecture of the pass manager should be prioritized above adding
> new functionality like parallelism. However, so far no one has really
> had time to do this (including myself). While I would like to have
> time in the future to do this, as with everything else in OSS, it
> won't be real until the patches start flowing.
>
>
> On Tue, Jul 16, 2013 at 3:33 AM, Wan, Xiaofei <xiaofei.wan at intel.com> wrote:
>
> Hi, community:
>
> To meet a business need, I want to enable
> "function-based parallel code generation" to speed up the
> compilation of a single module. Please see the design details
> below and give feedback on the following points. Thanks!
> 1. Is this idea a proper solution for my requirement?
> 2. The new feature is enabled by llc -thd=N and has no
> impact on the original llc when -thd=1.
> 3. Can this new llc feature be accepted by the community and merged
> into the LLVM code tree?
>
> Patches
> The patch is divided into four separate parts; the all-in-one
> patch can be found here:
> http://llvm-reviews.chandlerc.com/D1152
>
> Design
> https://docs.google.com/document/d/1QSkP6AumMCAVpgzwympD5pI3btPJt4SRgjY-vhyfySg/edit?usp=sharing
>
>
> Background
> 1. Our business needs to compile C/C++ source files into LLVM IR
> and link them into one big BC file; the big BC file is then compiled
> into binary code on different arch/target devices.
> 2. Backend code generation is a time-consuming activity that happens
> on the target device, which makes it important to the user experience.
> 3. make -j or file-based parallelism can't help here since there
> is only one big BC file; function-based parallel LLVM backend code
> generation is a good way to improve compilation time because it
> can fully utilize multiple cores.
>
> Overall design strategy and goal
> 1. Generate exactly the same binary as the single-threaded output
> 2. No impact on single-thread performance and conformance
> 3. Little impact on LLVM code infrastructure
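
Goal 1 above (identical output regardless of thread count) can be sketched
as follows. This is not the design from the patch, only an illustration of
one way to get deterministic output: compile functions concurrently but emit
the results in module order. compile_function is a hypothetical stand-in for
real per-function code generation.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_function(name, body):
    """Hypothetical stand-in for per-function code generation."""
    return f"{name}:\n  ; {len(body)} IR instructions\n"


def codegen_module(functions, num_threads=1):
    """Compile (name, body) pairs concurrently; emit them in module order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in input order, regardless of which
        # worker finishes first, so the concatenated output is deterministic.
        chunks = pool.map(lambda f: compile_function(*f), functions)
        return "".join(chunks)
```

Python threads here only illustrate the ordering guarantee; they say nothing
about the thread-safety issues in LLVM itself that are raised in point 4 of
the reply.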
>
> Current status and test result
> 1. Parallel llc generates the same code as single-threaded llc, as
> checked with "objdump -d"; it passes a 10-hour stress test across all
> performance benchmarks
> 2. Parallel llc gives a ~2.9X speedup on a Xeon server
> with 4 threads
>
>
> Thanks
> Wan Xiaofei
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu
> http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev