[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead
tanmx_star at yeah.net
Sun Mar 17 21:54:18 PDT 2013
I am interested in the Polly project. Polly seems to be a very promising tool for finding program parallelization opportunities on top of the LLVM infrastructure. However, I find that Polly's analysis and optimization can consume significant compile time, so I propose a GSoC project to reduce Polly's compile-time overhead, and I hope my work can make Polly more applicable for all LLVM users.
I have done some preliminary experiments. My experimental environment is built as follows:
First, I build LLVM, Clang and Polly with the -O3 option, so that all of these tools run at their best;
Second, I evaluate the compile time of Polybench using the following configurations:
Clang-O3: clang -O3 (the basic clang without polly)
Polly-basic: clang -Xclang -load -Xclang LLVMPolly.so -O3 (loads Polly but does not run its optimizations)
Polly-optimize: clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -O3 (runs Polly optimization)
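The three configurations above can be timed with a small harness like the following sketch. The benchmark file (2mm.c) and the LLVMPolly.so path are assumptions; substitute your own paths.

```shell
# Hypothetical timing harness for the three configurations above.
# time_ms runs a command and prints its wall-clock time in milliseconds.
time_ms() {
  local start end
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Usage against one Polybench file (uncomment with real paths):
# time_ms clang -O3 -c 2mm.c -o /dev/null
# time_ms clang -Xclang -load -Xclang LLVMPolly.so -O3 -c 2mm.c -o /dev/null
# time_ms clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -O3 -c 2mm.c -o /dev/null
time_ms sleep 0.1   # smoke test: prints roughly 100
```

Averaging over several runs would reduce noise; a single run is shown here for brevity.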
The preliminary experimental results are as follows (benchmarks are collected from Polybench):

| Benchmark | Clang-O3 (seconds) | Polly-basic (seconds) | Polly-optimize (seconds) | Polly-load overhead | Polly-optimize overhead |
| 2mm.c | 0.786 | 0.802 | 1.944 | 2.0% | 147.3% |
| correlation.c | 0.782 | 0.792 | 2.005 | 1.3% | 156.4% |
| gesummv.c | 0.583 | 0.603 | 1.08 | 3.4% | 85.2% |
| ludcmp.c | 0.787 | 0.806 | 2.475 | 2.4% | 214.5% |
| 3mm.c | 0.786 | 0.811 | 2.617 | 3.2% | 233.0% |
| covariance.c | 0.73 | 0.74 | 2.294 | 1.4% | 214.2% |
| gramschmidt.c | 0.63 | 0.643 | 1.134 | 2.1% | 80.0% |
| seidel.c | 0.632 | 0.645 | 2.036 | 2.1% | 222.2% |
| adi.c | 0.8 | 0.811 | 3.044 | 1.4% | 280.5% |
| doitgen.c | 0.742 | 0.752 | 2.32 | 1.3% | 212.7% |
| instrument.c | 0.445 | 0.45 | 0.495 | 1.1% | 11.2% |
| atax.c | 0.614 | 0.627 | 1.007 | 2.1% | 64.0% |
| gemm.c | 0.721 | 0.74 | 1.327 | 2.6% | 84.0% |
| jacobi-2d-imper.c | 0.721 | 0.735 | 2.211 | 1.9% | 206.7% |
| bicg.c | 0.577 | 0.597 | 1.01 | 3.5% | 75.0% |
| gemver.c | 0.799 | 0.857 | 1.296 | 7.3% | 62.2% |
| lu.c | 0.68 | 0.702 | 1.132 | 3.2% | 66.5% |
| Average | | | | 2.49% | 142.10% |
The experimental results show that Polly analysis and optimization lead to 142% extra compile-time overhead on average, which may be unacceptable when building many large software projects. Reducing the compile time of Polly's analysis and optimization is therefore an urgent task.
My plan for this proposal is to reduce Polly's compile-time overhead step by step:
1) Investigate the details of Polly to find out how much time is spent on analysis and how much on optimization. It is also very important to distinguish the overhead on code where Polly is applicable from the overhead on code where it is not.
2) Profile Polly to locate the hot code and the sources of compile-time overhead; based on the profiling, I will try to rewrite the hot code to speed up the compilation process.
3) Remove expensive analysis passes. For example, scop detection currently requires both the post-dominance analysis and the dominance-frontier analysis. Not requiring these up front (or at all) would be a great improvement.
4) Reduce the canonicalization passes scheduled before Polly. Before running Polly, we currently schedule a set of passes to canonicalize the LLVM-IR on which the scop detection is run. If I can reduce the number of preparation passes, the compile-time overhead can be reduced.
5) Find other ways to reduce the compile-time overhead.
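As a starting point for steps 1) and 2), per-pass timings can be obtained with Clang's -ftime-report flag, or at the middle-end level with opt's -time-passes flag. This is a sketch only; the LLVMPolly.so path and the 2mm.c input are assumptions, and the script skips gracefully when they are missing.

```shell
# Hypothetical per-pass profiling run (paths are assumptions).
POLLY_SO=${POLLY_SO:-LLVMPolly.so}
SRC=${SRC:-2mm.c}
if command -v clang >/dev/null 2>&1 && [ -f "$POLLY_SO" ] && [ -f "$SRC" ]; then
  # Full pipeline with Polly loaded: per-pass wall-clock breakdown.
  clang -Xclang -load -Xclang "$POLLY_SO" -mllvm -polly -O3 \
        -ftime-report -c "$SRC" -o /dev/null
  # Isolate the middle end: emit IR once, then time opt's passes alone,
  # which separates Polly's cost from Clang's own front-end cost.
  clang -O3 -emit-llvm -c "$SRC" -o input.bc
  opt -load "$POLLY_SO" -polly -O3 -time-passes input.bc -o /dev/null
else
  echo "toolchain or inputs missing; skipping profiling run"
fi
```

Comparing the -time-passes output with and without -polly should directly attribute the 142% overhead to individual passes.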
I hope you can give me further advice.