[LLVMdev] [Polly] Compile-time and Execution-time analysis for the SCEV canonicalization
tanmx_star at yeah.net
Sun Sep 8 11:03:45 PDT 2013
I have run some basic experiments on the Polly canonicalization passes and found that SCEV canonicalization has a significant impact on both compile-time and execution-time performance.
Detailed results for SCEV and default canonicalization can be viewed at: http://126.96.36.199:8000/db_default/v4/nts/32 (or 33, 34)
*pNoGen with SCEV canonicalization (run 32): -O3 -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none -mllvm -polly-codegen-scev
*pNoGen with default canonicalization (run 33): -O3 -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none
*pBasic without any canonicalization (run 34): -O3 -Xclang -load -Xclang LLVMPolly.so
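To make the differences between the three configurations easier to see, here is a minimal sketch that assembles each command line from the flags listed above. The configuration labels ("pNoGen-scev", "pNoGen-default", "pBasic"), the helper names, the compiler name "clang" and the "-c <source>" suffix are assumptions made for illustration; only the flags themselves come from the runs above.

```python
import shlex

# Flags copied from the three runs above.
POLLY_LOAD = ["-Xclang", "-load", "-Xclang", "LLVMPolly.so"]
POLLY_NOGEN = [
    "-mllvm", "-polly",
    "-mllvm", "-polly-optimizer=none",
    "-mllvm", "-polly-code-generator=none",
]

# The keys are hypothetical labels, not names used by Polly or LNT.
CONFIGS = {
    "pNoGen-scev": ["-O3"] + POLLY_LOAD + POLLY_NOGEN + ["-mllvm", "-polly-codegen-scev"],
    "pNoGen-default": ["-O3"] + POLLY_LOAD + POLLY_NOGEN,
    "pBasic": ["-O3"] + POLLY_LOAD,
}


def build_command(config, source, compiler="clang"):
    """Return the argument list for compiling `source` under `config`.

    The compiler name and the "-c" suffix are assumptions.
    """
    return [compiler] + CONFIGS[config] + ["-c", source]


def extra_flags(config, baseline):
    """Return the flags in `config` that do not occur in `baseline`."""
    return [flag for flag in CONFIGS[config] if flag not in CONFIGS[baseline]]


if __name__ == "__main__":
    for name in CONFIGS:
        print(name + ": " + shlex.join(build_command(name, "test.c")))
    print(extra_flags("pNoGen-scev", "pNoGen-default"))
```

Comparing "pNoGen-scev" against "pNoGen-default" this way leaves only -polly-codegen-scev, which is the single switch that separates run 32 from run 33.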
Impact of SCEV canonicalization:
Impact of default canonicalization:
Comparison of SCEV canonicalization with default canonicalization:
As expected, both SCEV canonicalization and default canonicalization increase compile time, by at most 30%. Both also lead to some execution-time regressions as well as improvements.
The only difference between SCEV canonicalization and default canonicalization is the "IndVarSimplify" pass, as shown in RegisterPasses.cpp:212:
However, it is interesting to compare SCEV canonicalization with default canonicalization directly (http://188.8.131.52:8000/db_default/v4/nts/32?compare_to=33&baseline=33):
First, we would expect SCEV canonicalization to have better compile-time performance, since it avoids the "IndVarSimplify" pass. Indeed, it improves compile time by more than 5% for 32 benchmarks, especially for the following benchmarks:
Second, SCEV canonicalization shows both regressions and improvements in execution-time performance compared with default canonicalization. There are many execution-time regressions, such as:
as well as many execution-time improvements such as:
I think the execution-time regressions are mainly due to the unexpected performance improvements from non-SCEV canonicalization, as shown in the following bug: http://llvm.org/bugs/show_bug.cgi?id=17153. As a next step, I will try to find out why "IndVarSimplify" produces better code. If we can eliminate the "IndVarSimplify" canonicalization while still producing high-quality code, then we can gain compile-time performance without losing execution-time performance.
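For anyone who wants to repeat the comparison of run 32 against run 33 outside the web interface, the 5% threshold used above can be sketched as follows. This is only an illustration: the function names and the input format (a mapping from benchmark name to a time in seconds) are assumptions, not what the LNT server provides, and no measured data is included.

```python
def relative_change(new, old):
    """Relative change of `new` against the baseline `old` (negative means faster)."""
    return (new - old) / old


def improved_benchmarks(scev_times, default_times, threshold=0.05):
    """List benchmarks whose time under SCEV canonicalization is more than
    `threshold` lower than under default canonicalization.

    Both arguments map a benchmark name to a time in seconds (assumed format).
    Benchmarks missing from either run are skipped.
    """
    improved = []
    for name in sorted(scev_times.keys() & default_times.keys()):
        change = relative_change(scev_times[name], default_times[name])
        if change < -threshold:
            improved.append((name, change))
    return improved
```

Calling improved_benchmarks with compile times gives the benchmarks that gain more than 5% from skipping "IndVarSimplify". Swapping the two arguments, or passing execution times, lists the regressions in the same way.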