[llvm-dev] (RFC) Adjusting default loop fully unroll threshold
Dehao Chen via llvm-dev
llvm-dev at lists.llvm.org
Mon Jan 30 10:49:03 PST 2017
Currently, the full loop unroller shares the same default threshold as the
dynamic unroller and the partial unroller. This seems conservative because,
unlike dynamic/partial unrolling, full unrolling will not affect LSD (loop
stream detector) or ICache performance. In https://reviews.llvm.org/D28368, I
proposed doubling the threshold for the full loop unroller. This changes the
codegen of several SPEC CPU benchmarks:
Code size:
  447.dealII     0.50%
  453.povray     0.42%
  433.milc       0.20%
  445.gobmk      0.32%
  403.gcc        0.05%
  464.h264ref    3.62%

Compile time:
  447.dealII     0.22%
  453.povray    -0.16%
  433.milc       0.09%
  445.gobmk     -2.43%
  403.gcc        0.06%
  464.h264ref    3.21%

Performance (on Intel Sandy Bridge):
  447.dealII    +0.07%
  453.povray    +1.79%
  433.milc      +1.02%
  445.gobmk     +0.56%
  403.gcc       -0.16%
  464.h264ref   -0.41%
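
As a rough way to summarize the figures above, the sketch below takes a plain arithmetic mean of each metric over the six listed benchmarks. This is only an illustration: the helper `meanPercent` and the constant names are made up for this example, the mean is not a SPEC score, and it assumes positive values mean an increase for code size and compile time and an improvement for performance.

```cpp
#include <cstddef>

// Arithmetic mean of N percentage deltas.
// A simple summary over the listed benchmarks only, not a SPEC score.
template <std::size_t N>
constexpr double meanPercent(const double (&deltas)[N]) {
  double sum = 0.0;
  for (std::size_t i = 0; i < N; ++i)
    sum += deltas[i];
  return sum / static_cast<double>(N);
}

// Values copied from the figures above, in this order:
// 447.dealII, 453.povray, 433.milc, 445.gobmk, 403.gcc, 464.h264ref.
constexpr double kCodeSize[]    = {0.50, 0.42, 0.20, 0.32, 0.05, 3.62};
constexpr double kCompileTime[] = {0.22, -0.16, 0.09, -2.43, 0.06, 3.21};
constexpr double kPerformance[] = {0.07, 1.79, 1.02, 0.56, -0.16, -0.41};

constexpr double kMeanCodeSize    = meanPercent(kCodeSize);     // about 0.85
constexpr double kMeanCompileTime = meanPercent(kCompileTime);  // about 0.17
constexpr double kMeanPerformance = meanPercent(kPerformance);  // about 0.48
```

By this measure the six benchmarks gain a little under half a percent on average, with most of the code size and compile time cost coming from 464.h264ref.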
The change appears to have an overall positive performance impact with very
small code size and compile time overhead. The question now is whether to make
this change the default at O2 or to keep it at O3 only. We would like more
input from the community before making the decision.
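
For readers less familiar with the mechanism, here is a minimal sketch of how a size threshold gates full unrolling and what doubling it changes. It is a deliberately simplified model, not LLVM's actual cost model: the function names, the "body cost times trip count" estimate, and the threshold values 150 and 300 are all assumptions chosen for illustration.

```cpp
// Simplified, hypothetical model of a full-unroll size check.
// This is NOT LLVM's real cost model; names and numbers are illustrative.

// Estimated size after full unrolling: one copy of the body per iteration.
constexpr unsigned unrolledCost(unsigned bodyCost, unsigned tripCount) {
  return bodyCost * tripCount;
}

// Fully unroll only when the trip count is a known nonzero constant and the
// estimated unrolled size fits within the threshold.
constexpr bool shouldFullyUnroll(unsigned bodyCost, unsigned tripCount,
                                 unsigned threshold) {
  return tripCount != 0 && unrolledCost(bodyCost, tripCount) <= threshold;
}

// Hypothetical thresholds: a shared baseline and a doubled full-unroll value.
constexpr unsigned kBaseThreshold = 150;
constexpr unsigned kDoubledFullThreshold = 2 * kBaseThreshold;

// A loop with body cost 20 and 10 iterations has an estimated cost of 200:
// rejected under the baseline, accepted once the threshold is doubled.
static_assert(!shouldFullyUnroll(20, 10, kBaseThreshold),
              "too large for the baseline threshold");
static_assert(shouldFullyUnroll(20, 10, kDoubledFullThreshold),
              "fits within the doubled threshold");
```

Under this model, the loops affected by the proposal are those whose estimated unrolled size falls between the old and the doubled threshold. Partial and dynamic unrolling would keep using the baseline value.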
Thanks,
Dehao