[PATCH] Optimize unrolled reductions in LoopStrengthReduce
Olivier Sallenave
ohsallen at us.ibm.com
Fri Feb 6 11:33:01 PST 2015
Here is a simpler solution: when an inner loop containing reductions gets unrolled, the loop vectorizer should unroll the outer loop to break the resulting dependencies. For the code below, this does not happen because the loop is no longer considered 'small' once the inner loop has been unrolled. The attached patch changes the heuristics in the vectorizer's unroller and gives a 2x speedup for this code on POWER8. If it looks good to you, I will add this code as a regression test.
for (int i = 0; i < n; i++) {
  for (int i_c = 0; i_c < 3; i_c++) {
    _Complex __attribute__ ((aligned (8))) at = a[i][i_c];
    sum += ((__real__(at))*(__real__(at)) + (__imag__(at))*(__imag__(at)));
  }
}
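To illustrate what "unroll the outer loop and break dependencies" means here, the following is a hand-written sketch in C, not the output of the patch. The function names (row_norm, norm_sum_serial, norm_sum_interleaved) and the unroll factor of 2 are assumptions chosen for illustration. The first function keeps a single accumulator, so every addition waits on the previous one; the second processes two outer iterations per trip with two independent partial sums.

```c
#include <stddef.h>

/* Squared magnitudes of one row of 3 complex values (inner loop fully unrolled).
   Uses the GCC/Clang __real__/__imag__ extensions, as the original snippet does. */
static double row_norm(double _Complex *row) {
    return __real__(row[0]) * __real__(row[0]) + __imag__(row[0]) * __imag__(row[0])
         + __real__(row[1]) * __real__(row[1]) + __imag__(row[1]) * __imag__(row[1])
         + __real__(row[2]) * __real__(row[2]) + __imag__(row[2]) * __imag__(row[2]);
}

/* Single accumulator: each update of sum depends on the previous update. */
double norm_sum_serial(size_t n, double _Complex a[][3]) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += row_norm(a[i]);
    return sum;
}

/* Outer loop unrolled by 2 (illustrative factor) with two partial sums,
   so the two additions in each trip do not depend on each other. */
double norm_sum_interleaved(size_t n, double _Complex a[][3]) {
    double sum0 = 0.0, sum1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        sum0 += row_norm(a[i]);
        sum1 += row_norm(a[i + 1]);
    }
    if (i < n)                 /* leftover iteration when n is odd */
        sum0 += row_norm(a[i]);
    return sum0 + sum1;        /* combine the partial sums once at the end */
}
```

Splitting the sum reorders the floating-point additions, so a compiler can only do this on its own when reassociation is permitted (for example under fast-math style flags).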
F380054: patch.diff <http://reviews.llvm.org/F380054>
http://reviews.llvm.org/D7128