[PATCH] Optimize unrolled reductions in LoopStrengthReduce

Olivier Sallenave ohsallen at us.ibm.com
Fri Feb 6 11:33:01 PST 2015

Here is a simpler solution: when the inner loop contains reductions and gets unrolled, the loop vectorizer should unroll the outer loop and break the dependencies between the reduction operations. For the code below, this does not happen because the loop is no longer considered 'small'. Attached is a patch that changes the heuristics in the vectorizer unroller and gives a 2x speedup for this code on POWER8. If the patch looks good to you, I will add this code as a regression test.

  for(int i=0; i<n; i++) {
    for(int i_c=0; i_c<3; i_c++) {
      _Complex __attribute__ ((aligned (8))) at = a[i][i_c];
      sum += ((__real__(at))*(__real__(at)) + (__imag__(at))*(__imag__(at)));
    }
  }
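
To illustrate what "unroll the outer loop and break dependencies" means at the source level, here is a minimal hand-written sketch. It is not taken from the patch: the function names are hypothetical, and each complex value is modeled as a pair of doubles (a[i][c][0] real, a[i][c][1] imaginary) so the example stays plain C. The second version unrolls the outer loop by 2 and gives each copy its own partial sum, so the two addition chains no longer depend on each other. Because this reassociates floating-point additions, a compiler would normally apply it to floating-point reductions only when reassociation is permitted (for example under fast-math).

```c
#include <stddef.h>

/* Hypothetical baseline: every multiply-add feeds the same accumulator,
   so the additions form one serial dependency chain. */
double norm2_single(double (*a)[3][2], size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        for (int c = 0; c < 3; c++) {
            sum += a[i][c][0] * a[i][c][0] + a[i][c][1] * a[i][c][1];
        }
    }
    return sum;
}

/* Hypothetical unrolled form: the outer loop advances two rows at a time
   and each row accumulates into its own partial sum. */
double norm2_interleaved(double (*a)[3][2], size_t n) {
    double sum0 = 0.0, sum1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        for (int c = 0; c < 3; c++) {
            sum0 += a[i][c][0] * a[i][c][0] + a[i][c][1] * a[i][c][1];
            sum1 += a[i + 1][c][0] * a[i + 1][c][0]
                  + a[i + 1][c][1] * a[i + 1][c][1];
        }
    }
    /* Remainder iteration when n is odd. */
    for (; i < n; i++) {
        for (int c = 0; c < 3; c++) {
            sum0 += a[i][c][0] * a[i][c][0] + a[i][c][1] * a[i][c][1];
        }
    }
    /* Combine the partial sums once, after the loop. */
    return sum0 + sum1;
}
```

Both functions return the same value whenever the intermediate sums are exactly representable; in general the results can differ in the last bits because the order of the additions changes.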

F380054: patch.diff <http://reviews.llvm.org/F380054>
