[PATCH] D34619: [ARM] Enable partial and runtime unrolling

Sam Parker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 6 03:07:03 PDT 2017


samparker added a comment.

Hi Eli,

Your comments make sense to me, so I ran an example to figure out if this heuristic was indeed nonsense. Here's the example kernel:

  for (unsigned i = 0; i < max; ++i) {
    acc = 0;
    innerMax = dataSize - i;
    for (unsigned j = 0; j < innerMax; ++j) {
      acc += (input[j] * input[i+j]) >> scaleValue;
    }
  }
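
For reference, here is a hand-written sketch of what runtime unrolling the inner loop by a factor of 4 roughly amounts to. It is only an illustration, not the code the unroller emits: the function names, the int32_t element type, the unroll factor and the placement of the remainder loop after the main loop are all assumptions made for the example.

```c
#include <stdint.h>

/* Rolled form of the inner loop from the kernel above.
 * The signature and the int32_t element type are assumptions. */
int32_t inner_rolled(const int32_t *input, unsigned i,
                     unsigned innerMax, unsigned scaleValue) {
    int32_t acc = 0;
    for (unsigned j = 0; j < innerMax; ++j)
        acc += (input[j] * input[i + j]) >> scaleValue;
    return acc;
}

/* Same loop unrolled by 4 by hand. Because innerMax is only known at
 * run time, a remainder loop handles the trailing innerMax % 4
 * iterations. The factor of 4 is chosen purely for illustration. */
int32_t inner_unrolled4(const int32_t *input, unsigned i,
                        unsigned innerMax, unsigned scaleValue) {
    int32_t acc = 0;
    unsigned j = 0;
    unsigned mainEnd = innerMax - (innerMax % 4);

    /* Main body: four iterations per trip, one compare and branch. */
    for (; j < mainEnd; j += 4) {
        acc += (input[j]     * input[i + j])     >> scaleValue;
        acc += (input[j + 1] * input[i + j + 1]) >> scaleValue;
        acc += (input[j + 2] * input[i + j + 2]) >> scaleValue;
        acc += (input[j + 3] * input[i + j + 3]) >> scaleValue;
    }

    /* Remainder loop: at most three leftover iterations. */
    for (; j < innerMax; ++j)
        acc += (input[j] * input[i + j]) >> scaleValue;

    return acc;
}
```

Since innerMax shrinks on every outer iteration, the trip count of the inner loop varies, so the extra trip-count arithmetic and the remainder loop are paid for on each entry to the inner loop.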

The results in the graph show that the unrolled version is often faster, but the net effect across the data set is that unrolling is detrimental to performance. My other benchmark results also show that having this restriction doesn't negatively impact performance, so I think that including the heuristic to prevent unrolling is valid.
F3628212: unrolling.png <https://reviews.llvm.org/F3628212>

cheers,
sam


https://reviews.llvm.org/D34619




