[PATCH] D34619: [ARM] Enable partial and runtime unrolling
Sam Parker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 6 03:07:03 PDT 2017
samparker added a comment.
Hi Eli,
Your comments make sense to me, so I ran an example to figure out if this heuristic was indeed nonsense. Here's the example kernel:
  for (unsigned i = 0; i < max; ++i) {
    acc = 0;
    innerMax = dataSize - i;
    for (unsigned j = 0; j < innerMax; ++j) {
      acc += (input[j] * input[i+j]) >> scaleValue;
    }
  }
The results in the graph show that the unrolled version is often faster, but the net effect across the data set is that unrolling is detrimental to performance. My other benchmark results also show that having this restriction doesn't negatively impact performance, so I think that including the heuristic to prevent unrolling is valid.
F3628212: unrolling.png <https://reviews.llvm.org/F3628212>
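For anyone who wants to try the kernel locally, here is a minimal self-contained sketch of it, next to a hand-unrolled-by-2 variant of the inner loop. The element type (int32_t), the out array, the function names and the unroll factor are all assumptions made for illustration; the hand-written variant only mimics the shape of a runtime-unrolled loop (main body plus remainder loop) and is not the code LLVM actually emits. The inner trip count depends on i, so it is not a compile-time constant and the remainder loop is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Rolled form of the kernel. Types, names and the out[] store are assumed. */
void kernel_rolled(const int32_t *input, int32_t *out, unsigned dataSize,
                   unsigned max, unsigned scaleValue) {
  assert(max <= dataSize); /* otherwise dataSize - i would wrap around */
  for (unsigned i = 0; i < max; ++i) {
    int32_t acc = 0;
    unsigned innerMax = dataSize - i;
    for (unsigned j = 0; j < innerMax; ++j)
      acc += (input[j] * input[i + j]) >> scaleValue;
    out[i] = acc;
  }
}

/* Same kernel with the inner loop unrolled by 2 by hand (illustrative only). */
void kernel_unrolled2(const int32_t *input, int32_t *out, unsigned dataSize,
                      unsigned max, unsigned scaleValue) {
  assert(max <= dataSize);
  for (unsigned i = 0; i < max; ++i) {
    int32_t acc = 0;
    unsigned innerMax = dataSize - i;
    unsigned j = 0;
    /* Main unrolled body: two iterations per trip. */
    for (; j + 1 < innerMax; j += 2) {
      acc += (input[j] * input[i + j]) >> scaleValue;
      acc += (input[j + 1] * input[i + j + 1]) >> scaleValue;
    }
    /* Remainder loop: runs at most once, when innerMax is odd. */
    for (; j < innerMax; ++j)
      acc += (input[j] * input[i + j]) >> scaleValue;
    out[i] = acc;
  }
}
```

Both functions perform the same operations in the same order, so they should produce identical output; only the loop structure differs, which is what makes them usable as a timing comparison.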
cheers,
sam
https://reviews.llvm.org/D34619