[PATCH][LoopVectorizer] Restrict the unroll factor of reductions in loops
James Molloy
james.molloy at arm.com
Fri Aug 8 07:37:38 PDT 2014
Hi Arnold,
Attached are two patches. The first ups the maximum unroll factor on AArch64
from 2 to 4, for Cortex-A57 only at the moment as that's all I've got data for.
This gives us significant wins - ~14% on 462.libquantum at least.
However it also causes some regressions. The second patch makes the loop
vectorizer a bit more conservative with its unroll factor. The problem is
purely for reductions within loops. The regressions I've seen are in loops
with small trip counts (known only at run time) inside a loop nest. A loop unroll factor of
2 is fine, but above 2 the reduction variable fixup logic after the loop
increases the critical path length and resource usage. For most loops this
isn't a problem, but small loops in a larger loop nest will execute this
fixup code many times.
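To make the fixup cost concrete, here is a hand-written sketch of the loop shape described above. It is an illustration only, not the vectorizer's actual output: the function name, the row-major layout, and the manual 4-way unrolling are all assumptions made for the example.

```cpp
#include <cstddef>

// Hypothetical example: sums a rows x cols row-major matrix. The inner
// reduction is unrolled by 4 by hand, using four independent accumulators.
float sumRowsUnrolledBy4(const float *a, std::size_t rows, std::size_t cols) {
  float total = 0.0f;
  for (std::size_t r = 0; r < rows; ++r) {
    const float *row = a + r * cols;
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t c = 0;
    // Unrolled body: four partial sums break the single dependency chain.
    for (; c + 4 <= cols; c += 4) {
      s0 += row[c];
      s1 += row[c + 1];
      s2 += row[c + 2];
      s3 += row[c + 3];
    }
    // Reduction fixup: three extra adds on every exit from the inner loop
    // (two of them dependent when combined as a tree, as here). With an
    // unroll factor of 2 this would be a single add.
    float s = (s0 + s1) + (s2 + s3);
    // Scalar remainder for trip counts that are not a multiple of 4.
    for (; c < cols; ++c)
      s += row[c];
    total += s;
  }
  return total;
}
```

When cols is small, the fixup and remainder code make up a large share of the work done per outer iteration, and the outer loop runs them once per row.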
The heuristic is: if this is a (scalar) reduction, and the loop is nested,
clamp the UF to a maximum of 2. With 2, we still get wins but we only add
one fadd/fmul to the critical path.
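The heuristic can be sketched roughly as below. This is not the code in the attached patch: the struct, its fields, and the function name are hypothetical, and treating "nested" as "loop depth greater than 1" is an assumption.

```cpp
#include <algorithm>

// Hypothetical summary of the properties the heuristic looks at; these are
// not the names used in the loop vectorizer.
struct LoopSummary {
  bool HasScalarReduction; // loop carries a reduction and is left scalar
  unsigned Depth;          // 1 = outermost loop, >1 = nested (assumed meaning)
};

// Clamp the chosen unroll factor to 2 for scalar reductions in nested loops;
// leave every other loop's unroll factor untouched.
unsigned clampUnrollFactor(unsigned UF, const LoopSummary &L) {
  const unsigned MaxNestedScalarReductionUF = 2;
  if (L.HasScalarReduction && L.Depth > 1)
    return std::min(UF, MaxNestedScalarReductionUF);
  return UF;
}
```

For example, a requested unroll factor of 4 would become 2 for a reduction loop at depth 2, and stay 4 for the same loop at depth 1 or for a nested loop without a reduction.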
Please take a look.
Cheers,
James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: up-max-unroll.diff
Type: application/octet-stream
Size: 679 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140808/40b66ed1/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: limit-scalar-reductions.diff
Type: application/octet-stream
Size: 1126 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140808/40b66ed1/attachment-0001.obj>