[PATCH] [Runtime Unrolling] use a loop to simplify the runtime unrolling prologue.
Kevin Qin
kevinqindev at gmail.com
Tue Sep 2 02:56:03 PDT 2014
Runtime unrolling creates a prologue to execute the extra iterations, i.e. the ones left over when the trip count is not divisible by the unroll factor. Currently it generates an if-then-else sequence that jumps into a loop body unrolled (factor - 1) times, like
extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop:
if (extraiters == loopfactor-1) jump L1
if (extraiters == loopfactor-2) jump L2
...
L1: LoopBody;
L2: LoopBody;
...
if tripcount < loopfactor jump End
Loop:
...
End:
This means that with an unroll factor of 4, the loop body is copied 7 times: 3 copies in the loop prologue and 4 in the unrolled loop.
This patch instead uses a loop to execute the extra iterations in the prologue, like
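As a rough source-level illustration of the scheme above (not the IR the pass actually emits), here is a sketch for an unroll factor of 4. The function name sum_chain_prologue, the constant kFactor, and the summation loop body are all hypothetical, and the branch chain is approximated with a fall-through switch (C++17 for [[fallthrough]]).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sum the elements of v with the loop runtime-unrolled
// by 4, using the straight-line (if-then-else style) prologue.
long sum_chain_prologue(const std::vector<long>& v) {
    constexpr std::size_t kFactor = 4;       // assumed unroll factor
    const std::size_t tripcount = v.size();
    std::size_t i = 0;
    long sum = 0;

    const std::size_t extraiters = tripcount % kFactor;

    // Prologue: kFactor - 1 = 3 copies of the body; the entry point is
    // chosen so that exactly extraiters copies run.
    switch (extraiters) {
        case 3: sum += v[i++]; [[fallthrough]];  // first body copy
        case 2: sum += v[i++]; [[fallthrough]];  // second body copy
        case 1: sum += v[i++]; break;            // third body copy
        default: break;                          // extraiters == 0
    }
    if (tripcount < kFactor) return sum;         // jump End

    // Unrolled loop: kFactor = 4 further copies of the body.
    for (; i < tripcount; i += kFactor) {
        sum += v[i];
        sum += v[i + 1];
        sum += v[i + 2];
        sum += v[i + 3];
    }
    return sum;
}
```

Counting the statements that touch sum gives the 3 + 4 = 7 body copies mentioned above.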
extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop:
else jump Prol
Prol: LoopBody;
extraiters -= 1 // Omitted if unroll factor is 2.
if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
if (tripcount < loopfactor) jump End
Loop:
...
End:
With an unroll factor of 4, the loop body is then copied only 5 times: once in the prologue loop and 4 times in the unrolled loop. If the unroll factor is 2, no new loop is created, so the result is the same as with the original scheme.
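For illustration only, the prologue-loop form can be sketched at source level as follows, again for an unroll factor of 4. The function name sum_loop_prologue, the constant kFactor, and the summation loop body are hypothetical and do not come from the patch itself.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sum the elements of v with the loop runtime-unrolled
// by 4, using a small loop as the prologue.
long sum_loop_prologue(const std::vector<long>& v) {
    constexpr std::size_t kFactor = 4;       // assumed unroll factor
    const std::size_t tripcount = v.size();
    std::size_t i = 0;
    long sum = 0;

    std::size_t extraiters = tripcount % kFactor;
    if (extraiters != 0) {
        // Prol: a single copy of the body, executed extraiters times.
        do {
            sum += v[i++];                   // LoopBody
            extraiters -= 1;
        } while (extraiters != 0);
        if (tripcount < kFactor) return sum; // jump End
    }

    // Unrolled loop: kFactor = 4 copies of the body.
    for (; i < tripcount; i += kFactor) {
        sum += v[i];
        sum += v[i + 1];
        sum += v[i + 2];
        sum += v[i + 3];
    }
    return sum;
}
```

Here the body appears once in the prologue and four times in the main loop, matching the 1 + 4 = 5 copies described above.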
On the AArch64 target, with runtime unrolling enabled, applying this patch reduces code size by 10%.
The if-then-else sequence is also eliminated, which could bring a very slight performance benefit, less than 0.1% on the X86 and AArch64 targets.
So overall, this patch brings a significant code size improvement and does no harm to performance.
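As a rough cross-check of where the saving comes from, the body-copy counts quoted earlier generalise to (factor - 1) + factor for the if-then-else prologue and 1 + factor for the prologue loop. The helper names below are hypothetical, and this only counts copies of the loop body; it is not a model of the measured code size.

```cpp
// Number of loop-body copies for a given unroll factor (assumed >= 2).

// If-then-else prologue: factor - 1 copies in the prologue, factor in the loop.
constexpr unsigned copies_chain_prologue(unsigned factor) {
    return (factor - 1) + factor;
}

// Prologue loop: 1 copy in the prologue, factor in the loop.
constexpr unsigned copies_loop_prologue(unsigned factor) {
    return 1 + factor;
}

// The cases mentioned in this mail: factor 4 gives 7 vs. 5 copies,
// factor 2 gives 3 copies either way.
static_assert(copies_chain_prologue(4) == 7, "old scheme, factor 4");
static_assert(copies_loop_prologue(4) == 5, "new scheme, factor 4");
static_assert(copies_chain_prologue(2) == copies_loop_prologue(2),
              "no difference at factor 2");
```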
Is it OK to commit?
Thanks,
Kevin
http://reviews.llvm.org/D5147
Files:
lib/Transforms/Utils/LoopUnrollRuntime.cpp
test/Transforms/LoopUnroll/PowerPC/a2-unrolling.ll
test/Transforms/LoopUnroll/runtime-loop.ll
test/Transforms/LoopUnroll/runtime-loop1.ll
test/Transforms/LoopUnroll/runtime-loop2.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D5147.13156.patch
Type: text/x-patch
Size: 17120 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140902/8b2e6a54/attachment.bin>