[PATCH] [Runtime Unrolling] use a loop to simplify the runtime unrolling prologue.

James Molloy james at jamesmolloy.co.uk
Tue Sep 2 05:26:09 PDT 2014


Hi Kevin,

[re-cc'ing list]

I see - this makes sense. Thanks for the explanation.

Cheers,

James


On 2 September 2014 13:09, Kevin Qin <kevinqindev at gmail.com> wrote:

> Hi James,
>
> Yes. For the original solution, switching to a lookup/jump table may
> bring better performance, but it doesn't help to reduce the code size. And
> the performance benefit should not be significant, because most of the
> benefit of runtime unrolling comes from the unrolled loop body, not from
> the prologue. And the unrolled prologue costs an extra 10% in code size
> compared to the rolled one, which is not a good deal in my eyes.
>
>
> Regards,
> Kevin
>
>
> 2014-09-02 12:48 GMT+01:00 James Molloy <james at jamesmolloy.co.uk>:
>
>> Hi Kevin,
>>
>> The "obvious" (to me at least) prologue would be to use something similar
>> to Duff's Device:
>>
>> extraiters = tripcount % loopfactor
>> switch (extraiters) {
>> case 0: jump Loop
>> case 1: jump L3
>> case 2: jump L2
>> case 3: jump L1
>> }
>>
>> Loop:
>>   tripcount --;
>>   LoopBody
>> L1:
>>   tripcount --;
>>   LoopBody
>> L2:
>>   tripcount --;
>>   LoopBody
>> L3:
>>   tripcount --;
>>   LoopBody
>>
>>   if (tripcount > 0) jump Loop else jump Out
>>
>> Out:
>>
>> The switch would be changed into a lookup/jump table. Wouldn't this
>> produce better code too?
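
A minimal C sketch of the switch-based entry suggested above, assuming an unroll factor of 4 and a simple array sum as the loop body. The function name `sum_switch_entry` and the `passes` counter are hypothetical and do not come from the patch; C's switch fall-through stands in for the jump table. This is the classic Duff's Device shape, not the code LLVM emits.

```c
#include <stddef.h>

/* Hypothetical example: sum n ints with the body unrolled 4 times.
   The switch jumps into the middle of the unrolled body so that the
   first pass executes only the n % 4 leftover iterations. */
long sum_switch_entry(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
    size_t passes;

    if (n == 0)
        return 0;                 /* guard: the body below runs at least once */

    passes = (n + 3) / 4;         /* trips through the unrolled body */
    switch (n % 4) {
    case 0: do { total += a[i++];
    case 3:      total += a[i++];
    case 2:      total += a[i++];
    case 1:      total += a[i++];
            } while (--passes > 0);
    }
    return total;
}
```

In this shape the loop body is copied only 4 times and there is no separate prologue, at the cost of an indirect or table-driven jump on entry.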
>>
>> Cheers,
>>
>> James
>>
>>
>> On 2 September 2014 10:56, Kevin Qin <kevinqindev at gmail.com> wrote:
>>
>>> Runtime unrolling creates a prologue to execute the extra iterations
>>> that are left over when the trip count is not divisible by the unroll
>>> factor. It generates an if-then-else sequence that jumps into a loop body
>>> unrolled (factor - 1) times, like
>>>
>>>     extraiters = tripcount % loopfactor
>>>     if (extraiters == 0) jump Loop:
>>>     if (extraiters == loopfactor-1) jump L1
>>>     if (extraiters == loopfactor-2) jump L2
>>>     ...
>>>     L1:  LoopBody;
>>>     L2:  LoopBody;
>>>     ...
>>>     if tripcount < loopfactor jump End
>>>     Loop:
>>>     ...
>>>     End:
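
The if-then-else prologue above can be sketched in C for a fixed unroll factor of 4. This is only an illustration under stated assumptions: the function name `sum_chain_prologue` and the array-sum loop body are hypothetical, and the comparisons are written so that a remainder of 3 enters at the first prologue copy.

```c
#include <stddef.h>

/* Hypothetical example: unroll factor 4, loop body is "total += a[i++]".
   The prologue holds 3 straight-line copies of the body and is entered
   at the copy that leaves exactly n % 4 iterations to run. */
long sum_chain_prologue(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
    size_t extraiters = n % 4;

    if (extraiters == 0) goto Loop;
    if (extraiters == 3) goto L1;
    if (extraiters == 2) goto L2;
    goto L3;                      /* extraiters == 1 */
L1: total += a[i++];
L2: total += a[i++];
L3: total += a[i++];
    if (n < 4) goto End;          /* nothing left for the unrolled loop */
Loop:
    while (i < n) {               /* remaining count is a multiple of 4 */
        total += a[i++];
        total += a[i++];
        total += a[i++];
        total += a[i++];
    }
End:
    return total;
}
```

Counting the statements confirms the cost: 3 copies of the body in the prologue plus 4 in the loop.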
>>>
>>> This means that if the unroll factor is 4, the loop body is copied 7
>>> times: 3 copies in the loop prologue and 4 in the loop.
>>> This patch uses a loop to execute the extra iterations in the prologue,
>>> like
>>>
>>>         extraiters = tripcount % loopfactor
>>>         if (extraiters == 0) jump Loop:
>>>         else jump Prol
>>>  Prol:  LoopBody;
>>>         extraiters -= 1                 // Omitted if unroll factor is 2.
>>>         if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
>>>         if (tripcount < loopfactor) jump End
>>>  Loop:
>>>  ...
>>>  End:
>>>
>>> Then, when the unroll factor is 4, the loop body is copied only 5
>>> times: 1 copy in the prologue loop and 4 in the original loop. And if the
>>> unroll factor is 2, no new loop is created, just as in the original solution.
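
For comparison, a minimal C sketch of the rolled prologue described above, again assuming an unroll factor of 4 and an array sum as the loop body. The function name `sum_rolled_prologue` is hypothetical; the labels mirror the pseudocode rather than anything in LoopUnrollRuntime.cpp.

```c
#include <stddef.h>

/* Hypothetical example: unroll factor 4, loop body is "total += a[i++]".
   The leftover n % 4 iterations run in a small loop with a single copy
   of the body, followed by the 4-times-unrolled main loop. */
long sum_rolled_prologue(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
    size_t extraiters = n % 4;

    if (extraiters == 0) goto Loop;
Prol:
    total += a[i++];              /* the only prologue copy of the body */
    extraiters -= 1;
    if (extraiters != 0) goto Prol;
    if (n < 4) goto End;          /* nothing left for the unrolled loop */
Loop:
    while (i < n) {               /* remaining count is a multiple of 4 */
        total += a[i++];
        total += a[i++];
        total += a[i++];
        total += a[i++];
    }
End:
    return total;
}
```

Here the body appears 5 times in total (1 in the prologue loop, 4 in the main loop), at the cost of a decrement and a compare per leftover iteration.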
>>>
>>> On the AArch64 target, with runtime unrolling enabled, applying this
>>> patch reduces code size by 10%.
>>>
>>> Also, the if-then-else sequence is eliminated, which could bring a
>>> very slight performance benefit, less than 0.1% on the X86 and
>>> AArch64 targets.
>>>
>>> So overall, this patch brings a large code size improvement and
>>> does no harm to performance.
>>>
>>> Is it OK to commit?
>>>
>>> Thanks,
>>> Kevin
>>>
>>> http://reviews.llvm.org/D5147
>>>
>>> Files:
>>>   lib/Transforms/Utils/LoopUnrollRuntime.cpp
>>>   test/Transforms/LoopUnroll/PowerPC/a2-unrolling.ll
>>>   test/Transforms/LoopUnroll/runtime-loop.ll
>>>   test/Transforms/LoopUnroll/runtime-loop1.ll
>>>   test/Transforms/LoopUnroll/runtime-loop2.ll
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>>
>>
>
>
> --
> Best Regards,
>
> Kevin Qin
>