[llvm-dev] Loop Unrolling Fail in Simple Vectorized loop

Alexandre Isoard via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 13 09:30:54 PDT 2016


If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4,
I think.

On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> So, I tried unrolling the following simple loop.
>
> int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){
>
>   for(unsigned i=0; i<count; i++){
>
>     a[i] = b[i]*c[i-1];
>
>   }
>
>   return 0;
>
> }
>
>
> Then, the unroller is able to unroll it by 2 even though it doesn't know
> about the range of count. SCEV of backedge taken count is (-1 + %count)
>
>
> But, when I change the increment to 4, as in
>
>
> int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){
>
>   for(unsigned i=0; i<count; i+=4){
>
>     a[i] = b[i]*c[i-1];
>
>   }
>
>   return 0;
>
> }
>
>
>  The unroller cannot compute the backedge taken count. Therefore, it seems
> like the problem is not with the range of "count", can't the unroller
> compute it as (- 1 + %count / 4)?
>
> On Wed, Oct 12, 2016 at 11:28 PM, Charith Mendis <
> char.mendis1989 at gmail.com> wrote:
>
>> Thanks for the explanation. But I am a little confused with the following
>> fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert
>> the loop to say;
>>
>> for(unsigned i = 0; i < vectorizable_elements  ; i += 2){
>>     //main loop
>> }
>>
>> for(unsigned i=0 ; i < vectorizable_elements % 2; i++){
>>    //fix up
>> }
>>
>> Why does it have to reason about the range of vectorizable_elements? Even
>> if vectorizable_elements == SIZE_MAX the above decomposition would work?
>>
>> On Wed, Oct 12, 2016 at 8:25 PM, Friedman, Eli <efriedma at codeaurora.org>
>> wrote:
>>
>>> On 10/12/2016 4:35 PM, Charith Mendis via llvm-dev wrote:
>>>
>>> Hi all,
>>>
>>> Attached herewith is a simple vectorized function with loops performing
>>> a simple shuffle.
>>>
>>> I want all loops (inner and outer) to be unrolled by 2 and as such used
>>> -unroll-count=2
>>> The inner loops(with k as the induction variable and having constant
>>> trip counts) unroll fully, but the outer loop with (j) fails to unroll.
>>>
>>> The llvm code is also attached with inner loops fully unrolled.
>>>
>>> To inspect further, I added the following to the PassManagerBuilder.cpp
>>> to run some canonicalization routines and redo unrolling again. I have set
>>> partial unrolling on + have a huge threshold + allows expensive loop trip
>>> counts. Still it didn't unroll by 2.
>>>
>>> MPM.add(createLoopUnrollPass());
>>>
>>> MPM.add(createCFGSimplificationPass());
>>>
>>> MPM.add(createLoopSimplifyPass());
>>>
>>> MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
>>>
>>> MPM.add(createLCSSAPass());
>>>
>>> MPM.add(createIndVarSimplifyPass());        // Canonicalize indvars
>>>
>>> MPM.add(createLoopUnrollPass());
>>>
>>>
>>> Digging deeper I found, that it fails in UnrollRuntimeLoopRemainder
>>> function, where it is unable to calculate the BackEdge taken amount.
>>>
>>> Can anybody explain what is need to get the outer loop unrolled by 2? It
>>> would be a great help.
>>>
>>>
>>> Well, I can at least explain what is happening... runtime unrolling
>>> needs to be able to symbolically compute the trip count to avoid inserting
>>> a branch after every iteration.  SCEV isn't able to prove that your loop
>>> isn't an infinite loop (consider the case of vectorizable_elements==SIZE_MAX),
>>> therefore it can't compute the trip count.  Therefore, we don't unroll.
>>>
>>> There's a few different angles you could use to attack this: you could
>>> teach the unroller to unroll loops with an uncomputable trip count, or you
>>> can make the trip count of your loop computable somehow.  Changing the
>>> unroller is probably straightforward (see the recently committed r284044).
>>> Making the trip count computable is more complicated... it's probably
>>> possible to teach SCEV to reason about the overflow in the pointer
>>> computation, or maybe you could version the loop.
>>>
>>> -Eli
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>
>>>
>>
>>
>> --
>> Kind regards,
>> Charith Mendis
>>
>> Graduate Student,
>> CSAIL,
>> Massachusetts Institute of Technology
>>
>
>
>
> --
> Kind regards,
> Charith Mendis
>
> Graduate Student,
> CSAIL,
> Massachusetts Institute of Technology
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>


-- 
*Alexandre Isoard*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161013/355c93cc/attachment.html>


More information about the llvm-dev mailing list