[llvm] r193015 - SCEV should use NSW to get trip count for positive nonunit stride loops.

Fri Oct 18 17:37:15 PDT 2013

Sorry, I forgot to update the PowerPC tests. This should be fixed in r193021.
-Andy

On Oct 18, 2013, at 5:35 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:

> Hi Andy,
> 
> This looks like it broke the bot:
> 
> http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-nobootstrap-RAincremental
> 
> On Oct 18, 2013, at 4:43 PM, Andrew Trick <atrick at apple.com> wrote:
> 
>> Author: atrick
>> Date: Fri Oct 18 18:43:53 2013
>> New Revision: 193015
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=193015&view=rev
>> Log:
>> SCEV should use NSW to get trip count for positive nonunit stride loops.
>> 
>> SCEV currently fails to compute loop counts for nonunit stride
>> loops. This comes up frequently. It prevents loop optimization and
>> forces vectorization to insert extra loop checks.
>> 
>> For example:
>> void foo(int n, int *x) {
>> for (int i = 0; i < n; i += 3) {
>>  x[i] = i;
>>  x[i+1] = i+1;
>>  x[i+2] = i+2;
>> }
>> }
>> 
>> We need to properly handle the case in which limit > INT_MAX-stride. In
>> the above case: n > INT_MAX-3. In this case the loop counter will step
>> beyond the limit and overflow at the same time. However, knowing that
>> signed integer overflow is undefined, we can assume the loop test
>> behavior is arbitrary after overflow. This obeys both C undefined
>> behavior rules, and the more strict LLVM poison value rules.
>> 
>> I'm finally fixing this in response to Hal Finkel's persistence.
>> The most probable reason that we never optimized this before is that
>> we were being careful to handle the case where the developer expected a
>> side-effect free infinite loop relying on overflow:
>> 
>> for (int i = 0; i < n; i += s) {
>> ++j;
>> }
>> return j;
>> 
>> If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
>> expect an infinite loop. However, there are plenty of ways to achieve
>> this effect without relying on undefined behavior of signed overflow.
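
As an illustration of the overflow condition described in the log above, here is a minimal C sketch. The helper name step_may_overflow is hypothetical and not part of LLVM. It assumes a loop of the form `for (i = start; i < limit; i += stride)` with a positive stride, and it mirrors the shape of the signed check in the patch (Max - (Step - 1) < RHS) rather than reproducing SCEV's range analysis.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not an LLVM API): returns true when a signed loop
 * counter with the given positive stride could step past `limit` and
 * overflow INT_MAX in the same step.
 *
 * The last value that passes the test `i < limit` is at most limit - 1,
 * so the next increment overflows when limit - 1 > INT_MAX - stride,
 * i.e. when limit > INT_MAX - (stride - 1). */
static bool step_may_overflow(int limit, int stride) {
    assert(stride >= 1);               /* only positive strides are modeled */
    return limit > INT_MAX - (stride - 1);
}
```

With stride 3, a limit of INT_MAX - 2 is still safe (the counter can reach INT_MAX exactly), while INT_MAX - 1 and INT_MAX are not. When the add carries nsw, the patch skips this check, because reaching the overflowing step would already be undefined behavior.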
>> 
>> Modified:
>>   llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>>   llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
>> 
>> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=193015&r1=193014&r2=193015&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
>> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Fri Oct 18 18:43:53 2013
>> @@ -6398,13 +6398,6 @@ ScalarEvolution::HowManyLessThans(const
>>  if (!AddRec || AddRec->getLoop() != L)
>>    return getCouldNotCompute();
>> 
>> -  // Check to see if we have a flag which makes analysis easy.
>> -  bool NoWrap = false;
>> -  if (!IsSubExpr) {
>> -    NoWrap = AddRec->getNoWrapFlags(
>> -      (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
>> -                          | SCEV::FlagNW));
>> -  }
>>  if (AddRec->isAffine()) {
>>    unsigned BitWidth = getTypeSizeInBits(AddRec->getType());
>>    const SCEV *Step = AddRec->getStepRecurrence(*this);
>> @@ -6414,20 +6407,21 @@ ScalarEvolution::HowManyLessThans(const
>>    if (Step->isOne()) {
>>      // With unit stride, the iteration never steps past the limit value.
>>    } else if (isKnownPositive(Step)) {
>> -      // Test whether a positive iteration can step past the limit
>> -      // value and past the maximum value for its type in a single step.
>> -      // Note that it's not sufficient to check NoWrap here, because even
>> -      // though the value after a wrap is undefined, it's not undefined
>> -      // behavior, so if wrap does occur, the loop could either terminate or
>> -      // loop infinitely, but in either case, the loop is guaranteed to
>> -      // iterate at least until the iteration where the wrapping occurs.
>> +      // Test whether a positive iteration can step past the limit value and
>> +      // past the maximum value for its type in a single step. The NSW/NUW flags
>> +      // can imply that stepping past RHS would immediately result in undefined
>> +      // behavior. No self-wrap is not useful here because the loop counter may
>> +      // signed or unsigned wrap but continue iterating and terminate with
>> +      // defined behavior without ever self-wrapping.
>>      const SCEV *One = getConstant(Step->getType(), 1);
>>      if (isSigned) {
>> -        APInt Max = APInt::getSignedMaxValue(BitWidth);
>> -        if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
>> +        if (!AddRec->getNoWrapFlags(SCEV::FlagNSW)) {
>> +          APInt Max = APInt::getSignedMaxValue(BitWidth);
>> +          if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
>>              .slt(getSignedRange(RHS).getSignedMax()))
>> -          return getCouldNotCompute();
>> -      } else {
>> +            return getCouldNotCompute();
>> +        }
>> +      } else if (!AddRec->getNoWrapFlags(SCEV::FlagNUW)){
>>        APInt Max = APInt::getMaxValue(BitWidth);
>>        if ((Max - getUnsignedRange(getMinusSCEV(Step, One)).getUnsignedMax())
>>              .ult(getUnsignedRange(RHS).getUnsignedMax()))
>> @@ -6481,6 +6475,15 @@ ScalarEvolution::HowManyLessThans(const
>>                  getMinusSCEV(getConstant(APInt::getMaxValue(BitWidth)),
>>                               StepMinusOne));
>> 
>> +    // If the loop counter does not self-wrap, then the trip count may be
>> +    // computed by dividing the distance by the step. This is independent of
>> +    // signed or unsigned wrap.
>> +    bool NoWrap = false;
>> +    if (!IsSubExpr) {
>> +      NoWrap = AddRec->getNoWrapFlags(
>> +        (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
>> +                            | SCEV::FlagNW));
>> +    }
>>    // Finally, we subtract these two values and divide, rounding up, to get
>>    // the number of times the backedge is executed.
>>    const SCEV *BECount = getBECount(Start, End, Step, NoWrap);
>> 
>> Modified: llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll?rev=193015&r1=193014&r2=193015&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll (original)
>> +++ llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll Fri Oct 18 18:43:53 2013
>> @@ -25,8 +25,8 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @step2(i4 %n) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -57,8 +57,8 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @start1_step2(i4 %n) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -89,8 +89,8 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @startx_step2(i4 %n, i4 %x) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -120,12 +120,18 @@ exit:
>>  ret void
>> }
>> 
>> -; Be careful with this one. If %n is INT4_MAX, %i.next will wrap. The nsw bit
>> -; says that the result is undefined, but ScalarEvolution must respect that
>> -; subsequent passes may result the undefined behavior in predictable ways.
>> +; If %n is INT4_MAX, %i.next will wrap. The nsw bit says that the
>> +; result is undefined. Therefore, after the loop's second iteration,
>> +; we are free to assume that the loop exits. This is valid because:
>> +; (a) %i.next is a poison value after the second iteration, which can
>> +; also be considered an undef value.
>> +; (b) the return instruction enacts a side effect that is control
>> +; dependent on the poison value.
>> +;
>> +; CHECK-LABEL: nsw_step2
>> ; CHECK: Determining loop execution counts for: @nsw_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: backedge-taken count is ((-1 + %n) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @nsw_step2(i4 %n) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -139,6 +145,7 @@ exit:
>>  ret void
>> }
>> 
>> +; CHECK-LABEL: nsw_start1
>> ; CHECK: Determining loop execution counts for: @nsw_start1
>> ; CHECK: Loop %loop: backedge-taken count is (-2 + (2 smax %n))
>> ; CHECK: Loop %loop: max backedge-taken count is 5
>> @@ -156,8 +163,8 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @nsw_start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: backedge-taken count is ((-2 + (3 smax %n)) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @nsw_start1_step2(i4 %n) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -188,8 +195,8 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @nsw_startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
>> +; CHECK: Loop %loop: backedge-taken count is ((-1 + (-1 * %x) + ((2 + %x) smax %n)) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 7
>> define void @nsw_startx_step2(i4 %n, i4 %x) {
>> entry:
>>  %s = icmp sgt i4 %n, 0
>> @@ -221,7 +228,7 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @even_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @even_step2(i4 %n) {
>> entry:
>> @@ -255,7 +262,7 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @even_start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @even_start1_step2(i4 %n) {
>> entry:
>> @@ -289,7 +296,7 @@ exit:
>> }
>> 
>> ; CHECK: Determining loop execution counts for: @even_startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 7
>> define void @even_startx_step2(i4 %n, i4 %x) {
>> entry:
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
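
The patch comment above says that, once the counter is known not to wrap, the trip count may be computed by dividing the distance by the step, rounding up. A rough C sketch of that arithmetic follows. The function trip_count is a hypothetical name used only for illustration; it counts loop body executions for `for (i = start; i < limit; i += stride)`, which is one more than the backedge-taken count that SCEV reports, and it is not how getBECount is implemented.

```c
#include <assert.h>

/* Hypothetical helper (not an LLVM API): number of times the body of
 *   for (int i = start; i < limit; i += stride)
 * runs, assuming stride >= 1 and that the counter never overflows. */
static unsigned trip_count(int start, int limit, int stride) {
    assert(stride >= 1);
    if (start >= limit)
        return 0;                      /* loop body never runs */
    /* Do the subtraction in unsigned arithmetic so that a large distance
     * (for example negative start, large limit) cannot overflow. */
    unsigned distance = (unsigned)limit - (unsigned)start;
    unsigned step = (unsigned)stride;
    /* Divide rounding up, without computing distance + step - 1,
     * which could itself wrap. */
    return distance / step + (distance % step != 0);
}
```

For the foo example from the log, trip_count(0, n, 3) gives 4 for n = 10 (i takes the values 0, 3, 6, 9) and 3 for n = 9.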