[llvm] r193015 - SCEV should use NSW to get trip count for positive nonunit stride loops.

Fri Oct 18 17:35:58 PDT 2013

Hi Andy,

This looks like it broke the bot:

http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-nobootstrap-RAincremental

On Oct 18, 2013, at 4:43 PM, Andrew Trick <atrick at apple.com> wrote:

> Author: atrick
> Date: Fri Oct 18 18:43:53 2013
> New Revision: 193015
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=193015&view=rev
> Log:
> SCEV should use NSW to get trip count for positive nonunit stride loops.
> 
> SCEV currently fails to compute loop counts for nonunit stride
> loops. This comes up frequently. It prevents loop optimization and
> forces vectorization to insert extra loop checks.
> 
> For example:
> void foo(int n, int *x) {
> for (int i = 0; i < n; i += 3) {
>   x[i] = i;
>   x[i+1] = i+1;
>   x[i+2] = i+2;
> }
> }
> 
> We need to properly handle the case in which limit > INT_MAX-stride. In
> the above case: n > INT_MAX-3. In this case the loop counter will step
> beyond the limit and overflow at the same time. However, knowing that
> signed integer overlow in undefined, we can assume the loop test
> behavior is arbitrary after overflow. This obeys both C undefined
> behavior rules, and the more strict LLVM poison value rules.
> 
> I'm finally fixing this in response to Hal Finkel's persistence.
> The most probable reason that we never optimized this before is that
> we were being careful to handle case where the developer expected a
> side-effect free infinite loop relying on overflow:
> 
> for (int i = 0; i < n; i += s) {
>  ++j;
> }
> return j;
> 
> If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
> expect an infinite loop. However there are plenty of ways to achieve
> this effect without relying on undefined behavior of signed overflow.
> 
> Modified:
>    llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>    llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
> 
> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=193015&r1=193014&r2=193015&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Fri Oct 18 18:43:53 2013
> @@ -6398,13 +6398,6 @@ ScalarEvolution::HowManyLessThans(const
>   if (!AddRec || AddRec->getLoop() != L)
>     return getCouldNotCompute();
> 
> -  // Check to see if we have a flag which makes analysis easy.
> -  bool NoWrap = false;
> -  if (!IsSubExpr) {
> -    NoWrap = AddRec->getNoWrapFlags(
> -      (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
> -                          | SCEV::FlagNW));
> -  }
>   if (AddRec->isAffine()) {
>     unsigned BitWidth = getTypeSizeInBits(AddRec->getType());
>     const SCEV *Step = AddRec->getStepRecurrence(*this);
> @@ -6414,20 +6407,21 @@ ScalarEvolution::HowManyLessThans(const
>     if (Step->isOne()) {
>       // With unit stride, the iteration never steps past the limit value.
>     } else if (isKnownPositive(Step)) {
> -      // Test whether a positive iteration can step past the limit
> -      // value and past the maximum value for its type in a single step.
> -      // Note that it's not sufficient to check NoWrap here, because even
> -      // though the value after a wrap is undefined, it's not undefined
> -      // behavior, so if wrap does occur, the loop could either terminate or
> -      // loop infinitely, but in either case, the loop is guaranteed to
> -      // iterate at least until the iteration where the wrapping occurs.
> +      // Test whether a positive iteration can step past the limit value and
> +      // past the maximum value for its type in a single step. The NSW/NUW flags
> +      // can imply that stepping past RHS would immediately result in undefined
> +      // behavior. No self-wrap is not useful here because the loop counter may
> +      // signed or unsigned wrap but continue iterating and terminate with
> +      // defined behavior without ever self-wrapping.
>       const SCEV *One = getConstant(Step->getType(), 1);
>       if (isSigned) {
> -        APInt Max = APInt::getSignedMaxValue(BitWidth);
> -        if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
> +        if (!AddRec->getNoWrapFlags(SCEV::FlagNSW)) {
> +          APInt Max = APInt::getSignedMaxValue(BitWidth);
> +          if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
>               .slt(getSignedRange(RHS).getSignedMax()))
> -          return getCouldNotCompute();
> -      } else {
> +            return getCouldNotCompute();
> +        }
> +      } else if (!AddRec->getNoWrapFlags(SCEV::FlagNUW)){
>         APInt Max = APInt::getMaxValue(BitWidth);
>         if ((Max - getUnsignedRange(getMinusSCEV(Step, One)).getUnsignedMax())
>               .ult(getUnsignedRange(RHS).getUnsignedMax()))
> @@ -6481,6 +6475,15 @@ ScalarEvolution::HowManyLessThans(const
>                   getMinusSCEV(getConstant(APInt::getMaxValue(BitWidth)),
>                                StepMinusOne));
> 
> +    // If the loop counter does not self-wrap, then the trip count may be
> +    // computed by dividing the distance by the step. This is independent of
> +    // signed or unsigned wrap.
> +    bool NoWrap = false;
> +    if (!IsSubExpr) {
> +      NoWrap = AddRec->getNoWrapFlags(
> +        (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
> +                            | SCEV::FlagNW));
> +    }
>     // Finally, we subtract these two values and divide, rounding up, to get
>     // the number of times the backedge is executed.
>     const SCEV *BECount = getBECount(Start, End, Step, NoWrap);
> 
> Modified: llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll?rev=193015&r1=193014&r2=193015&view=diff
> ==============================================================================
> --- llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll (original)
> +++ llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll Fri Oct 18 18:43:53 2013
> @@ -25,8 +25,8 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
> define void @step2(i4 %n) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -57,8 +57,8 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @start1_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
> define void @start1_step2(i4 %n) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -89,8 +89,8 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @startx_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
> define void @startx_step2(i4 %n, i4 %x) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -120,12 +120,18 @@ exit:
>   ret void
> }
> 
> -; Be careful with this one. If %n is INT4_MAX, %i.next will wrap. The nsw bit
> -; says that the result is undefined, but ScalarEvolution must respect that
> -; subsequent passes may result the undefined behavior in predictable ways.
> +; If %n is INT4_MAX, %i.next will wrap. The nsw bit says that the
> +; result is undefined. Therefore, after the loop's second iteration,
> +; we are free to assume that the loop exits. This is valid because:
> +; (a) %i.next is a poison value after the second iteration, which can
> +; also be considered an undef value.
> +; (b) the return instruction enacts a side effect that is control
> +; dependent on the poison value.
> +;
> +; CHECK-LABEL: nsw_step2
> ; CHECK: Determining loop execution counts for: @nsw_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: backedge-taken count is ((-1 + %n) /u 2)
> +; CHECK: Loop %loop: max backedge-taken count is 2
> define void @nsw_step2(i4 %n) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -139,6 +145,7 @@ exit:
>   ret void
> }
> 
> +; CHECK-LABEL: nsw_start1
> ; CHECK: Determining loop execution counts for: @nsw_start1
> ; CHECK: Loop %loop: backedge-taken count is (-2 + (2 smax %n))
> ; CHECK: Loop %loop: max backedge-taken count is 5
> @@ -156,8 +163,8 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @nsw_start1_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: backedge-taken count is ((-2 + (3 smax %n)) /u 2)
> +; CHECK: Loop %loop: max backedge-taken count is 2
> define void @nsw_start1_step2(i4 %n) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -188,8 +195,8 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @nsw_startx_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> -; CHECK: Loop %loop: Unpredictable max backedge-taken count. 
> +; CHECK: Loop %loop: backedge-taken count is ((-1 + (-1 * %x) + ((2 + %x) smax %n)) /u 2)
> +; CHECK: Loop %loop: max backedge-taken count is 7
> define void @nsw_startx_step2(i4 %n, i4 %x) {
> entry:
>   %s = icmp sgt i4 %n, 0
> @@ -221,7 +228,7 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @even_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> ; CHECK: Loop %loop: max backedge-taken count is 2
> define void @even_step2(i4 %n) {
> entry:
> @@ -255,7 +262,7 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @even_start1_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> ; CHECK: Loop %loop: max backedge-taken count is 2
> define void @even_start1_step2(i4 %n) {
> entry:
> @@ -289,7 +296,7 @@ exit:
> }
> 
> ; CHECK: Determining loop execution counts for: @even_startx_step2
> -; CHECK: Loop %loop: Unpredictable backedge-taken count. 
> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
> ; CHECK: Loop %loop: max backedge-taken count is 7
> define void @even_startx_step2(i4 %n, i4 %x) {
> entry:
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits