[llvm] r193015 - SCEV should use NSW to get trip count for positive nonunit stride loops.
Andrew Trick
atrick at apple.com
Fri Oct 18 17:37:15 PDT 2013
Sorry, I forgot to update the PowerPC tests. This should be fixed in r193021.
-Andy
On Oct 18, 2013, at 5:35 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
> Hi Andy,
>
> This looks like it broke the bot:
>
> http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-nobootstrap-RAincremental
>
> On Oct 18, 2013, at 4:43 PM, Andrew Trick <atrick at apple.com> wrote:
>
>> Author: atrick
>> Date: Fri Oct 18 18:43:53 2013
>> New Revision: 193015
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=193015&view=rev
>> Log:
>> SCEV should use NSW to get trip count for positive nonunit stride loops.
>>
>> SCEV currently fails to compute loop counts for nonunit stride
>> loops. This comes up frequently. It prevents loop optimization and
>> forces vectorization to insert extra loop checks.
>>
>> For example:
>> void foo(int n, int *x) {
>> for (int i = 0; i < n; i += 3) {
>> x[i] = i;
>> x[i+1] = i+1;
>> x[i+2] = i+2;
>> }
>> }
>>
>> We need to properly handle the case in which limit > INT_MAX-stride. In
>> the above case: n > INT_MAX-3. In this case the loop counter will step
>> beyond the limit and overflow at the same time. However, knowing that
>> signed integer overflow is undefined, we can assume the loop test
>> behavior is arbitrary after overflow. This obeys both C undefined
>> behavior rules, and the more strict LLVM poison value rules.
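
As a rough illustration of the condition described above, here is a minimal C sketch. The helper names are hypothetical and are not part of ScalarEvolution. The first function mirrors the shape of the signed check in the diff (Max - (Step - 1) < RHS), which is slightly tighter than the "limit > INT_MAX-stride" wording in the log. The second computes the trip count as a rounded-up division, assuming a positive stride and a counter that never overflows.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: can a counter with a positive `stride`, tested as
   `i < limit`, step past `limit` and past INT_MAX in a single increment?
   Assumes stride > 0, so INT_MAX - (stride - 1) cannot itself overflow. */
static bool may_step_past_int_max(int limit, int stride) {
    return INT_MAX - (stride - 1) < limit;
}

/* Hypothetical helper: trip count of
       for (int i = start; i < limit; i += stride)
   assuming stride > 0 and that the counter never overflows.
   The arithmetic is done in long long to keep the sketch itself
   free of signed overflow. */
static long long trip_count(int start, int limit, int stride) {
    if (start >= limit)
        return 0;
    long long distance = (long long)limit - start;
    return (distance + stride - 1) / stride; /* divide, rounding up */
}
```

For foo with n = 10, trip_count(0, 10, 3) gives 4 (i = 0, 3, 6, 9), and may_step_past_int_max(10, 3) is false, so no overflow reasoning is needed. Only when the first helper returns true does the result depend on treating signed overflow as undefined.
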
>>
>> I'm finally fixing this in response to Hal Finkel's persistence.
>> The most probable reason that we never optimized this before is that
>> we were being careful to handle the case where the developer expected a
>> side-effect free infinite loop relying on overflow:
>>
>> for (int i = 0; i < n; i += s) {
>> ++j;
>> }
>> return j;
>>
>> If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
>> expect an infinite loop. However there are plenty of ways to achieve
>> this effect without relying on undefined behavior of signed overflow.
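
To make the infinite-loop scenario concrete, the following sketch simulates the loop with an 8-bit counter that wraps in two's complement, so "INT_MAX" is 127 here. This is an illustration under that assumption only: the function name is hypothetical, and real C code gets no such wrapping guarantee for signed types. Because the counter has only 256 possible states and the step is fixed, a loop that has not exited after 256 iterations never will.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulation of
       for (i = 0; i < n; i += s)
   with an 8-bit counter that wraps around instead of overflowing.
   Returns true if the loop eventually exits, false if it runs forever. */
static bool terminates_with_wrap8(int n, unsigned s) {
    uint8_t bits = 0; /* raw counter bits; unsigned arithmetic wraps */
    for (int iter = 0; iter < 256; ++iter) {
        /* Reinterpret the raw bits as a signed 8-bit value. */
        int i = bits < 128 ? (int)bits : (int)bits - 256;
        if (!(i < n))
            return true; /* loop test failed: the loop exits */
        bits = (uint8_t)(bits + s);
    }
    return false; /* every counter state revisited without exiting */
}
```

With n = 127 and s = 4, 128 is a multiple of s and n > 127 - 4, so the counter only ever visits multiples of 4, all below 127, and the call returns false. With s = 3 the counter eventually lands on 127 and the call returns true.
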
>>
>> Modified:
>> llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>> llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
>>
>> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=193015&r1=193014&r2=193015&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
>> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Fri Oct 18 18:43:53 2013
>> @@ -6398,13 +6398,6 @@ ScalarEvolution::HowManyLessThans(const
>> if (!AddRec || AddRec->getLoop() != L)
>> return getCouldNotCompute();
>>
>> - // Check to see if we have a flag which makes analysis easy.
>> - bool NoWrap = false;
>> - if (!IsSubExpr) {
>> - NoWrap = AddRec->getNoWrapFlags(
>> - (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
>> - | SCEV::FlagNW));
>> - }
>> if (AddRec->isAffine()) {
>> unsigned BitWidth = getTypeSizeInBits(AddRec->getType());
>> const SCEV *Step = AddRec->getStepRecurrence(*this);
>> @@ -6414,20 +6407,21 @@ ScalarEvolution::HowManyLessThans(const
>> if (Step->isOne()) {
>> // With unit stride, the iteration never steps past the limit value.
>> } else if (isKnownPositive(Step)) {
>> - // Test whether a positive iteration can step past the limit
>> - // value and past the maximum value for its type in a single step.
>> - // Note that it's not sufficient to check NoWrap here, because even
>> - // though the value after a wrap is undefined, it's not undefined
>> - // behavior, so if wrap does occur, the loop could either terminate or
>> - // loop infinitely, but in either case, the loop is guaranteed to
>> - // iterate at least until the iteration where the wrapping occurs.
>> + // Test whether a positive iteration can step past the limit value and
>> + // past the maximum value for its type in a single step. The NSW/NUW flags
>> + // can imply that stepping past RHS would immediately result in undefined
>> + // behavior. No self-wrap is not useful here because the loop counter may
>> + // signed or unsigned wrap but continue iterating and terminate with
>> + // defined behavior without ever self-wrapping.
>> const SCEV *One = getConstant(Step->getType(), 1);
>> if (isSigned) {
>> - APInt Max = APInt::getSignedMaxValue(BitWidth);
>> - if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
>> + if (!AddRec->getNoWrapFlags(SCEV::FlagNSW)) {
>> + APInt Max = APInt::getSignedMaxValue(BitWidth);
>> + if ((Max - getSignedRange(getMinusSCEV(Step, One)).getSignedMax())
>> .slt(getSignedRange(RHS).getSignedMax()))
>> - return getCouldNotCompute();
>> - } else {
>> + return getCouldNotCompute();
>> + }
>> + } else if (!AddRec->getNoWrapFlags(SCEV::FlagNUW)){
>> APInt Max = APInt::getMaxValue(BitWidth);
>> if ((Max - getUnsignedRange(getMinusSCEV(Step, One)).getUnsignedMax())
>> .ult(getUnsignedRange(RHS).getUnsignedMax()))
>> @@ -6481,6 +6475,15 @@ ScalarEvolution::HowManyLessThans(const
>> getMinusSCEV(getConstant(APInt::getMaxValue(BitWidth)),
>> StepMinusOne));
>>
>> + // If the loop counter does not self-wrap, then the trip count may be
>> + // computed by dividing the distance by the step. This is independent of
>> + // signed or unsigned wrap.
>> + bool NoWrap = false;
>> + if (!IsSubExpr) {
>> + NoWrap = AddRec->getNoWrapFlags(
>> + (SCEV::NoWrapFlags)(((isSigned ? SCEV::FlagNSW : SCEV::FlagNUW))
>> + | SCEV::FlagNW));
>> + }
>> // Finally, we subtract these two values and divide, rounding up, to get
>> // the number of times the backedge is executed.
>> const SCEV *BECount = getBECount(Start, End, Step, NoWrap);
>>
>> Modified: llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll?rev=193015&r1=193014&r2=193015&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll (original)
>> +++ llvm/trunk/test/Analysis/ScalarEvolution/trip-count9.ll Fri Oct 18 18:43:53 2013
>> @@ -25,8 +25,8 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @step2(i4 %n) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -57,8 +57,8 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @start1_step2(i4 %n) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -89,8 +89,8 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> define void @startx_step2(i4 %n, i4 %x) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -120,12 +120,18 @@ exit:
>> ret void
>> }
>>
>> -; Be careful with this one. If %n is INT4_MAX, %i.next will wrap. The nsw bit
>> -; says that the result is undefined, but ScalarEvolution must respect that
>> -; subsequent passes may result the undefined behavior in predictable ways.
>> +; If %n is INT4_MAX, %i.next will wrap. The nsw bit says that the
>> +; result is undefined. Therefore, after the loop's second iteration,
>> +; we are free to assume that the loop exits. This is valid because:
>> +; (a) %i.next is a poison value after the second iteration, which can
>> +; also be considered an undef value.
>> +; (b) the return instruction enacts a side effect that is control
>> +; dependent on the poison value.
>> +;
>> +; CHECK-LABEL: nsw_step2
>> ; CHECK: Determining loop execution counts for: @nsw_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: backedge-taken count is ((-1 + %n) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @nsw_step2(i4 %n) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -139,6 +145,7 @@ exit:
>> ret void
>> }
>>
>> +; CHECK-LABEL: nsw_start1
>> ; CHECK: Determining loop execution counts for: @nsw_start1
>> ; CHECK: Loop %loop: backedge-taken count is (-2 + (2 smax %n))
>> ; CHECK: Loop %loop: max backedge-taken count is 5
>> @@ -156,8 +163,8 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @nsw_start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: backedge-taken count is ((-2 + (3 smax %n)) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @nsw_start1_step2(i4 %n) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -188,8 +195,8 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @nsw_startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> -; CHECK: Loop %loop: Unpredictable max backedge-taken count.
>> +; CHECK: Loop %loop: backedge-taken count is ((-1 + (-1 * %x) + ((2 + %x) smax %n)) /u 2)
>> +; CHECK: Loop %loop: max backedge-taken count is 7
>> define void @nsw_startx_step2(i4 %n, i4 %x) {
>> entry:
>> %s = icmp sgt i4 %n, 0
>> @@ -221,7 +228,7 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @even_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @even_step2(i4 %n) {
>> entry:
>> @@ -255,7 +262,7 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @even_start1_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 2
>> define void @even_start1_step2(i4 %n) {
>> entry:
>> @@ -289,7 +296,7 @@ exit:
>> }
>>
>> ; CHECK: Determining loop execution counts for: @even_startx_step2
>> -; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> +; CHECK: Loop %loop: Unpredictable backedge-taken count.
>> ; CHECK: Loop %loop: max backedge-taken count is 7
>> define void @even_startx_step2(i4 %n, i4 %x) {
>> entry:
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>