[llvm-commits] [PATCH] Teach SCEV about <= comparisons for trip count computation

Thu Aug 30 23:51:37 PDT 2012

Tobias von Koch wrote:
> Dear all,
>
> SCEV currently has a weakness in the loop trip count computation code
> when it comes to loops whose bounds are determined by a <= comparison.
>
> Take the example from the attached test case:
>
> int Proc_8 (int arg1)
> {
>    int result = 0;
>    for (int i = arg1; i <= arg1 + 5; ++i)
>      result++;
>    return result;
> }
>
> GCC is able to determine the trip count exactly and translates this
> whole function into a single instruction (+ return) on PPC.
>
> LLVM fails on two fronts:
> - First of all, it can't deal with the <= comparison because the code to
> handle this isn't there. It was added in an earlier version of SCEV
> years ago but then reverted alongside a lot of other (buggy) code. The
> attached patch fixes this.
> - Secondly, even once the patch has been applied, LLVM can only derive a
> smax() expression for the trip count of this loop. While this is a lot
> better already, it's still not an exact number. I get the impression
> that this has something to do with the 'nsw' flag on the original
> addition not being propagated through the SCEV analysis, but I'm not
> entirely sure here.

These two issues are related. You're right that we aren't getting the 
'nsw' information to where SCEV needs it, though we have to be very 
careful when doing that.

Because we aren't, your patch miscompiles this function:
   unsigned char test(unsigned char n) {
     unsigned char i;
     for (i = 0; i <= n; ++i) {
     }
     return i;
   }
when 'n' == 255: 'i <= n' can never be false for an 8-bit 'i', so the 
loop is infinite, yet we make it return zero. (Yes, this may be a valid 
transform for certain versions of C and C++, but it isn't valid in LLVM IR 
at this time.)
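
To make the failure concrete, here is a rough sketch in plain C (not code 
from the patch). run_bounded() and folded_exit_value() are hypothetical 
helpers: the first runs the loop above with an iteration cap so the 
non-terminating case can be observed, the second models the exit value you 
get if the trip count is assumed to be n + 1 in 8 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The loop from test(), abandoned after max_steps iterations so that the
   non-terminating case can be observed. Returns true and stores the final
   'i' in *out only if the loop exits on its own. */
static bool run_bounded(unsigned char n, unsigned max_steps,
                        unsigned char *out) {
    assert(out != NULL);
    unsigned char i = 0;
    for (unsigned steps = 0; i <= n; ++steps) {
        if (steps == max_steps)
            return false;   /* still looping: give up */
        ++i;                /* wraps from 255 back to 0 */
    }
    *out = i;
    return true;
}

/* Exit value implied by an 8-bit trip count of n + 1. This is an assumption
   about what the analysis derives, used here only for illustration. */
static unsigned char folded_exit_value(unsigned char n) {
    return (unsigned char)(n + 1);
}
```

For n == 254 the two agree (both give 255). For n == 255, run_bounded() 
hits its cap however large you make it, while folded_exit_value() returns 0.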

> The patch passes all regression tests and the nightly test suite. I do
> have one worry about the line marked with a 'FIXME', though. Is it
> necessary to check for overflow here? Can this be done using the same
> method as in ScalarEvolution::getBECount, i.e. zero-extend to larger
> type, add, compare results?

No, even a zero-extended computation would not correctly model the 
infinite loop.
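
As an illustration, again only a sketch with hypothetical helper names and 
assuming an 8-bit unsigned char: computing n + 1 in a wider type avoids the 
wrap in the addition, but the result is still a finite count, and the 
narrow induction variable has not actually left the loop after that many 
iterations.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(CHAR_BIT == 8, "sketch assumes an 8-bit unsigned char");

/* Trip count for `for (i = 0; i <= n; ++i)` computed in a wider type, so
   the "+ 1" itself cannot wrap. */
static unsigned widened_trip_count(unsigned char n) {
    return (unsigned)n + 1u;
}

/* Value of the 8-bit induction variable after `iterations` increments
   starting from 0 (reduction modulo 256). */
static unsigned char iv_after(unsigned iterations) {
    return (unsigned char)iterations;
}

/* True if the real loop has exited once the predicted count has run,
   i.e. the condition `i <= n` is false at that point. */
static bool exits_after_predicted_count(unsigned char n) {
    return !(iv_after(widened_trip_count(n)) <= n);
}
```

With n == 254 the predicted count is 255 and the loop really has exited. 
With n == 255 the predicted count is 256, but 'i' is back at 0 and 
'i <= n' still holds, so no finite count is right.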

I don't know how we ought to fix this.

Nick