[llvm-commits] [PATCH] Teach SCEV about <= comparisons for trip count computation

Wed Sep 19 19:47:22 PDT 2012

On 30 August 2012 23:51, Nick Lewycky <nicholas at mxc.ca> wrote:

> Tobias von Koch wrote:
>
>> Dear all,
>>
>> SCEV currently has a weakness in the loop trip count computation code
>> when it comes to loops whose bounds are determined by a <= comparison.
>>
>> Take the example from the attached test case:
>>
>> int Proc_8 (int arg1)
>> {
>>    int result = 0;
>>    for (int i = arg1; i <= arg1 + 5; ++i)
>>      result++;
>>    return result;
>> }
>>
>> GCC is able to determine the trip count exactly and translates this
>> whole function into a single instruction (+ return) on PPC.
>>
>> LLVM fails on two fronts:
>> - First of all, it can't deal with the <= comparison because the code to
>> handle this isn't there. It was added in an earlier version of SCEV
>> years ago but then reverted alongside a lot of other (buggy) code. The
>> attached patch fixes this.
>> - Secondly, even once the patch has been applied, LLVM can only derive a
>> smax() expression for the trip count of this loop. While this is a lot
>> better already, it's still not an exact number. I get the impression
>> that this has something to do with the 'nsw' flag on the original
>> addition not being propagated through the SCEV analysis, but I'm not
>> entirely sure here.
>>
>
> These two issues are related. You're right that we aren't getting the
> 'nsw' information to where scev needs it, though we have to be very careful
> when doing that.
>
> Because we aren't, your patch miscompiles this function:
>   unsigned char test(unsigned char n) {
>     unsigned char i;
>     for (i = 0; i <= n; ++i) {
>     }
>     return i;
>   }
> where 'n' == 255, because the loop would be infinite and we make it return
> zero. (Yes this may be a valid transform for certain versions of C and C++
> but it isn't valid in LLVM IR at this time.)
>
>
>  The patch passes all regression tests and the nightly test suite. I do
>> have one worry about the line marked with a 'FIXME', though. Is it
>> necessary to check for overflow here? Can this be done using the same
>> method as in ScalarEvolution::getBECount, i.e. zero-extend to larger
>> type, add, compare results?
>>
>
> No, even a zero-extended computation would not correctly model the
> infinite loop.
>
> I don't know how we ought to fix this.
>

Some thoughts! Consider the loop:

  for (int i = 0; i <= n; ++i) {
    x += 7;
  }
  use(x);

What's the trip count? Right now it's CouldNotCompute, and that's the ideal
answer given the SCEV expressions that exist today. What is it
mathematically, ignoring no-signed-wrapping? It's an infinite loop if n ==
INT_MAX otherwise the trip count is n. Note that it isn't undefined when n
== INT_MAX, it's defined to be an infinite loop (we have a
language-specific pass to eliminate such infinite loops with no
side-effects, but we won't do it here).

Continuing to keep the no-signed-wrapping aside, what about x? To start,
exit-value-of(x) = starting-value-of(x) + 7 * (infinite loop if n ==
INT_MAX, otherwise n), but that expression simplifies. Implicit in the
question "exit-value-of(x)" is the assumption that the loop wasn't infinite
after all, so it's exit-value-of(x) = starting-value-of(x) + 7 * n.

So it seems like we don't really need nsw here, we could simplify this away
by adding a new SCEV expression that is conditionally an infinite loop. We
could teach the SCEV expander to emit an infinite loop and a branch to it.
It's not clear to me how this works though -- what does it mean to have "X
+ 7 * (infinite loop)" in a scev expression? Consider the next loop:

  for (int i = 0; i <= n; ++i) {
    x += 7;
    for (j = i; j <= n; ++j) {
      y += x + 1;
    }
    use(y);
  }

How do we evaluate y at the point it's used? It hasn't lost its
"potentially-unreachable"ness yet.

And a few more questions, like how does this expression simplify? Can we
always hoist out the potentially-infinite part to the outer-most level? Do
folks think this new expression is sensible?

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120919/93c612d0/attachment.html>