On 30 August 2012 23:51, Nick Lewycky <span dir="ltr"><<a href="mailto:nicholas@mxc.ca" target="_blank">nicholas@mxc.ca</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im">Tobias von Koch wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Dear all,<br>

<br>

SCEV currently has a weakness in the loop trip count computation code<br>

when it comes to loops whose bounds are determined by a <= comparison.<br>

<br>

Take the example from the attached test case:<br>

<br>

int Proc_8 (int arg1)<br>

{<br>

   int result = 0;<br>

   for (int i = arg1; i <= arg1 + 5; ++i)<br>

     result++;<br>

   return result;<br>

}<br>

<br>

GCC is able to determine the trip count exactly and translates this<br>

whole function into a single instruction (+ return) on PPC.<br>

<br>

LLVM fails on two fronts:<br>

- First of all, it can't deal with the <= comparison because the code to<br>

handle this isn't there. It was added in an earlier version of SCEV<br>

years ago but then reverted alongside a lot of other (buggy) code. The<br>

attached patch fixes this.<br>

- Secondly, even once the patch has been applied, LLVM can only derive a<br>

smax() expression for the trip count of this loop. While this is a lot<br>

better already, it's still not an exact number. I get the impression<br>

that this has something to do with the 'nsw' flag on the original<br>

addition not being propagated through the SCEV analysis, but I'm not<br>

entirely sure here.<br>

</blockquote>

<br></div>

These two issues are related. You're right that we aren't getting the 'nsw' information to where scev needs it, though we have to be very careful when doing that.<br>

<br>

Because we aren't, your patch miscompiles this function:<br>

  unsigned char test(unsigned char n) {<br>

    unsigned char i;<br>

    for (i = 0; i <= n; ++i) {<br>

    }<br>

    return i;<br>

  }<br>

where 'n' == 255, because the loop would be infinite and we make it return zero. (Yes this may be a valid transform for certain versions of C and C++ but it isn't valid in LLVM IR at this time.)<div class="im">


<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The patch passes all regression tests and the nightly test suite. I do<br>

have one worry about the line marked with a 'FIXME', though. Is it<br>

necessary to check for overflow here? Can this be done using the same<br>

method as in ScalarEvolution::getBECount, i.e. zero-extend to larger<br>

type, add, compare results?<br>

</blockquote>

<br></div>

No, even a zero-extended computation would not correctly model the infinite loop.<br>

<br>

I don't know how we ought to fix this.<br></blockquote><div><br></div><div>Some thoughts! Consider the loop:</div><div><br></div><div>  for (int i = 0; i <= n; ++i) {</div><div>    x += 7;</div><div>  }</div><div>


  use(x);</div><div><br></div><div>What's the trip count? Right now it's CouldNotCompute, and that's the ideal answer given the SCEV expressions that exist today. What is it mathematically, ignoring no-signed-wrapping? It's an infinite loop if n == INT_MAX otherwise the trip count is n. Note that it isn't undefined when n == INT_MAX, it's defined to be an infinite loop (we have a language-specific pass to eliminate such infinite loops with no side-effects, but we won't do it here).</div>


<div><br></div><div>Continuing to keep the no-signed-wrapping aside, what about x? To start, exit-value-of(x) = starting-value-of(x) + 7 * (infinite loop if n == INT_MAX, otherwise n), but that expression simplifies. Implicit in the question "exit-value-of(x)" is the assumption that the loop wasn't infinite after all, so it's exit-value-of(x) = starting-value-of(x) + 7 * n.</div>


<div><br></div><div>So it seems like we don't really need nsw here, we could simplify this away by adding a new SCEV expression that is conditionally an infinite loop. We could teach the SCEV expander to emit an infinite loop and a branch to it. It's not clear to me how this works though -- what does it mean to have "X + 7 * (infinite loop)" in a scev expression? Consider the next loop:</div>


<div><br></div><div><div>  for (int i = 0; i <= n; ++i) {</div><div>    x += 7;</div><div>    for (j = i; j <= n; ++j) {<br>      y += x + 1;</div><div>    }</div><div>    use(y);</div><div>  }</div><div><br></div>

</div>

<div>How do we evaluate y at the point it's used? It hasn't lost its "potentially-unreachable"ness yet.</div><div><br></div><div>And a few more questions, like how does this expression simplify? Can we always hoist out the potentially-infinite part to the outer-most level? Do folks think this new expression is sensible?</div>


<div><br></div><div>Nick</div></div>