Bug 16358: simplify SCEVs with assumptions

Mon Nov 25 16:07:39 PST 2013

On Nov 25, 2013, at 10:56 AM, Sebastian Pop <spop at codeaurora.org> wrote:

> Hi,
> 
> I have started looking at SCEV simplify in the context of determining the number
> of iterations for this loop:
> 
> void foo(int n, int *A) {
>  for (short i = 0; i < n; i++)
>    A[i] = 0;
> }
> 
> where the niter computation is confused and answers "could not compute" because
> the SCEV representation of %i contains a zext expression.
> 
> Now supposing that we remove the zext in "Expr = zext(Op i16 to i32)" by
> replacing the zext with a cast(Op, i32) and this under the assumption that Expr
> has values in range:
>  ConstantRange Range = SE.getUnsignedRange(Expr);
> 
> With this simplified representation of %i, niter ends up with a number of
> iterations of the form "%n". This assumption will then impact the computation of
> other SCEVs, like the value of %i at the end of the loop, etc., and with
> ScalarEvolution's caching, the same assumptions will be used in other niter or
> scev analysis queries, making it almost impossible to say which SE results have
> used the assumptions and which are independent.
> 
> For this reason I was thinking that the set of assumptions should be part of the
> state of the ScalarEvolution, and thus users of the SCEV should either version
> the transformed code with the assumptions, or otherwise if versioning is not
> possible, clear out the caches of ScalarEvolution and restart the analysis under
> no assumptions.

I'm nervous about managing SCEV caches that depend on query context: their correctness is hard to test and reason about, and it’s hard to know whether compile time will become unbounded. Do we really need to solve this problem?

If the trip count computation could directly analyze expressions of this form:

  %conv = sext i16 %inc to i32
  -->  (sext i16 {1,+,1}<%for.body> to i32)

and gather assumptions during the analysis, then SCEV simplification wouldn't need to do the work. (In fact, I think we should do less work in SCEV simplification.)

Say we had a utility, e.g. ScalarEvolution::promoteIV, that would take a SCEV of the form:
 (sext i16 {1,+,1}<%for.body> to i32)

And return a new SCEV:
 {1,+,1}<%for.body>

Along with its set of assumptions:
 "{1,+,1}<%for.body>" < 2**16

We can always factor code between getSignExtendExpr and promoteIV, but I'm only looking at a few lines of code to do this.
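
To make the shape of such a utility concrete, here is a minimal standalone sketch. It is a toy model, not ScalarEvolution's actual API: the AddRec, SignExtend, Assumption and Promoted types and this promoteIV signature are all made up for illustration, and the bound simply follows the 2**16 example above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-in for an add recurrence {Start,+,Step}<Loop> of a given bit width.
struct AddRec {
  int64_t Start;
  int64_t Step;
  unsigned Bits;
  std::string Loop;
};

// Toy stand-in for (sext iN <AddRec> to iM).
struct SignExtend {
  AddRec Op;
  unsigned ToBits;
};

// Assumption the caller still has to prove: every value of Rec is < Bound.
struct Assumption {
  AddRec Rec;
  int64_t Bound;
};

// Result of a promotion: the recurrence in the wide type plus its assumption.
struct Promoted {
  AddRec Rec;
  Assumption Assumed;
};

// Hypothetical promoteIV: strips the extension and reports the assumption
// instead of proving it. Returns false when the expression is not a
// widening extension this toy model can handle.
bool promoteIV(const SignExtend &E, Promoted &Out) {
  assert(E.Op.Bits > 0 && "narrow type must have a width");
  if (E.ToBits <= E.Op.Bits || E.Op.Bits >= 63)
    return false;
  AddRec Wide = E.Op;
  Wide.Bits = E.ToBits;                     // same start and step, wider type
  int64_t Bound = int64_t(1) << E.Op.Bits;  // 2**16 for an i16 operand
  Out = Promoted{Wide, Assumption{Wide, Bound}};
  return true;
}
```

The point of the split is that promoteIV never consults or updates any cache: the assumption travels with the result, so the caller decides whether to prove it, version on it, or drop the promoted form.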

It is up to the caller to prove the assumption, whether as a loop precondition, through undefined behavior inference, or by some other means. In the case of trip count computation, HowManyLessThans would prove the assumptions that it can and report the rest to the client of the trip count query. Aggressive loop opts, like LoopVectorizer, can gather the assumptions from various queries and materialize the minimum number of constraints such that they all hold. SCEV could provide a utility to optimize the constraints.
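
As a rough illustration of the constraint-optimizing step, the sketch below reduces a set of constraints of the form A*N + B < Bound (the form used in the quoted discussion further down, e.g. 2*N+1 < 2**32) to a single upper bound on N. The Constraint type and both function names are hypothetical, and the model assumes unsigned arithmetic with A > 0 and B < Bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical constraint on the trip count N: A*N + B < Bound.
struct Constraint {
  uint64_t A;
  uint64_t B;
  uint64_t Bound;
};

// Largest N that satisfies C, i.e. floor((Bound - 1 - B) / A).
uint64_t maxN(const Constraint &C) {
  assert(C.A > 0 && C.B < C.Bound && "constraint outside the toy model");
  return (C.Bound - 1 - C.B) / C.A;
}

// Collapses all constraints into the one bound that implies every one of
// them: N <= min over the per-constraint maxima. With no constraints the
// result is the largest representable value (nothing to check).
uint64_t tightestBound(const std::vector<Constraint> &Cs) {
  uint64_t Best = std::numeric_limits<uint64_t>::max();
  for (const Constraint &C : Cs)
    Best = std::min(Best, maxN(C));
  return Best;
}
```

A client that versions the loop would then emit a single runtime check of N against this bound rather than one check per assumption.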

-Andy

> 
> Sebastian
> 
> Arnold Schwaighofer wrote:
>> 
>> On Oct 11, 2013, at 4:17 PM, Sebastian Pop <spop at codeaurora.org> wrote:
>> 
>>> Arnold Schwaighofer wrote:
>>> 
>>>> Interesting, in such a framework - if I understand you correctly - whenever we
>>>> simplify an expression we would have to try several assumptions: a harder
>>> 
>>> I think SCEV folding could compute the assumptions needed to simplify the
>>> expression. Let's take the example from the bug report:
>>> http://llvm.org/bugs/show_bug.cgi?id=16358
>>> 
>>>> 8 * (zext i32 ({0,+,2}%<for_body>) to i64)+ %C_aligned
>>>> 8 * (zext i32 ({1,+,2}%<for_body>) to i64)+ %C_aligned
>>>> Without knowing that for the loop <for_body> the functions "{0,+,2}%<for_body>"
>>>> and "{1,+,2}%<for_body>" don't wrap, SCEV cannot remove the zext.
>>> 
>>> simplify would recursively reconstruct the SCEV, so it would first dive in the
>>> innermost expression, and the first assumption it would extract is from
>>> simplifying (zext i32 ({0,+,2}%<for_body>) to i64)
>>> 
>>> simplify(zext i32 ({0,+,2}%<for_body>) to i64) = {0,+,2}%<for_body> assuming
>>> {0,+,2}%<for_body> does not wrap, i.e., 2*N < 2**32
>>> 
>>> simplify would produce two constraints:
>>> 
>>> 2*N < 2**32
>>> 2*N+1 < 2**32
>>> 
>>> we should keep the one satisfying both simplified expressions:
>>> 2*N+1 < 2**32
>> 
>> 
>> Okay, I see now. I was looking at the problem top down - hence a tree of
>> decisions :) - and was worried that for some examples we would have to guess a
>> value. But bottom up we should have already seen such values. Right.
> 
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation