[LLVMdev] SCEV and GEP NSW flag

Sat Nov 2 05:43:09 PDT 2013

----- Original Message -----
> 
> On Oct 31, 2013, at 1:24 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > Andy, et al.,
> > 
> > If I start with C code like this:
> > 
> > void foo(long k, int * restrict a, int * restrict b, int * restrict
> > c) {
> >  if (k > 0) {
> >    for (int i = 0; i < 2047; i++) {
> >      a[i] = a[i + k] + b[i] * c[i];
> >    }
> >  }
> > }
> > 
> > Clang -O3 will produce code like this:
> > 
> > entry:
> >  %cmp = icmp sgt i64 %k, 0
> >  br i1 %cmp, label %for.body.preheader, label %if.end
> > 
> > for.body.preheader:
> >  br label %for.body
> > 
> > for.body:
> >  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0,
> >  %for.body.preheader ]
> >  %add = add nsw i64 %indvars.iv, %k
> >  %arrayidx = getelementptr inbounds i32* %a, i64 %add
> >  %0 = load i32* %arrayidx, align 4, !tbaa !1
> > ...
> >  %arrayidx7 = getelementptr inbounds i32* %a, i64 %indvars.iv
> >  store i32 %add5, i32* %arrayidx7, align 4, !tbaa !1
> > ...
> > 
> > My goal here is to detect that, within the loop, (&a[i] - &a[i +
> > k]) is negative, as the explicit loop guard guarantees. On the
> > path toward that goal, I've run into the following:
> > 
> > The SCEV for &a[i] looks like this: {%a,+,4}<nsw><%for.body>
> > 
> > but the SCEV for &a[i + k] looks like this: {((4 * %k) +
> > %a),+,4}<%for.body>
> > 
> > and the problem is that this expression, even though it comes from
> > a inbounds GEP, with nsw on all contributing instructions, does
> > not have the <nsw> flag. I think that the problem is the (4 * %k)
> > 'Index' expression, which lacks the no-wrap flags, and when
> > createNodeForGEP uses it to construct the overall expression for
> > the address, the result also lacks the nsw flag.
> > 
> > The reason that the (4 * %k) 'Index' expression lacks the flags is
> > explained in createSCEV:
> > 
> >    // Don't apply this instruction's NSW or NUW flags to the new
> >    // expression. The instruction may be guarded by control flow
> >    that the
> >    // no-wrap behavior depends on. Non-control-equivalent
> >    instructions can be
> >    // mapped to the same SCEV expression, and it would be incorrect
> >    to transfer
> >    // NSW/NUW semantics to those operations.
> > 
> > and I appreciate this concern, but I suspect it is supposed to be
> > possible to recover the flags on the resulting AddRec, and how is
> > that supposed to happen? One issue may be that the control flow
> > mentioned in the comment could be in the loop, although in this
> > one-BB loop with no calls, I think that can be ruled out.
> 
> I'm sorry, I don't understand why the scale and addition on an
> inbounds GEP can be considered NSW. Can we assume malloc won't span
> the high/low address boundary? 

I think that we're okay on this front. The language reference defines inbounds GEPs in terms of "infinitely precise signed arithmetic", so I think that wrapping and being inbounds are incompatible.

>Then were left with folks doing
> mmap... Well, based on my understanding, createNodeForGEP looks
> wrong--I really hope I'm the one that's wrong.

I think that the code is proabably okay: it is not forcing the result to have the nsw flag (if it did, I probably would not have written my e-mail), but applying it only to the 'address computation' itself. On the other hand, maybe that violates SCEV's general philosophy of being control-flow independent (maybe the logic that prohibits us from applying an add's nsw flag to an SCEVAddExpr should prohibit this as well).

> 
> As an aside, you could still prove no-self-wrap for this recurrence.
> If the base of the GEP is an NW recurrence (note that NSW or NUW
> implies NW), the GEP itself is inbounds, and the GEP simplifies to a
> recurrences, then I think you can infer that the recurrence for the
> GEP is also NW.

Okay, so you mean that I could force set the NW flag on the resulting GEP in that case?

> I don’t think that adds useful information to this
> query though :(
> 
> Along those lines, I’m a little baffled as to why you want an NSW
> recurrence. You already know the difference is 4*k and k > 0. Isn’t
> that enough? Is the problem that 4*k may overflow?

Ah, I forgot to explain, one problem is that I can't divide {((4 * %k) +  %a),+,4}<%for.body> by 4 to get ride of the type size. I was trying to do this in order to make it easier for isLoopEntryGuardedByCond to match the condition, but even dividing by 4 does not work if the recurrence is not <nsw> (as far as I can tell).

> 
> Maybe this sort of alias analysis needs to reason about array
> indices, not addresses.

Being able to divide by the type sizes would make this easier ;)

Thanks again,
Hal

> 
> -Andy
> 
> > 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory