[LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution

Wed Jul 1 11:50:49 PDT 2015

> On Jun 30, 2015, at 7:27 PM, Bjarke Roune <broune at google.com> wrote:
> 
> Hi Sanjoy, thanks for your thoughts on this.
> 
> On Sat, Jun 27, 2015 at 12:16 AM, Sanjoy Das <sanjoy at playingwithpointers.com <mailto:sanjoy at playingwithpointers.com>> wrote:
> First of all, going by the "poison causes UB only when observed", SCEV
> does not do the right thing currently: [...]
> 
> That seems like a bug? There's also bug 23527 for GEP. Sounds like there might be more such bugs.
> 
> One way to get what you want and also fix this existing bug(?) is to
> make the no wrap flags part of a SCEV expression's identity (i.e. make
> {A,+,B}<nuw> a different expression from {A,+,B}).  If you start with
> the precondition that every <nuw> [0] came from a <nuw> [1] present in
> the IR, then you don't really have to worry about control dependence
> -- you can always optimize under the assumption that the SCEV
> expression does not unsign-overflow; if such an optimization changes
> program behavior then that program has UB since it was data dependent
> on poison [2].
> 
> To make sure that I understand correctly, are you suggesting making the nuw (and similar) flags part of the key when looking up an SCEV in the UniqueSCEVs member variable on ScalarEvolution? That way, an instruction with nuw and one without will not map to the same SCEV unless the nuw can be inferred in some other way. Sounds good to me, I'm happy to go with either one of that or inferring UB from poison.
> 
> Adding the flags to the key/identity does break the idea that mathematically equivalent expressions should cancel if subtracted. Andrew Trick wrote something about that previously:
> 
>   http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.html <http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.html>
> 
> Andrew: I wonder what your view is on this?
>   

In that thread I was pointing out that with no-wrap flags as part of the expression identity, SCEV loses the ability to reason about or reduce expressions that have a mix of flags. Trivial example:
{0,+,1}<nsw> - {1,+,1}

We can no longer reduce this to constant 1.

OTOH, I've never actually tried this. If the nsw key is limited to recurrences, it might not be terrible in practice. I'm guessing we would at least end up with redundant induction variables though. I won't rule out the idea of adding nsw to the SCEV key, but I'm afraid it will lead to more complexity in the long run.

I think your approach to strengthening SCEV by using a poison value analysis is legitimate. I've considered this approach in the past. I do think that it adds complexity, but is probably the best way to reuse existing LSR. I can see a few alternatives to solving your problem:

(1) Use poison value analysis to inform SCEV. SCEV now sees nsw flags on your recurrence and reduces the expression for you. LSR does the transformation you want.

(2) Add nsw flags to the SCEV key so we don't need poison value analysis. I think this could be problematic, but if Sanjoy really wants to go this route and support the work I won't get in the way.

(3) Write a *simple* IR-level analysis that refactors address expressions to hoist the loop invariant component, informed by the target's addressing mode. This could be a first step in replacing the LSR pass with something more efficient and predictable. This is the approach I would take if I were doing the work. But I can understand not wanting to design a new pass just to handle your case.

Andy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150701/e943aefc/attachment.html>