[LLVMdev] RFC: Proposal for Poison Semantics

David Majnemer david.majnemer at gmail.com
Sun Feb 1 01:57:50 PST 2015


On Tue, Jan 27, 2015 at 8:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com>
wrote:

> > Ah, yes.  You are right, we cannot always assume that %y would be zero in
> > the second case.
> > This wouldn't be the first time we've lost information that we could use
> > to optimize a program by transforming it.
> >
> > Do you think this result would be problematic?  It seems consistent with
> > the RFC and LLVM's current behavior.
> >
>
> The problem is not that we're losing information; the problem is that
> we're changing the behavior of a well-defined program.
>
> I'll try to put the whole argument in one place:
>
> We start with
>
>   %x = add nuw i32 %m, %n
>   %y = zext i32 %x to i64
>   %s = lshr i64 %y, 32
>   %addr = gep %some_global, %s
>   store i32 42, i32* %addr
>
> In the above program, for all values of %x, %s is 0.  This means the
> program is well-defined when %x is poison (since you don't need to
> look at %x to determine the value of %addr, in the same sense as you
> don't need to look at X to determine the value of "and X, 0"); and it
> stores 42 to &(%some_global)[0].  Specifically, the above program is
> well defined for "%m = %n = 2^32-1".
>
> Now if we do the usual transform of "zext (add nuw X Y)" => "add nuw
> (zext X) (zext Y)" then we get
>
>   %m.wide = zext i32 %m to i64
>   %n.wide = zext i32 %n to i64
>   %z = add nuw i64 %m.wide, %n.wide
>   %s = lshr i64 %z, 32
>   %addr = gep %some_global, %s
>   store i32 42, i32* %addr
>
> The new program does *not* have the same behavior as the old program
> for "%m = %n = 2^32-1".  We have changed the behavior of a
> well-defined program by doing the "zext (add nuw X Y)" => "add nuw
> (zext X) (zext Y)" transform.
>
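
The arithmetic in the quoted argument can be checked numerically.  Below is
a minimal Python sketch, not LLVM code: the helper names (add_nuw,
original_index, transformed_index) are made up for illustration, and poison
is modelled only as an "unsigned wrap happened" flag.

```python
MASK32 = (1 << 32) - 1

def add_nuw(a, b, bits):
    """Unsigned add at the given width; returns (wrapped result, overflowed)."""
    total = a + b
    mask = (1 << bits) - 1
    return total & mask, total > mask

def original_index(x):
    # %y = zext i32 %x to i64 ; %s = lshr i64 %y, 32
    y = x & MASK32          # zext leaves the high 32 bits clear
    return y >> 32

def transformed_index(m, n):
    # %z = add nuw i64 (zext %m), (zext %n) ; %s = lshr i64 %z, 32
    z, overflowed = add_nuw(m & MASK32, n & MASK32, 64)
    return z >> 32, overflowed

m = n = MASK32                                   # %m = %n = 2^32-1
_, narrow_overflow = add_nuw(m, n, 32)           # the i32 add wraps
wide_index, wide_overflow = transformed_index(m, n)
```

For these inputs the i32 add wraps (so %x is poison), yet the shifted value
in the original program is 0 for every possible bit pattern of %x.  In the
transformed program the i64 add does not wrap and the index comes out as 1,
so the store goes to a different element.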

After some pondering and combing through LLVM's implementation, I think we
must conclude that zexting a value with any poison bits creates poison in
every new bit.

Consider the following program:

%zext = zext i32 %x to i64
%icmp = icmp i64 %zext, i64 1

We'd like to transform this to:

%icmp = icmp i32 %x, i32 1

Is it reasonable to say that '%icmp' in the before case is not poison but
'%icmp' in the after case is poison?  LLVM assumes it can remove casts with
impunity, and I think this is a useful property to maintain.
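
To make the rule concrete, here is a rough Python model of it.  This is
only a sketch under stated assumptions: POISON is a hypothetical sentinel
standing in for a poison value, the comparison predicate is assumed to be
"eq" (the snippets above do not name one), and poison is tracked per value
rather than per bit.

```python
POISON = object()  # hypothetical stand-in for an LLVM poison value

def zext(value):
    # Proposed rule: a poison input makes the whole widened value poison.
    return POISON if value is POISON else value & 0xFFFFFFFF

def lshr(value, amount):
    # A shift of a poison value is poison.
    return POISON if value is POISON else value >> amount

def icmp_eq(lhs, rhs):
    # A comparison with a poison operand is itself poison.
    if lhs is POISON or rhs is POISON:
        return POISON
    return lhs == rhs

def before(x):
    return icmp_eq(zext(x), 1)   # icmp on the zext'ed value

def after(x):
    return icmp_eq(x, 1)         # cast removed

samples = [0, 1, 2, 0xFFFFFFFF, POISON]
agree = all(before(x) is after(x) for x in samples)
shifted = lshr(zext(POISON), 32)
```

With this model the before and after forms agree on every sample, including
the poison input, so dropping the cast does not change the result.  The same
model makes lshr (zext poison), 32 poison as well, which would mean the first
program in the quoted argument is no longer well defined when %x is poison.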


>
> -- Sanjoy
>