[LLVMdev] RFC: Proposal for Poison Semantics

Chandler Carruth chandlerc at google.com
Sun Feb 1 02:19:04 PST 2015


On Sun, Feb 1, 2015 at 1:57 AM, David Majnemer <david.majnemer at gmail.com>
wrote:

>
> On Tue, Jan 27, 2015 at 8:58 PM, Sanjoy Das <
> sanjoy at playingwithpointers.com> wrote:
>
>> > Ah, yes.  You are right, we cannot always assume that %y would be zero
>> in
>> > the second case.
>> > This wouldn't be the first time we've lost information that we could
>> use to
>> > optimize a program by transforming it.
>> >
>> > Do you think this result would be problematic?  It seems consistent
>> with the
>> > RFC and LLVM's current behavior.
>> >
>>
>> The problem is not that we're losing information, the problem is that
>> we're changing the behavior of a well-defined program.
>>
>> I'll try to put the whole argument in one place:
>>
>> We start with
>>
>>   %x = add nuw i32 %m, %n
>>   %y = zext i32 %x to i64
>>   %s = lshr i64 %y, 32
>>   %addr = gep %some_global, %s
>>   store i32 42, i32* %addr
>>
>> In the above program, for all values of %x, %s is 0.  This means the
>> program is well-defined when %x is poison (since you don't need to
>> look at %x to determine the value of %addr, in the same sense as you
>> don't need to look at X to determine the value of "and X, 0"); and it
>> stores 42 to &(%some_global)[0].  Specifically, the above program is
>> well defined for "%m = %n = 2^32-1".
>>
>> Now if we do the usual transform of "zext (add nuw X Y)" => "add nuw
>> (zext X) (zext Y)" then we get
>>
>>   %m.wide = zext i32 %m to i64
>>   %n.wide = zext i32 %n to i64
>>   %z = add nuw i64 %m.wide, %n.wide
>>   %s = lshr i64 %y, 32
>>   %addr = gep %some_global, %s
>>   store i32 42, i32* %addr
>>
>> The new program does *not* have the same behavior as the old program
>> for "%m = %n = 2^32-1".  We have changed the behavior of a
>> well-defined program by doing the "zext (add nuw X Y)" => "add nuw
>> (zext X) (zext Y)" transform.
>>
>
> After some pondering and combing through LLVM's implementation, I think we
> must conclude that zexting a value with any poison bits creates poison in
> every new bit.
>
> Considering the following program:
>
> %zext = zext i32 %x to i64
> %icmp = icmp i64 %zext, i64 1
>
> we'd like to transform this to:
>
> %icmp = icmp i32 %x, i32 1
>
> Is it reasonable to say that '%icmp' in the before case is not poison but
> '%icmp' in the after case is poison?  LLVM assumes it can remove casts with
> impunity, I think this is a useful property to maintain.
>

FWIW, I agree with your statement.

Here is the line of reasoning that I find troubling.

If we accept the above, we have a surprising result (using small bit-width
integers to make it easier to read)

%zext = zext i1 %x to i2
%and = and i2 %zext, 1

We cannot replace %and with %zext because the %and might be removing poison.

Perhaps this restriction is OK though. I just find it somewhat troubling.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150201/dbeb25d7/attachment.html>


More information about the llvm-dev mailing list