[LLVMdev] RFC: Proposal for Poison Semantics

Sanjoy Das sanjoy at playingwithpointers.com
Thu Jan 29 21:43:07 PST 2015


One way around this is to say that there are some special
instructions, icmp, sext and zext which produce a value solely
composed of poison bits if any of their input bits is poison.  So
`<poison> icmp X` is poison for any value of X, including INT_MAX.
This is one way poison could be fundamentally different from undef.

-- Sanjoy

On Thu, Jan 29, 2015 at 8:05 PM, Matthias Braun <matze at braunis.de> wrote:
> Having though about this some more I think optimizing
>
> (x+1 > x) <=> true
>
> and at the same time modeling undefined behavior as a posion value is impossible. This is because we also want the following rule:
>
> (x > INT_MAX) <=> false
>
> Now if poison is a value, then the second replacement tells us (poison > INT_MAX) == false which contradicts the first rule.
>
>
>
> The only way out of this while still allowing (x+1>x)<=>true I can see at the moment is defining that add nsw does produce actual undefined behavior allowing us to “freely rewrite" the following > to true in the UB cases. Of course having add produce actual undefined behaviour greatly limits us in the way we can actually move the instruction around without changing program semantics.
>
> The only way allowing moving instructions and having add produce real UB I can see is that as soon as we start moving instructions around (specifically moving the add to a place it does not dominate or moving any other instruction over the add) we change the add to an instruction that does not produce real UB anymore; something like add swu = “add signed wrap gives undef”...
>
> - Matthias
>
>> On Jan 29, 2015, at 11:29 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>>> I've been discussing a model with David that might steer poison back towards
>>> something that simply supports algebraic simplification. If we have a math
>>> operation that cannot wrap, then it notionally produces as many bits of
>>> undef as the operation could possibly produce. For example, "add nsw i8" can
>>> produce an i9 undef, and "mul nsw i8" can produce an undefined 16 bit
>>> bitpattern. This is strong enough to do things like "a + 1 > a  -->  true",
>>> because on overflow of "a + 1" we can choose an poison value of "MAX_INT +
>>> 1", even though that is not a valid i8 bit pattern.
>>>
>>> So, a short version would be that poison is like undef, except you get to
>>> include the overflow bits of the computation in your undef value.
>>
>> I  suspect it will be hard to justify reg2mem is meaning preserving in
>> this scheme.  But if this can be made to work, then I agree that this
>> is the right thing to do -- an i32 poison effectively has the
>> semantics of a wider type, and it makes sense to be explicit in that.
>>
>> -- Sanjoy
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>




More information about the llvm-dev mailing list