[LLVMdev] RFC: Proposal for Poison Semantics

Sanjoy Das sanjoy at playingwithpointers.com
Thu Jan 29 22:05:52 PST 2015


On Thu, Jan 29, 2015 at 10:01 PM, Matthias Braun <matze at braunis.de> wrote:
> But
> (Poison > INT_MAX) <=> poison
> contradicts
> (X > INT_MAX) <=> false
>
> and I don't think you want to abandon the second rule just because x might be poison.

Maybe we could define poison in such a way that it is safe to pretend
it "is" false, as per our convenience.  In this sense, could be
defined to be very similar to undef.

-- Sanjoy

>
> - Matthias
>
>> On Jan 29, 2015, at 9:43 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> One way around this is to say that there are some special
>> instructions, icmp, sext and zext which produce a value solely
>> composed of poison bits if any of their input bits is poison.  So
>> `<poison> icmp X` is poison for any value of X, including INT_MAX.
>> This is one way poison could be fundamentally different from undef.
>>
>> -- Sanjoy
>>
>>> On Thu, Jan 29, 2015 at 8:05 PM, Matthias Braun <matze at braunis.de> wrote:
>>> Having though about this some more I think optimizing
>>>
>>> (x+1 > x) <=> true
>>>
>>> and at the same time modeling undefined behavior as a posion value is impossible. This is because we also want the following rule:
>>>
>>> (x > INT_MAX) <=> false
>>>
>>> Now if poison is a value, then the second replacement tells us (poison > INT_MAX) == false which contradicts the first rule.
>>>
>>>
>>>
>>> The only way out of this while still allowing (x+1>x)<=>true I can see at the moment is defining that add nsw does produce actual undefined behavior allowing us to “freely rewrite" the following > to true in the UB cases. Of course having add produce actual undefined behaviour greatly limits us in the way we can actually move the instruction around without changing program semantics.
>>>
>>> The only way allowing moving instructions and having add produce real UB I can see is that as soon as we start moving instructions around (specifically moving the add to a place it does not dominate or moving any other instruction over the add) we change the add to an instruction that does not produce real UB anymore; something like add swu = “add signed wrap gives undef”...
>>>
>>> - Matthias
>>>
>>>>> On Jan 29, 2015, at 11:29 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>>>>
>>>>> I've been discussing a model with David that might steer poison back towards
>>>>> something that simply supports algebraic simplification. If we have a math
>>>>> operation that cannot wrap, then it notionally produces as many bits of
>>>>> undef as the operation could possibly produce. For example, "add nsw i8" can
>>>>> produce an i9 undef, and "mul nsw i8" can produce an undefined 16 bit
>>>>> bitpattern. This is strong enough to do things like "a + 1 > a  -->  true",
>>>>> because on overflow of "a + 1" we can choose an poison value of "MAX_INT +
>>>>> 1", even though that is not a valid i8 bit pattern.
>>>>>
>>>>> So, a short version would be that poison is like undef, except you get to
>>>>> include the overflow bits of the computation in your undef value.
>>>>
>>>> I  suspect it will be hard to justify reg2mem is meaning preserving in
>>>> this scheme.  But if this can be made to work, then I agree that this
>>>> is the right thing to do -- an i32 poison effectively has the
>>>> semantics of a wider type, and it makes sense to be explicit in that.
>>>>
>>>> -- Sanjoy
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list