[LLVMdev] RFC: Proposal for Poison Semantics

Philip Reames listmail at philipreames.com
Wed Jan 28 20:53:12 PST 2015


On 01/28/2015 07:02 AM, Sean Silva wrote:
> Could you maybe provide an example where replacing `%always_poison` 
> with `undef` will change the meaning? At least for me, the thing that 
> I'm most unclear about is how poison differs from undef.
I will second this request for much the same reason.
>
> -- Sean Silva
>
> On Wed, Jan 28, 2015 at 2:50 AM, David Majnemer 
> <david.majnemer at gmail.com <mailto:david.majnemer at gmail.com>> wrote:
>
>     Hello,
>
>     What follows is my attempt to describe how poison works.  Let me
>     know what you think.
>
>     -- 
>     David
>
>
>     # LLVM Poison Semantics
>
>     Poison is an LLVM concept which exists solely to enable further
>     optimization of LLVM IR. The exact behavior of poison has been, to
>     say the least, confusing for users, researchers and engineers
>     working with LLVM.
>
>     This document hopes to clear up some of the confusion of poison
>     and hopefully explain *why* it has its semantics.
>
>     ## A Quick Introduction to Poison
>
>     Let's start with a concrete motivating example in C:
>     ```
>     int isSumGreater(int a, int b) {
>       return a + b > a;
>     }
>     ```
>
>     The C specification permits us to optimize the comparison in
>     `isSumGreater` to `b > 0` because signed overflow results in
>     undefined behavior.  A reasonable translation of `isSumGreater` to
>     LLVM IR could be:
>
>     ```
>     define i32 @isSumGreater(i32 %a, i32 %b) {
>     entry:
>       %add = add i32 %a, %b
>       %cmp = icmp sgt i32 %add, %a
>       %conv = zext i1 %cmp to i32
>       ret i32 %conv
>     }
>     ```
>
>     However, LLVM cannot determine that `%cmp` should not consider
>     cases where `%add` resulted in signed overflow.  We need a way to
>     communicate this information to LLVM.
>
>     This is where the `nsw` and `nuw` flags come into play.  `nsw` is
>     short for "no signed wrap", `nuw` is short for "no unsigned wrap".
>
>     With these, we can come up with a new formulation of `%add`: `add
>     i32 nsw %a, %b`.
>     LLVM can take this into account when it is optimizing the `%cmp`
>     and replace it with: `icmp sgt i32 %b, 0`.
>
>     ## Differences Between LLVM and C/C++
>
>     There are some interesting differences between what C++ and C
>     specify and how LLVM behaves with respect to performing an
>     operationg which is not permitted to overflow.
>
>     Perhaps chief among them is that evaluating an expression in C++
>     or C which results performs an overflow is undefined behavior. In
>     LLVM, executing an instruction which is marked `nsw` but which
>     violates signed overflow results in poison. Values which have no
>     relationship with poisoned values are not effected by them.
>
>     Let us take the following C program into consideration:
>     ```
>     int calculateImportantResult(int a, int b) {
>       int result = 0;
>       if (a) {
>         result = a + b;
>       }
>       return result;
>     }
>     ```
>
>     A straightforward lowering to LLVM IR could be:
>     ```
>     define i32 @calculateImportantResult(i32 %a, i32 %b) {
>     entry:
>       %tobool = icmp ne i32 %a, 0
>       br i1 %tobool, label %if.then, label %if.end
>
>     if.then:
>       %add = add nsw i32 %a, %b
>       br label %if.end
>
>     if.end:
>       %result = phi i32 [ %add, %if.then ], [ 0, %entry ]
>       ret i32 %result
>     }
>     ```
>
>     Moving `%add` to the `%entry` block would be preferable and would
>     allow further optimizations:
>     ```
>     define i32 @calculateImportantResult(i32 %a, i32 %b) {
>     entry:
>       %tobool = icmp ne i32 %a, 0
>       %add = add nsw i32 %a, %b
>       %result = select i1 %tobool, i32 0, i32 %add
>       ret i32 %result
>     }
>     ```
>
>     In the original code, the calculation of `%add` was control dependent.
>     Now, `%add` might result in signed overflow in violation of the
>     `nsw` flag.
>     Despite this, the program should behave as it did before because
>     the poisoned value is masked-out by the select. The next section
>     will dive into this in greater detail.
>
>     # Computation Involving Poison Values
>     Poison in a computation results in poison if the result cannot be
>     constrained by its non-poison operands.
>
>     Examples of this rule which will result in poison:
>     ```
>       %add = add i32 %x, %always_poison
>       %sub = sub i32 %x, %always_poison
>       %xor = xor i32 %x, %always_poison
>       %mul = mul i32 %always_poison, 1
>     ```
>
>     Examples of this rule which do not result in poison:
>     ```
>       %or  = or  i32 %always_poison, 2
>       %and = and i32 %always_poison, 2
>       %mul = mul i32 %always_poison, 0
>     ```
>
>     In fact, it would be reasonable to optimize `%or` to `2` and
>     `%and` to `0`.  In this respect, poison is not different from `undef`.
>
>     The following example is only poison if `%cond` is false:
>     ```
>       %sel = select i1 %cond, i32 2, %always_poison
>     ```
>
>     ### Is it safe to have poison as a `call` argument?
>
>     A `call` instruction may or may not result in poison depending on
>     exactly how the callee  uses the supplied arguments, it is not
>     necessarily the case that `call i32 @someFunction(i32
>     %always_poison)` results in poison.
>
>     LLVM cannot forbid poison from entering `call` arguments without
>     prohibiting an optimization pass from outlining code.
>
>     ### Is it safe to store poison to memory?
>
>     `store i32 %always_poison, i32* %mem` does not result in undefined
>     behavior. A subsequent load instruction like `%load = load i32*
>     %mem` will result in `%load` being a poison value.
>
>     ### Is it safe to load or store a poison memory location?
>
>     No.  Poison works just like `undef` in this respect.
>
>     ### Does comparing a poison value result in poison?
>
>     It depends.  If the comparison couldn't solely be determined by
>     looking at the other operand, the result is poison.
>
>     For example, `icmp i32 ule %always_poison, 4294967295` is `true`,
>     not poison.
>     However, `icmp i32 ne %always_poison, 7` is poison.
>
>     ### What if the condition operand in a `select` is poison?
>
>     In the example `%sel = select i1 %always_poison, i1 true, false`,
>     `%sel` is either `true` or `false`. Because, `%sel` depends on
>     `%always_poison` it too is poison.
>
>     _______________________________________________
>     LLVM Developers mailing list
>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>     http://llvm.cs.uiuc.edu
>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150128/9d1a9775/attachment.html>


More information about the llvm-dev mailing list