[LLVMdev] RFC: Proposal for Poison Semantics
Philip Reames
listmail at philipreames.com
Wed Jan 28 20:53:12 PST 2015
On 01/28/2015 07:02 AM, Sean Silva wrote:
> Could you maybe provide an example where replacing `%always_poison`
> with `undef` will change the meaning? At least for me, the thing that
> I'm most unclear about is how poison differs from undef.
I will second this request for much the same reason.
>
> -- Sean Silva
>
> On Wed, Jan 28, 2015 at 2:50 AM, David Majnemer
> <david.majnemer at gmail.com> wrote:
>
> Hello,
>
> What follows is my attempt to describe how poison works. Let me
> know what you think.
>
> --
> David
>
>
> # LLVM Poison Semantics
>
> Poison is an LLVM concept which exists solely to enable further
> optimization of LLVM IR. The exact behavior of poison has been, to
> say the least, confusing for users, researchers and engineers
> working with LLVM.
>
> This document aims to clear up some of the confusion surrounding
> poison and to explain *why* it has the semantics it does.
>
> ## A Quick Introduction to Poison
>
> Let's start with a concrete motivating example in C:
> ```
> int isSumGreater(int a, int b) {
> return a + b > a;
> }
> ```
>
> The C specification permits us to optimize the comparison in
> `isSumGreater` to `b > 0` because signed overflow results in
> undefined behavior. A reasonable translation of `isSumGreater` to
> LLVM IR could be:
>
> ```
> define i32 @isSumGreater(i32 %a, i32 %b) {
> entry:
> %add = add i32 %a, %b
> %cmp = icmp sgt i32 %add, %a
> %conv = zext i1 %cmp to i32
> ret i32 %conv
> }
> ```
>
> However, nothing in this IR tells LLVM that `%cmp` may ignore
> cases where `%add` resulted in signed overflow. We need a way to
> communicate this information to LLVM.
>
> This is where the `nsw` and `nuw` flags come into play. `nsw` is
> short for "no signed wrap", `nuw` is short for "no unsigned wrap".
>
> With these, we can come up with a new formulation of `%add`:
> `add nsw i32 %a, %b`.
> LLVM can take this into account when it is optimizing the `%cmp`
> and replace it with: `icmp sgt i32 %b, 0`.
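A small C sketch of why the flag matters, assuming a 32-bit `int`. The helper names (`wrapping_add`, `is_sum_greater_wrapping`, `is_sum_greater_folded`) are made up for illustration and are not LLVM APIs. The first models a plain `add`, which wraps on overflow; the last is the folded form `b > 0`.

```c
#include <limits.h>

/* Models a plain LLVM `add` on i32: the sum wraps modulo 2^32.
 * Assumes a 32-bit int. The sum is computed in long long so that
 * this C code never performs a signed overflow itself. */
static int wrapping_add(int a, int b) {
    const long long modulus = (long long)UINT_MAX + 1;
    long long wide = (long long)a + (long long)b;
    if (wide > INT_MAX) wide -= modulus;
    if (wide < INT_MIN) wide += modulus;
    return (int)wide;
}

/* The comparison as written, evaluated with wrapping arithmetic. */
static int is_sum_greater_wrapping(int a, int b) {
    return wrapping_add(a, b) > a;
}

/* The folded form; only valid if the add is known not to overflow. */
static int is_sum_greater_folded(int a, int b) {
    return b > 0;
}
```

The two versions agree whenever the sum does not overflow. For `a = INT_MAX, b = 1` the wrapping version returns 0 and the folded version returns 1, so the fold is only sound once `nsw` rules that case out.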
>
> ## Differences Between LLVM and C/C++
>
> There are some interesting differences between what C and C++
> specify and how LLVM behaves when performing an operation which
> is not permitted to overflow.
>
> Perhaps chief among them is that evaluating an expression in C++
> or C which overflows is undefined behavior. In LLVM, executing an
> instruction which is marked `nsw` but which overflows in the signed
> sense results in poison. Values which have no relationship with
> poisoned values are not affected by them.
>
> Let us take the following C program into consideration:
> ```
> int calculateImportantResult(int a, int b) {
> int result = 0;
> if (a) {
> result = a + b;
> }
> return result;
> }
> ```
>
> A straightforward lowering to LLVM IR could be:
> ```
> define i32 @calculateImportantResult(i32 %a, i32 %b) {
> entry:
> %tobool = icmp ne i32 %a, 0
> br i1 %tobool, label %if.then, label %if.end
>
> if.then:
> %add = add nsw i32 %a, %b
> br label %if.end
>
> if.end:
> %result = phi i32 [ %add, %if.then ], [ 0, %entry ]
> ret i32 %result
> }
> ```
>
> Moving `%add` to the `%entry` block would be preferable and would
> allow further optimizations:
> ```
> define i32 @calculateImportantResult(i32 %a, i32 %b) {
> entry:
> %tobool = icmp ne i32 %a, 0
> %add = add nsw i32 %a, %b
> %result = select i1 %tobool, i32 %add, i32 0
> ret i32 %result
> }
> ```
>
> In the original code, the calculation of `%add` was control dependent.
> Now, `%add` might result in signed overflow in violation of the
> `nsw` flag.
> Despite this, the program should behave as it did before because
> the poisoned value is masked out by the select. The next section
> will dive into this in greater detail.
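A minimal sketch of the masking argument, using a toy model in C where an i32 is either a number or poison. `ModelValue`, `add_nsw`, `select_value` and the other names are hypothetical, and the model covers only what the quoted text states: an overflowing `add nsw` yields poison, and a `select` with a well-defined condition observes only the operand it picks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: an i32 value that is either a number or poison. */
typedef struct {
    bool is_poison;
    int32_t value;   /* meaningful only when is_poison is false */
} ModelValue;

static ModelValue concrete(int32_t v) {
    ModelValue r = { false, v };
    return r;
}

/* Models `add nsw i32`: a poison operand or signed overflow gives poison. */
static ModelValue add_nsw(ModelValue x, ModelValue y) {
    ModelValue r = { true, 0 };
    if (x.is_poison || y.is_poison) return r;
    int64_t wide = (int64_t)x.value + (int64_t)y.value;
    if (wide < INT32_MIN || wide > INT32_MAX) return r;
    r.is_poison = false;
    r.value = (int32_t)wide;
    return r;
}

/* Models `select` with a non-poison condition: only the chosen
 * operand is observed, so poison in the other operand is masked. */
static ModelValue select_value(bool cond, ModelValue if_true, ModelValue if_false) {
    return cond ? if_true : if_false;
}

/* The hoisted form of the function: the add runs unconditionally. */
static ModelValue calculate_important_result(int32_t a, int32_t b) {
    ModelValue sum = add_nsw(concrete(a), concrete(b));
    return select_value(a != 0, sum, concrete(0));
}
```

In this model `calculate_important_result(0, b)` is always a clean 0. A poison result appears only for inputs such as `(INT32_MAX, 1)`, where the original C program already had undefined behavior.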
>
> ## Computation Involving Poison Values
> Poison in a computation results in poison if the result cannot be
> constrained by its non-poison operands.
>
> Examples of this rule which will result in poison:
> ```
> %add = add i32 %x, %always_poison
> %sub = sub i32 %x, %always_poison
> %xor = xor i32 %x, %always_poison
> %mul = mul i32 %always_poison, 1
> ```
>
> Examples of this rule which do not result in poison:
> ```
> %or = or i32 %always_poison, 2
> %and = and i32 %always_poison, 2
> %mul = mul i32 %always_poison, 0
> ```
>
> In fact, it would be reasonable to optimize `%or` to `2` and
> `%and` to `0`. In this respect, poison is not different from `undef`.
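One way to read the rule above is as a question about reachable results: let the poison operand range over every value of the type and see which results can come out. The sketch below does that exhaustively for 8-bit values instead of i32 so the loop stays small. It assumes that "constrained" means "not every value of the type is reachable", which is an interpretation rather than something the proposal states; all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* A binary operation with the poison operand first. */
typedef uint8_t (*BinaryOp)(uint8_t poison_operand, uint8_t other);

static uint8_t op_add(uint8_t p, uint8_t x) { return (uint8_t)(p + x); }
static uint8_t op_xor(uint8_t p, uint8_t x) { return (uint8_t)(p ^ x); }
static uint8_t op_mul(uint8_t p, uint8_t x) { return (uint8_t)(p * x); }
static uint8_t op_or(uint8_t p, uint8_t x)  { return (uint8_t)(p | x); }
static uint8_t op_and(uint8_t p, uint8_t x) { return (uint8_t)(p & x); }

/* Counts the distinct results reachable when the poison operand
 * is allowed to take every 8-bit value. */
static int count_reachable_results(BinaryOp op, uint8_t other) {
    bool seen[256] = { false };
    int count = 0;
    for (int p = 0; p < 256; ++p) {
        uint8_t r = op((uint8_t)p, other);
        if (!seen[r]) {
            seen[r] = true;
            ++count;
        }
    }
    return count;
}

/* Assumed reading of the rule: the result is poison exactly when the
 * non-poison operand rules out no result at all. */
static bool result_is_poison(BinaryOp op, uint8_t other) {
    return count_reachable_results(op, other) == 256;
}
```

Under this reading `add`, `xor` and `mul ..., 1` stay poison because every result is reachable, while `or ..., 2` (128 reachable results), `and ..., 2` (2 results) and `mul ..., 0` (1 result) do not.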
>
> The following example is only poison if `%cond` is false:
> ```
> %sel = select i1 %cond, i32 2, i32 %always_poison
> ```
>
> ### Is it safe to have poison as a `call` argument?
>
> A `call` instruction may or may not result in poison depending on
> exactly how the callee uses the supplied arguments; it is not
> necessarily the case that `call i32 @someFunction(i32
> %always_poison)` results in poison.
>
> LLVM cannot forbid poison from entering `call` arguments without
> prohibiting an optimization pass from outlining code.
>
> ### Is it safe to store poison to memory?
>
> `store i32 %always_poison, i32* %mem` does not result in undefined
> behavior. A subsequent load instruction like `%load = load i32*
> %mem` will result in `%load` being a poison value.
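A toy model of the store/load behavior described above, assuming memory cells carry a poison marker alongside the stored bits. `ModelMemory`, `model_store`, `model_load` and `round_trip_through_memory` are hypothetical names. Addresses here are always well-defined array indices, so poison addresses are deliberately not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: an i32 value that is either a number or poison. */
typedef struct {
    bool is_poison;
    int32_t value;   /* meaningful only when is_poison is false */
} ModelValue;

#define MODEL_MEMORY_CELLS 4
typedef struct {
    ModelValue cells[MODEL_MEMORY_CELLS];
} ModelMemory;

static ModelValue concrete(int32_t v) {
    ModelValue r = { false, v };
    return r;
}

static ModelValue poison(void) {
    ModelValue r = { true, 0 };
    return r;
}

/* Models `store`: writing poison is permitted and marks the cell. */
static void model_store(ModelMemory *mem, size_t index, ModelValue v) {
    mem->cells[index] = v;
}

/* Models `load`: the loaded value is poison if the cell holds poison. */
static ModelValue model_load(const ModelMemory *mem, size_t index) {
    return mem->cells[index];
}

/* Stores a value to cell 0 of a fresh memory and loads it back. */
static ModelValue round_trip_through_memory(ModelValue v) {
    ModelMemory mem = { { { false, 0 } } };
    model_store(&mem, 0, v);
    return model_load(&mem, 0);
}
```

In this model the store itself never fails; the poison simply reappears at the load, which matches the description of `%load` above.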
>
> ### Is it safe to load or store a poison memory location?
>
> No. Poison works just like `undef` in this respect.
>
> ### Does comparing a poison value result in poison?
>
> It depends. If the result of the comparison cannot be determined
> solely by looking at the other operand, the result is poison.
>
> For example, `icmp ule i32 %always_poison, 4294967295` is `true`,
> not poison.
> However, `icmp ne i32 %always_poison, 7` is poison.
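The two comparisons above can be checked by brute force. This sketch uses 8-bit values so the loop is small, with 255 standing in for 4294967295 as the largest unsigned value; the names `classify_comparison`, `cmp_ule` and `cmp_ne` are hypothetical. It asks whether the comparison gives the same answer for every possible value of the poison operand.

```c
#include <stdbool.h>
#include <stdint.h>

/* A comparison with the poison operand first. */
typedef bool (*Compare)(uint8_t poison_operand, uint8_t other);

static bool cmp_ule(uint8_t p, uint8_t x) { return p <= x; }
static bool cmp_ne(uint8_t p, uint8_t x)  { return p != x; }

typedef enum {
    ALWAYS_FALSE,
    ALWAYS_TRUE,
    DEPENDS_ON_POISON   /* the result would itself be poison */
} ComparisonOutcome;

/* Tries every 8-bit value for the poison operand and reports whether
 * the other operand alone determines the result. */
static ComparisonOutcome classify_comparison(Compare cmp, uint8_t other) {
    bool first = cmp(0, other);
    for (int p = 1; p < 256; ++p) {
        if (cmp((uint8_t)p, other) != first) {
            return DEPENDS_ON_POISON;
        }
    }
    return first ? ALWAYS_TRUE : ALWAYS_FALSE;
}
```

`classify_comparison(cmp_ule, 255)` is `ALWAYS_TRUE`, the 8-bit analogue of the `ule` example, while `classify_comparison(cmp_ne, 7)` is `DEPENDS_ON_POISON`.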
>
> ### What if the condition operand in a `select` is poison?
>
> In the example `%sel = select i1 %always_poison, i1 true, i1 false`,
> `%sel` is either `true` or `false`. Because `%sel` depends on
> `%always_poison`, it too is poison.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu
> http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev