[LLVMdev] RFC: Proposal for Poison Semantics
David Majnemer
david.majnemer at gmail.com
Tue Jan 27 22:29:34 PST 2015
On Tue, Jan 27, 2015 at 10:18 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> > On Jan 27, 2015, at 6:50 PM, David Majnemer <david.majnemer at gmail.com>
> wrote:
> >
> > Hello,
> >
> > What follows is my attempt to describe how poison works. Let me know
> what you think.
> >
> > --
> > David
> >
> >
> > # LLVM Poison Semantics
> >
> > Poison is an LLVM concept which exists solely to enable further
> optimization of LLVM IR. The exact behavior of poison has been, to say the
> least, confusing for users, researchers and engineers working with LLVM.
> >
> > This document hopes to clear up some of the confusion of poison and
> hopefully explain *why* it has its semantics.
> >
> > ## A Quick Introduction to Poison
> >
> > Let's start with a concrete motivating example in C:
> > ```
> > int isSumGreater(int a, int b) {
> > return a + b > a;
> > }
> > ```
> >
> > The C specification permits us to optimize the comparison in
> `isSumGreater` to `b > 0` because signed overflow results in undefined
> behavior. A reasonable translation of `isSumGreater` to LLVM IR could be:
>
> Typo:
> to `b > 0`
> should be
> to `a + b > 0`
> IIUC.
>
> I mention in case you intend to publish this in the doc.
>
I intended what I wrote. We perform this optimization in InstCombine
today:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?revision=226783&view=markup#l3193
>
>
> —
> Mehdi
>
>
>
>
> >
> > ```
> > define i32 @isSumGreater(i32 %a, i32 %b) {
> > entry:
> > %add = add i32 %a, %b
> > %cmp = icmp sgt i32 %add, %a
> > %conv = zext i1 %cmp to i32
> > ret i32 %conv
> > }
> > ```
> >
> > However, LLVM cannot determine that `%cmp` should not consider cases
> where `%add` resulted in signed overflow. We need a way to communicate
> this information to LLVM.
> >
> > This is where the `nsw` and `nuw` flags come into play. `nsw` is short
> for "no signed wrap", `nuw` is short for "no unsigned wrap".
> >
> > With these, we can come up with a new formulation of `%add`: `add i32
> nsw %a, %b`.
> > LLVM can take this into account when it is optimizing the `%cmp` and
> replace it with: `icmp sgt i32 %b, 0`.
> >
> > ## Differences Between LLVM and C/C++
> >
> > There are some interesting differences between what C++ and C specify
> and how LLVM behaves with respect to performing an operationg which is not
> permitted to overflow.
> >
> > Perhaps chief among them is that evaluating an expression in C++ or C
> which results performs an overflow is undefined behavior. In LLVM,
> executing an instruction which is marked `nsw` but which violates signed
> overflow results in poison. Values which have no relationship with poisoned
> values are not effected by them.
> >
> > Let us take the following C program into consideration:
> > ```
> > int calculateImportantResult(int a, int b) {
> > int result = 0;
> > if (a) {
> > result = a + b;
> > }
> > return result;
> > }
> > ```
> >
> > A straightforward lowering to LLVM IR could be:
> > ```
> > define i32 @calculateImportantResult(i32 %a, i32 %b) {
> > entry:
> > %tobool = icmp ne i32 %a, 0
> > br i1 %tobool, label %if.then, label %if.end
> >
> > if.then:
> > %add = add nsw i32 %a, %b
> > br label %if.end
> >
> > if.end:
> > %result = phi i32 [ %add, %if.then ], [ 0, %entry ]
> > ret i32 %result
> > }
> > ```
> >
> > Moving `%add` to the `%entry` block would be preferable and would allow
> further optimizations:
> > ```
> > define i32 @calculateImportantResult(i32 %a, i32 %b) {
> > entry:
> > %tobool = icmp ne i32 %a, 0
> > %add = add nsw i32 %a, %b
> > %result = select i1 %tobool, i32 0, i32 %add
> > ret i32 %result
> > }
> > ```
> >
> > In the original code, the calculation of `%add` was control dependent.
> > Now, `%add` might result in signed overflow in violation of the `nsw`
> flag.
> > Despite this, the program should behave as it did before because the
> poisoned value is masked-out by the select. The next section will dive into
> this in greater detail.
> >
> > # Computation Involving Poison Values
> > Poison in a computation results in poison if the result cannot be
> constrained by its non-poison operands.
> >
> > Examples of this rule which will result in poison:
> > ```
> > %add = add i32 %x, %always_poison
> > %sub = sub i32 %x, %always_poison
> > %xor = xor i32 %x, %always_poison
> > %mul = mul i32 %always_poison, 1
> > ```
> >
> > Examples of this rule which do not result in poison:
> > ```
> > %or = or i32 %always_poison, 2
> > %and = and i32 %always_poison, 2
> > %mul = mul i32 %always_poison, 0
> > ```
> >
> > In fact, it would be reasonable to optimize `%or` to `2` and `%and` to
> `0`. In this respect, poison is not different from `undef`.
> >
> > The following example is only poison if `%cond` is false:
> > ```
> > %sel = select i1 %cond, i32 2, %always_poison
> > ```
> >
> > ### Is it safe to have poison as a `call` argument?
> >
> > A `call` instruction may or may not result in poison depending on
> exactly how the callee uses the supplied arguments, it is not necessarily
> the case that `call i32 @someFunction(i32 %always_poison)` results in
> poison.
> >
> > LLVM cannot forbid poison from entering `call` arguments without
> prohibiting an optimization pass from outlining code.
> >
> > ### Is it safe to store poison to memory?
> >
> > `store i32 %always_poison, i32* %mem` does not result in undefined
> behavior. A subsequent load instruction like `%load = load i32* %mem` will
> result in `%load` being a poison value.
> >
> > ### Is it safe to load or store a poison memory location?
> >
> > No. Poison works just like `undef` in this respect.
> >
> > ### Does comparing a poison value result in poison?
> >
> > It depends. If the comparison couldn't solely be determined by looking
> at the other operand, the result is poison.
> >
> > For example, `icmp i32 ule %always_poison, 4294967295` is `true`, not
> poison.
> > However, `icmp i32 ne %always_poison, 7` is poison.
> >
> > ### What if the condition operand in a `select` is poison?
> >
> > In the example `%sel = select i1 %always_poison, i1 true, false`, `%sel`
> is either `true` or `false`. Because, `%sel` depends on `%always_poison`
> it too is poison.
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150127/e107065a/attachment.html>
More information about the llvm-dev
mailing list