[LLVMdev] RFC: Proposal to Remove Poison

Sun Feb 8 08:38:47 PST 2015

----- Original Message -----
> From: "David Majnemer" <david.majnemer at gmail.com>
> To: llvmdev at cs.uiuc.edu
> Cc: "Nuno Lopes" <nuno.lopes at ist.utl.pt>, "John Regehr" <regehr at cs.utah.edu>
> Sent: Sunday, February 8, 2015 4:23:15 AM
> Subject: [LLVMdev] RFC: Proposal to Remove Poison
> 
> Hello,
> 
> 
> I'd like to offer an alternative solution to the "poison problem":
> remove it.
> 
> 
> What follows is rather informal. I'd happily write up a nicer
> document if this RFC stands up to scrutiny.
> 
> 
> The idea was born from two observations:
> - undef was introduced to model a load of uninitialized memory, a
> form of undefined behavior.
> - poison was introduced to model integer overflow, another form of
> undefined behavior.
> 
> 
> I find it awkward that one form of uninitialized behavior is more or
> less powerful than another.
> 
> 
> We rely on the following properties for undef:
> - undef is permitted to be an arbitrary bit-pattern.
> - Instructions may or may not be value dependent on their undef
> operands.
> 
> 
> 
> We rely on the following properties for poison:
> - Instructions which have poison operands are capable of providing
> results which are not consistent with a two's complement operand.
> 
> 
> Here is a case where we believed distinguishing between poison and
> undef is essential:
> %add = add nsw i32 %a, %b
> %cmp = icmp sgt i32 %add, %a
> 
> 
> If %a is INT_MAX and %b is 1, then %add results in overflow.
> 
> If overflow results in undef, then %cmp would be equivalent to:
> %cmp = icmp sgt i32 undef, INT_MIN
> 
> 
> There is no bit-pattern we could substitute for undef which would
> make %cmp true, this means that we cannot optimize %cmp to 'icmp sgt
> i32 %b, 0'.
> 
> 
> Poison was created to allow us to produce whatever value we'd like
> for %cmp if %add resulted in overflow.
> 
> 
> What I would like to do is to remove poison and replace it with undef
> while still optimizing comparison instructions. The quick-and-dirty
> way of stating this is: an icmp with an undef operand produces undef
> regardless of its other operand.

Currently, we might replace:
  %cmp = icmp sgt i32 undef, INT_MAX -> i1 false

According to this proposal, we could replace:
 %cmp = icmp sgt i32 undef, INT_MAX -> i1 undef

While replacing it with false is still allowed, so is replacing it with true. I'm not sure this makes sense.

> 
> 
> I believe this would allow us to exploit and optimize overflow in
> arithmetic flowing into comparison operations.
> 
> 
> Could this be problematic?
> 
> 
> Whether we intended to or not, LLVM's implementation of undef's
> semantics already allows us to say that a memory location is not
> equivalent to any possible bit-pattern.
> 
> 
> 
> Consider the following program:
> 
> define i1 @f() {
> %mem = alloca i1
> %load = load i1* %mem
> %cmp = icmp eq i1 %load, false
> ret i1 %cmp
> }
> 
> 
> Because it is possible for %load's undef to be either true or false,
> %cmp must be equivalent to undef as well. This is completely
> consistent with the above rules.
> 
> 
> The following program is a little more interesting:
> 
> define i1 @g() {
> %mem = alloca i1
> %load = load i1* %mem
> %cmp0 = icmp eq i1 %load, 0
> %cmp1 = icmp eq i1 %load, 1
> %and = xor i1 %cmp0, %cmp1
> ret i1 %and
> }
> 
> 
> The intent of this program is to compare a memory location against
> all possible values the location might have.
> 
> If we ran LLVM's InstCombine pass then %and would have been replaced
> with the "expected" result of true.
> 
> 
> 
> If we instead ran SROA over @g, we would be left with:
> 
> 
> %cmp0 = icmp eq i1 undef, false
> %cmp1 = icmp eq i1 undef, true
> %and = xor i1 %cmp0, %cmp1
> 
> 
> Now that %cmp0 and %cmp1 have different undef values, %and is now
> undef.
> This result is sufficient to say that the contents of %mem are
> neither true nor false!

In the past, this has been explained to me in the following way: undef values have a definitive bit pattern per use, consistent with all relevant constraints, but may have a different pattern at each use. Thus, the reason that the SROA transformation is allowable is that the undef is 'used' twice, once in each comparison, and is otherwise unconstrained, and so make take on a different value (0 or 1 here) at each comparison.

That having been said, this is also somewhat loose, because IR-level operations have no atomicity. An i32 add could be replaced by i16 operations (that would 'use' the undef input multiple times).

 -Hal

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory