[LLVMdev] Proposal for Poison Semantics

Wed Feb 4 09:09:36 PST 2015

Hi all, sorry to come late to the party. This is a great thread. I wanted 
to toss out an example I ran across the other day that seems somewhat 
interesting. We start with this C program that executes no UB:

int foo(int x1, int x2) {
   return
     ((x2 > 0) && (x1 > (2147483647 - x2))) ||
     (x1 && (x2 < 0));
}

int main(void) {
   printf("%x\n", foo(1, -770027279));
   return 0;
}

Clang+LLVM turns foo() into this:

define i32 @foo(i32 %x1, i32 %x2) #0 {
   %1 = icmp sgt i32 %x2, 0
   %2 = sub nsw i32 2147483647, %x2
   %3 = icmp slt i32 %2, %x1
   %or.cond = and i1 %1, %3
   br i1 %or.cond, label %6, label %4

; <label>:4                                       ; preds = %0
   %5 = icmp slt i32 %x2, 0
   %not. = icmp ne i32 %x1, 0
   %. = and i1 %5, %not.
   br label %6

; <label>:6                                       ; preds = %4, %0
   %7 = phi i1 [ true, %0 ], [ %., %4 ]
   %8 = zext i1 %7 to i32
   ret i32 %8
}

In the semantics described by the LangRef, my interpretation of the
execution is this:

%1 = false
%2 = poison
%3 = poison
%or.cond = poison
... the br goes either way or both ways or something ...
%5 = true
%not. = true
%. = true
%7 = phi true, true

So all is good despite the non-deterministic execution.

In the proposed new model, we don't have to branch on poison since
%or.cond is completely determined by its false argument and is therefore
false. Does that all sound right?

John