[LLVMdev] Proposal for Poison Semantics

Wed Feb 4 09:20:15 PST 2015

>> Thanks David for putting up this proposal together!
>> I like the idea of having poison values behave more like undef (i.e., per bit, with run-time behavior).
>> One of the problems this proposal solves is speculation of 'a && b' into 'a & b'. Currently this is illegal (even though SimplifyCFG sometimes does it anyway).
>> It also fixes bugs like http://llvm.org/PR20997
>>
>> The proposal doesn't say anything about branching on a poison value. I assume this should stay as the current interpretation -- that such branches should be undefined behavior (since we cannot branch to multiple places at the same time -- even if they would compute the same values; that's already too hard for the compiler to analyze).
> 
> The RFC intended to make branching on poison values OK.  If branching on poison wasn't OK, then we couldn't go from select to br/phi.

Ok, agreed. That case will always be safe.

>> There's another caveat: it *does* seem to fix the problem described by Dan in http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-December/046152.html
>> However, it introduces a potential performance penalty: we won't be able to speculate instructions with undefined behavior whose input may be poison.
>>  
>> For example, take the following code:
>> loop:
>>   %add = add nsw %x, %y
>>   %div = udiv %z, %add
>>   … use %div only in the case that %add does not overflow and is non-zero
>>  
>> We can move the %add outside of the loop, but we cannot move the division. The reason is that if %add overflows, then %add is poison and can therefore take any value (in particular, 0), triggering undefined behavior in %div.  Therefore, we cannot freely move %div unless we can prove that %add will never be 0 or poison.  This sounds hard for the compiler to do, and I guess we'll have some regressions (e.g., LICM has to be more conservative). Nevertheless, I'm all for fixing poison once and for all!
> 
> Believe it or not, I already fixed this bug (PR21412). :)

Cool! :)
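
To make the %add/%div example concrete, here is a rough Python sketch of the semantics being discussed. It is only a toy model, not how LLVM implements anything: the names POISON, UndefinedBehavior, add_nsw, udiv, run_guarded and run_speculated are made up for illustration, the 8-bit width is arbitrary, and the 'safe' flag stands in for whatever check the program uses to guard the use of %div.

```python
POISON = object()  # hypothetical sentinel standing for a poison value


class UndefinedBehavior(Exception):
    """Raised when the modelled program executes undefined behavior."""


def add_nsw(x, y, bits=8):
    """Signed add that yields poison on signed overflow (toy model)."""
    if x is POISON or y is POISON:
        return POISON
    result = x + y
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return result if lo <= result <= hi else POISON


def udiv(z, d, bits=8):
    """Unsigned divide: UB if the divisor is zero or poison (poison may be 0)."""
    mask = (1 << bits) - 1
    if d is POISON or (d & mask) == 0:
        raise UndefinedBehavior("udiv by zero or by poison")
    if z is POISON:
        return POISON
    return (z & mask) // (d & mask)


def run_guarded(x, y, z, safe):
    """The division only executes when the guard says it is safe."""
    add = add_nsw(x, y)      # computing poison is not UB by itself
    if not safe:
        return None
    return udiv(z, add)


def run_speculated(x, y, z, safe):
    """Same computation, but with the division hoisted above the guard."""
    add = add_nsw(x, y)
    div = udiv(z, add)       # now executes even when the result is unused
    return div if safe else None
```

With x = 127 and y = 1 the add overflows, so run_guarded(127, 1, 10, False) just returns None, while run_speculated with the same arguments raises UndefinedBehavior. In this model the %add can be moved freely and the %div cannot.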

>> BTW, would it help if I produced a version of Alive that implements the semantics being proposed?  (with no performance guarantees for this prototype).  The cool thing is that then we can run it through our database of 300+ InstCombine optimizations and see which ones would have to be removed/fixed.
> 
> I think such a thing would be great.  However, there is a problem that the RFC wasn't aware of when it was written:
> 
> consider:
> %S = select %A, %B, undef
> 
> without us knowing anything about %A or %B, we will replace all uses of %S with %B.  This transform would be considered wrong with the RFC in mind.
> 
> If this transform was valid, there could not be any value or value-like property in LLVM with semantics more powerful than undef.  This makes me think that what LLVM *actually* implements is not poison or something like it.
> 
> On the flip side, we could say that this transform is nonsense but I'd rather not pessimize LLVM like that.

Ah, you're saying that poison is strictly stronger than undef, the reason being that poison may lead to UB when used in certain operations.  Nice catch.
But we could have a simple precondition stating that this transformation is correct if %A is not the result of an operation with nsw/nuw/exact flags. Sure, it's not as good as the situation we have today, but the current situation doesn't look very good anyway :)
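
As a sanity check on the select example, here is a small Python sketch that enumerates the cases. It is a toy model under stated assumptions, not LLVM's or Alive's actual checker: undef is collapsed to a single sentinel (in reality it is a set of possible values), select is assumed to yield only the chosen operand, and the names refines, select, fold_is_valid and counterexamples are invented for this sketch.

```python
UNDEF, POISON = "undef", "poison"  # sentinels; any other value is concrete


def refines(src, tgt):
    """True if a value `src` may legally be replaced by `tgt` in this model."""
    if src == POISON:
        return True              # poison may be replaced by anything
    if src == UNDEF:
        return tgt != POISON     # undef may become any non-poison value
    return src == tgt            # concrete values must be preserved


def select(cond, a, b):
    """Assumed semantics: the result is exactly the chosen operand."""
    return a if cond else b


def fold_is_valid(cond, b):
    """Check the fold 'select cond, b, undef' -> 'b' for one input."""
    return refines(select(cond, b, UNDEF), b)


# Enumerate every combination of condition and %B and collect the failures.
counterexamples = [
    (cond, b)
    for cond in (True, False)
    for b in (0, 1, UNDEF, POISON)
    if not fold_is_valid(cond, b)
]
```

In this model the only failing input is a false condition with a poison %B: the original select produces undef, and the fold replaces it with something strictly stronger than undef.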

I have a question though: When does poison become UB? On external calls and volatile stores only?  Any other visible side-effecting operations?  (at least those two have to be UB, right?)

Nuno