[llvm-dev] RFC: Killing undef and spreading poison

Fri Oct 21 13:53:47 PDT 2016

Hi Dan,

Thanks a lot for the feedback and apologies for the delay!

>> br poison -> UB
> What impact does this have on API contracts? Some library functions might currently tolerate being passed an uninitialized value (at the LLVM level; C's semantics are different). However, with the new poison, if the implementation of a function happens to do a branch on the input, the code has UB if passed an uninitialized value.

Right, I guess any LLVM-IR-level API would have to be upgraded to take the new semantics into account.
I'm not sure that 'br poison -> UB' is new rather than a clarification, but it's true that there were multiple interpretations for 'br poison'.
I'm curious, btw: did you have any concrete API in mind?

>> and 0, poison -> poison
>
> This is similar to the existing poison, and it raises similar questions. Consider the following code:
> 
>   %x = call i32 @foo()
>   %y = or i32 %x, 1
>   %c = call i1 @some_condition()
>   br i1 %c, label %true, label %false
> true:
>   %d = udiv i32 -1, %y
> false:
>   %z = phi i32 [ %true, %d ], [ %false, 0 ]
> 
> Is it safe to turn the phi into a select? (ignoring profitability)
> 
> KnownBits can tell us that %y always has at least one bit set, so it's never zero, so the udiv can never divide by zero, so it's safe to speculate, so the transform is safe. However, if %x is poison and %c is 0, the original program has no UB, and the transformed program does. It would then execute the udiv unconditionally, with a divisor of poison.
> 
> Inserting a freeze could fix this. However, where does the freeze go? %y is an input to the if-conversion pattern as SimpifyCFG knows it, however freezing %y doesn't help. Instead, the thing that needs to be frozen is %x, because that was the input to the KnownBits analysis that said it was safe. It would seem that either KnownBits would need to record the values it looked at, or there'd have to be a way to recompute the set on demand, so that these values could be frozen if needed. These might be manageable for KnownBits, though more complicated analyses (perhaps an Andersen's-style points-to system?) could be more awkward.

Right, that's a very good point. We already have this problem today as well, though.
We can translate this into a select by using freeze, as you say (that's exactly for these cases why we are proposing to introduce freeze).
What we need to settle on, and we didn't touch this on our proposal, is how LLVM's analyses API will look like. Most (if not all) LLVM analyses today report a result that doesn't exclude that the value might be poison.  That's because if you report %v to be non-null and it's actually poison it's fine since poison can be restricted to be non-null.  We will need to introduce a isNotPoison() analysis and a ensureIsNonPoison() utility (that needs to be used by all transformations doing speculation).
I would freeze %y. What's the argument against?

> Auto-vectorization, converting adjacent scalar memory accesses into vector accesses, or into wider combined scalar accesses is a concern. If widening a load causes it to read a poison that it didn't previously read, and if the value is subsequently stored, this would put poison in more individually addressable places than in the original program, so it could inject UB into a program that didn't previously have it. This is related to the memcpy topic that's been raised.

The problem with widening Is already there today. We are not really changing the semantics of poison, just creating more opportunities for it to appear.
We have a solution for widening, which is to use vectors. Since we define poison at value-level, we can widen, say, a "load i32" into "load <2 x i32>" (but not to "load i64").  One may argue that this representation is even better, since it actually exposes the fact that we are loading multiple, independent values.  But we need to make sure the rest of LLVM pipeline is fine with increased usage of vectors throughout.

> It's also related to the bitfield assignment topic. The solution where the first initialization writes a frozen poison over the adjacent bytes, thereby protecting subsequent assignments from unfrozen poison, is clever. However, it also raises some questions: Is the width at which any given bitfield field is initialized to be an ABI-exposed detail (all front-ends must emit the same width accesses for a given bitfield)? And, does this constrain the optimizer's ability to shrink such stores?

That's an interesting point; didn't think about it before.  You're right that all frontends manipulating a given bitfield would have to agree on the access size (at IR level; at assembly level it shouldn't matter).
Regarding store shrinking, I feel it's something better suited to SDAG/MI level.  There if we stick with undef, we can support shrinking.  If we introduce poison up to (and including) MI, then things become tricky.

Thanks,
Nuno