[llvm-dev] RFC: Killing undef and spreading poison

Fri Oct 21 16:18:15 PDT 2016

On 10/20/2016 4:05 PM, Michael Kuperstein wrote:
> As Sanjoy said, though, it should always be legal to optimize all uses
> of different "freeze(%x)" values to use the same def - this is
> equivalent to choosing the same value for all freezes. It's just not
> necessary to do so.

The problem is that once such choice occurs, it takes effect for all 
occurrences of of that value.  For example:

   %x = freeze(provably-poison)
   %y = freeze(provably-poison)
   ...

   if (%x == %y) {
     // lots of code
   }

   ...

   if (%x < %y) {
     // lots of code
   }

   ...

   if (2*%x < %y*(%y+1)) {
     // even more code
   }

If we want to fold the first comparison to save on code size, what 
effects does it have on the other comparisons?  To remain consistent, 
the compiler will need to pick concrete values for %x and %y and fold 
the other two comparisons according to what the corresponding conditions 
turn out to be.  If we are ambitious, we could solve the system of 
inequalities and pick values that eliminate all of the conditional code, 
but that isn't always feasible. Of course, we could fold the first one 
and then leave the other two alone, but after the first folding, we 
would have to know that we are no longer allowed to arbitrarily fold the 
remaining ones.

If the "strong UB" corresponds to cases that could cause the hardware to 
crash (unaligned load or indirect branch to an undefined address), then 
maybe we should have "unspecified behavior" as a counterpart of 
"unspecified value"? There is no reason to treat a conditional branch 
(if-then-else) on an unspecified condition in the same way as an 
indirect branch to an unspecified address. The freeze kind of sort of 
does that, but it carries the "unspecifiness" forward allowing it to 
accumulate into a complex networks of dependencies.

-Krzysztof