[llvm-dev] RFC: Killing undef and spreading poison

Fri Oct 21 16:42:50 PDT 2016

Hi Krzysztof,

Krzysztof Parzyszek wrote:
 > On 10/20/2016 4:05 PM, Michael Kuperstein wrote:
 >> As Sanjoy said, though, it should always be legal to optimize all uses
 >> of different "freeze(%x)" values to use the same def - this is
 >> equivalent to choosing the same value for all freezes. It's just not
 >> necessary to do so.
 >
 > The problem is that once such choice occurs, it takes effect for all
 > occurrences of of that value. For example:
 >
 > %x = freeze(provably-poison)
 > %y = freeze(provably-poison)
 > ...
 >
 > if (%x == %y) {
 > // lots of code
 > }
 >
 > ...
 >
 > if (%x < %y) {
 > // lots of code
 > }
 >
 > ...
 >
 > if (2*%x < %y*(%y+1)) {
 > // even more code
 > }

I don't really see the complexity here.  If you want to fold "%x ==
%y" to true, then you basically have to %y->rauw(%x).  This could lead
to lost opportunities down the line, but that's already the case with
transforms we do today e.g.

   if (x == y)
     f(x)

=>

   if (x == y)
     f(y)

The former would have been better for optimization if x is undef and y
isn't.  We just take a local best guess and go ahead with it.

However, there are cases where the consistency requirement will
prevent us from simplifying away freezes:

   %t = freeze <poison>
   store volatile %t to loc
   %x = load volatile other_loc
   %xor = xor %x, %t

We can't fold %xor to 0 because that would require us to %t->RAUW(%x)
which would break SSA.

 > If we want to fold the first comparison to save on code size, what
 > effects does it have on the other comparisons? To remain consistent, the
 > compiler will need to pick concrete values for %x and %y and fold the

Yes, it needs to take steps to avoid inconsistency down the line.

 > other two comparisons according to what the corresponding conditions
 > turn out to be. If we are ambitious, we could solve the system of
 > inequalities and pick values that eliminate all of the conditional code,
 > but that isn't always feasible. Of course, we could fold the first one
 > and then leave the other two alone, but after the first folding, we
 > would have to know that we are no longer allowed to arbitrarily fold the
 > remaining ones.

Agreed.

 > If the "strong UB" corresponds to cases that could cause the hardware to
 > crash (unaligned load or indirect branch to an undefined address), then
 > maybe we should have "unspecified behavior" as a counterpart of
 > "unspecified value"? There is no reason to treat a conditional branch
 > (if-then-else) on an unspecified condition in the same way as an
 > indirect branch to an unspecified address. The freeze kind of sort of

However, we need to justify control flow based correlated value
analysis.  That is:

   if (x != 0) {
     if (...)
       1/x; // safe to speculate
   }

 > does that, but it carries the "unspecifiness" forward allowing it to
 > accumulate into a complex networks of dependencies.

-- Sanjoy