[PATCH] D28928: [ValueTracking] Implement SignBitMustBeZero correctly for sqrt.

Wed Jan 25 06:27:07 PST 2017

hfinkel added inline comments.

================
Comment at: llvm/lib/Analysis/ValueTracking.cpp:2673
+      return CI->hasNoNaNs() && (CI->hasNoSignedZeros() ||
+                                 CannotBeNegativeZero(CI->getOperand(0), TLI));
+
----------------
jlebar wrote:
> efriedma wrote:
> > You need to prove that the result isn't a NaN and isn't -0.  Using nsz to prove the result isn't -0 is suspicious.  For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0.  Actually, sqrt(0) could produce -0 under that definition.
> > 
> > Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt().
> > 
> > I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about.  We can split floating-point numbers into categories: +inf, -inf, +0, -0, positive finite, negative finite, negative nan, and positive nan.  CannotBeOrderedLessThanZero needs to prove that the input is not in the category -inf or the category negative finite.  SignBitMustBeZero needs to prove that the input is not in one of the categories -0, -inf, negative finite, or negative nan.  To transform nnan fabs(x) -> x, we need to prove that the input is not in one of the categories -0, -inf, or negative finite.  To transform nnan nsz fabs(x) -> x, we need to prove that the input is not in one of the categories -inf, or negative finite.  To transform fast fabs(x) -> x, we need to prove the input is not in the category negative finite.
> > 
> > Does this make sense, or is it completely crazy? :)
> > For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0. Actually, sqrt(0) could produce -0 under that definition.
> 
> What I understand it means by "not significant" is that anywhere we have an nsz operation that produces or consumes a zero, we can assume that zero is positive or negative zero, at our convenience.
> 
> So I agree `nsz sqrt(+0)` could return `-0` under this definition, but I think the point is that we get to choose.
> 
> Reading it the opposite way -- that `nsz sqrt(+0)` might return `-0` and we have to account for this as a possibility -- would mean that `nsz` *limits* our optimization opportunities, which seems clearly not what it's intended to do.
> 
> > Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt().
> 
> I think this would be an additional correct optimization, but per above I don't think that what we have here is wrong.
> 
> > I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about. [...] Does this make sense, or is it completely crazy? :)
> 
> Makes a ton of sense!  Just a few things to add, which we may already agree on:
> 
> - Per above, if the source of x is nsz or fast, we should be able to treat -0 and +0 as equivalent.  If we can prove that x is either -0 or +0, then we can behave as though x is in *both* the -0 and +0 categories.
>   
>   (Similarly, if the source of x is nnan or fast, we can assume that x is not +/- nan, and if the source of x is ninf or fast, we can assume that x is not +/- inf.)
> 
> - If x is produced by some ieee754 operation other than copy, negate, abs, and copySign, and may be nan, we don't know whether it's positive or negative nan (unless we know something about the target).
> Reading it the opposite way -- that nsz sqrt(+0) might return -0 and we have to account for this as a possibility -- would mean that nsz *limits* our optimization opportunities, which seems clearly not what it's intended to do.

I agree. If the user passes -fno-signed-zeros, then they're saying that we can ignore the possibility of signed zeros when determining optimization legality. They might still exist, and the fact that they still exist might still be observable, but the user is promising that they don't care that this might be observed. One key question here is: Are we allowed to observe signed zeros inconsistently? If x == -0, for example, can `if (x != -0) print(x);` print -0? I think the answer must be yes. This code cannot be sensibly compiled under -fno-signed-zeros.

https://reviews.llvm.org/D28928