[PATCH] D143074: [LangRef] improve documentation of SNaN in the default FP environment

Mon Feb 6 07:48:51 PST 2023

jyknight added a comment.

In D143074#4105090 <https://reviews.llvm.org/D143074#4105090>, @RalfJung wrote:

>> we cannot allow LLVM to spuriously introduce sNaNs when the original code did not use any.
>
> That is one of the things being discussed here, right? Right now at least on old MIPS LLVM does *not* satisfy this requirement (apfloat will produce sNaN for that platform). The LangRef says "No floating-point exception state is maintained in this environment" which I read as "we don't care at all about the sNaN vs qNaN distinction and will just do whatever".

Then LLVM is broken on old MIPS. It is simply not okay to spuriously introduce sNaNs when the original program contained none. Doing so produces incorrect results even though the user's code violated no assumptions. (I have no idea whether anyone still cares about floating-point on old MIPS, but if someone does, I think it will be up to them to fix this.)
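
To illustrate how the same NaN bit pattern can be quiet on most targets yet signaling on old MIPS, here is a rough Python sketch that works only on raw binary64 bit patterns. The names `classify_nan` and `legacy_mips` are hypothetical, and the sketch assumes that the legacy MIPS encoding inverts the meaning of the top fraction bit relative to the usual convention. It does not model APFloat itself.

```python
# Bit-level classification of IEEE 754 binary64 NaN patterns (illustrative only).
EXP_MASK = 0x7FF0_0000_0000_0000   # all exponent bits
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF  # all fraction bits
TOP_FRAC_BIT = 0x0008_0000_0000_0000  # most significant fraction bit


def classify_nan(bits, legacy_mips=False):
    """Return 'qNaN', 'sNaN', or 'not NaN' for a 64-bit pattern."""
    # A NaN has an all-ones exponent and a nonzero fraction.
    if (bits & EXP_MASK) != EXP_MASK or (bits & FRAC_MASK) == 0:
        return "not NaN"
    top_set = bool(bits & TOP_FRAC_BIT)
    # Assumption: on legacy MIPS a set top fraction bit means "signaling",
    # the opposite of the usual convention.
    is_quiet = (not top_set) if legacy_mips else top_set
    return "qNaN" if is_quiet else "sNaN"


DEFAULT_QNAN = 0x7FF8_0000_0000_0000  # the usual "default NaN" pattern

print(classify_nan(DEFAULT_QNAN))                    # qNaN
print(classify_nan(DEFAULT_QNAN, legacy_mips=True))  # sNaN
```

Under that assumption, a constant folder that always emits the usual default NaN pattern would hand a signaling NaN to a legacy-MIPS target, which would be consistent with the behavior described above.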

>> As a general rule, when an operation gets an sNaN as input, it raises an invalid exception immediately. The default behavior is to set the invalid bit in the status flags AND to trigger an immediate return of a qNaN -- even when a qNaN input value would've resulted in some other output.
>
> I thought people were saying LLVM cares about none of that. Probably that was different people. ;)

Above I was describing the general rule per IEEE semantics. As it relates to LLVM, we do care to get the complete semantics correct in strictfp modes. In non-strictfp mode, we care only about a subset of the semantics -- and exactly what that subset contains is what we're trying to refine the definition of here.

> Right now it is the case that `pow(1.0 * sNaN, 0.0)` non-deterministically returns a qNaN or 1.0 (depending on whether the multiplication returned an sNaN or qNaN, and assuming I understood correctly that `pow` on an sNaN returns a qNaN). That also sounds pretty bad?

Correct, and that is why my recommendation is: "Floating-point math operations assume that all NaNs are quiet." If you violate that by using an sNaN, you will get potentially unexpected results. (But I'm not sure whether using an sNaN has to be full-on UB/poison in order to explain the optimizations, or whether we can make the constraint violation produce incorrect results that are bounded in some manner.)
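
The non-determinism in the `pow(1.0 * sNaN, 0.0)` example can be sketched as a toy model in Python. This is not LLVM or libm code: the string tags `"sNaN"` and `"qNaN"` and the helper names are hypothetical, and the model simply assumes that the multiplication may either quiet the sNaN or pass it through unchanged, and that `pow` with a zero exponent returns a qNaN for an sNaN input and 1.0 otherwise, as described in this thread.

```python
def fmul_by_one_outcomes(x):
    """Possible results of `1.0 * x` in the toy model."""
    if x == "sNaN":
        # Either the operation is performed (quieting the NaN) or it is
        # folded away, leaving the sNaN untouched.
        return {"sNaN", "qNaN"}
    return {x}


def pow_zero_exponent(x):
    """pow(x, 0.0) in the toy model: an sNaN yields a qNaN, anything else 1.0."""
    return "qNaN" if x == "sNaN" else 1.0


def outcomes(x):
    """All results the model permits for pow(1.0 * x, 0.0)."""
    return {pow_zero_exponent(v) for v in fmul_by_one_outcomes(x)}


print(outcomes("sNaN"))  # two possible results: a qNaN or 1.0
print(outcomes("qNaN"))  # a single possible result: 1.0
```

With a qNaN input the model has exactly one outcome, which is the point of assuming that all NaNs are quiet.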

> I would expect that LLVM pow treats sNaN like qNaN, because otherwise these sNaN-returning arithmetic operations could have rather surprising long-distance effects. If that is *not* the case then for sure just saying "anything may return an sNaN whenever it returns any NaN" does not work.

Yes, precisely, that's what I've been trying to say! But LLVM cannot redefine `pow`.

> (I honestly find the behavior of `pow` that you describe extremely surprising. I would not expect an operation to have such wildly different behavior based on whether the input is an sNaN or qNaN.

The mental model you need is to remember that an sNaN triggers an immediate exception for any operation. The default exception handler will abort the operation, and return qNaN immediately. The actual operation doesn't even matter: the exception occurs, just from looking at the arguments. (If you have traps enabled, this seems like it could actually be useful. With traps disabled, I'm not sure there's really much point.)

qNaN is different: it doesn't invoke an immediate exception handler; instead, it propagates through the operation. That typically results in a qNaN output, but not always. I believe the rationale for why `pow(qNaN, 0)` does NOT return qNaN is that every possible finite or infinite value which could be substituted there would produce the same answer: 1.0. Therefore, the result is fully defined, even though the input is not. (Just for clarity, this property does not hold for multiplication by 0, because `fmul 0, inf` is not 0.)
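
A quick way to check this rationale is with Python's `math.pow`, which follows the C `pow` conventions for these cases. This is only a sanity check under the assumption that `float("nan")` is a quiet NaN on the host platform. The sample values are arbitrary.

```python
import math

nan = float("nan")  # assumed to be a quiet NaN on the host platform
samples = [0.0, -0.0, 1.0, -2.5, 1e308, math.inf, -math.inf]

# Every finite or infinite base gives 1.0 for a zero exponent ...
all_one = all(math.pow(x, 0.0) == 1.0 for x in samples)

# ... so pow(qNaN, 0) is defined to be 1.0 as well.
pow_nan = math.pow(nan, 0.0)

# Multiplication by zero has no such property: the result depends on the
# other operand, since 0 * 1 is 0 but 0 * inf is NaN.
zero_times_one = 0.0 * 1.0
zero_times_inf = 0.0 * math.inf

print(all_one, pow_nan, zero_times_one, math.isnan(zero_times_inf))
```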

In some ways, I'd say qNaN operates similarly to LLVM-IR "undef" and sNaN similarly to LLVM-IR "poison". (But please don't take that analogy too far; it's certainly not fully accurate!)

https://reviews.llvm.org/D143074