[PATCH] D143074: [LangRef] improve documentation of SNaN in the default FP environment

Thu Feb 9 13:08:59 PST 2023

jyknight added a comment.

In D143074#4115297 <https://reviews.llvm.org/D143074#4115297>, @kpn wrote:

> How many of these sNaN vs qNaN cases that matter are there? I've seen pow() mentioned in this ticket. What are the other cases?

There's `pow`, `hypot`, `fmin`, and `fmax`: each is expected to return a qNaN when passed an sNaN input, but may return a non-NaN value for a qNaN input, depending on the other argument. I'm not 100% sure that's a complete list, but I think so.
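As a rough illustration of the qNaN half of that statement, here is a small Python sketch using CPython's `math.pow` and `math.hypot`, which apply the usual C special cases to these inputs. It covers quiet NaNs only: `float("nan")` is assumed to be a qNaN (as it is on mainstream platforms), and the helper name `qnan_special_cases` is just something made up for this example.

```python
import math

# Assumption: float("nan") is a quiet NaN, as on mainstream platforms.
qnan = float("nan")

def qnan_special_cases():
    """Collect calls where a qNaN input does or does not force a NaN result."""
    return {
        # pow(x, 0.0) is 1.0 for every x, including a quiet NaN.
        "pow(qnan, 0.0)": math.pow(qnan, 0.0),
        # pow(1.0, y) is 1.0 for every y, including a quiet NaN.
        "pow(1.0, qnan)": math.pow(1.0, qnan),
        # hypot with an infinite argument is infinite even if the other is NaN.
        "hypot(inf, qnan)": math.hypot(math.inf, qnan),
        # For contrast: here the NaN does propagate.
        "pow(qnan, 1.0)": math.pow(qnan, 1.0),
    }

results = qnan_special_cases()
```

The first three entries are non-NaN; only the last one is a NaN. The sNaN side can't be shown this way, since Python offers no portable way to produce a signaling NaN float.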

In D143074#4114910 <https://reviews.llvm.org/D143074#4114910>, @RalfJung wrote:

> OTOH, even programmers that do not know about sNaN vs qNaN might be very surprised if `pow(x, 0.0)` can return NaN despite the docs saying it won't... so that cursed behavior of `pow` might be a forcing function for ensuring LLVM will never introduce new sNaN.

Yes. We do not want to break the semantics of a correct program which does not use any sNaNs. We support qNaN, so I don't see how `pow(expr-resulting-in-qNaN, 0.0) -> qNaN` (instead of 1.0) could be considered anything other than a miscompile -- unless the user has done something forbidden like using an sNaN in the computation of `expr-resulting-in-qNaN`.

> So the alternative is to say that if any input is an sNaN, then it may be treated as if it was a qNaN and output NaN have arbitrary quietness, but if there are no sNaN inputs then all output NaN will be quiet.

This sounds right. We may:

1. Treat an sNaN input value as if it had been a qNaN (and thus e.g. return 1.0 instead of a qNaN from `pow 1.0, sNaN`), or
2. Pass an sNaN through an operation without quieting it first (and thus e.g. return an sNaN instead of a qNaN from `fadd sNaN, 1.0`).
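To make the second option concrete, here is a minimal bit-level sketch in Python. It assumes the common binary64 convention in which the top fraction bit set means "quiet", and it assumes, purely for illustration, that an operation propagates the payload of its NaN input. All names (`is_snan`, `quiet`, `allowed_nan_results`) are hypothetical and this is not how LLVM models anything internally.

```python
EXP_MASK = 0x7FF0_0000_0000_0000   # binary64 exponent field (all ones for NaN/inf)
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF  # binary64 fraction field (NaN payload)
QUIET_BIT = 0x0008_0000_0000_0000  # top fraction bit; set = quiet (assumed convention)

SNAN = 0x7FF0_0000_0000_0001  # an example signaling NaN bit pattern
QNAN = 0x7FF8_0000_0000_0000  # an example quiet NaN bit pattern

def is_nan(bits):
    """True if the 64-bit pattern encodes a NaN (exponent all ones, fraction nonzero)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def is_snan(bits):
    """True if the pattern is a NaN with the quiet bit clear."""
    return is_nan(bits) and (bits & QUIET_BIT) == 0

def quiet(bits):
    """Set the quiet bit on a NaN; leave non-NaN patterns unchanged."""
    return bits | QUIET_BIT if is_nan(bits) else bits

def allowed_nan_results(input_bits):
    """NaN outputs permitted for an op that propagates this NaN input.

    An sNaN input may come out unchanged or quieted; a qNaN input must stay quiet.
    """
    if is_snan(input_bits):
        return {input_bits, quiet(input_bits)}
    return {quiet(input_bits)}
```

With these definitions, `allowed_nan_results(SNAN)` contains both the original sNaN and its quieted form, while `allowed_nan_results(QNAN)` contains only the qNaN.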

The part I am unsure of is whether it is possible to put such a limited bound on the undefinedness. I don't feel like I have a good enough understanding/intuition of this sort of thing to do anything other than express a worry, so I hope someone else can clarify this aspect for me, and either reassure me that it's an unfounded worry, or confirm it.

My worry is: does having such an indeterminate output value, combined with other optimization passes, trigger unbounded UB in the system as a whole? For example, because we can duplicate and coalesce FP math instructions, and make a different optimization decision for each duplicated instance separately, a single `fadd` with an sNaN input could appear to be a qNaN to some of its uses and an sNaN to others. As discussed, that difference can then change the finite results of later FP computations too. Could that cause problems in downstream optimization passes?
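A toy Python model of that scenario, just to pin down what I mean. NaNs are represented by the tags `"qnan"` and `"snan"` rather than real floats, and every function here is hypothetical: `toy_fadd_snan` stands for one compiled copy of `fadd sNaN, 1.0`, and `toy_pow_zero_exponent` stands for `pow(x, 0.0)` with the expected behavior described above (1.0 for a qNaN, a NaN for an sNaN).

```python
import math

def toy_fadd_snan(quiet_output):
    """One compiled copy of `fadd sNaN, 1.0`; the optimizer picks the quietness."""
    return "qnan" if quiet_output else "snan"

def toy_pow_zero_exponent(nan_kind):
    """Toy pow(x, 0.0) for a NaN x: 1.0 for a qNaN input, NaN for an sNaN input."""
    if nan_kind == "qnan":
        return 1.0
    if nan_kind == "snan":
        return math.nan
    raise ValueError(f"unknown NaN kind: {nan_kind!r}")

def results_after_duplication(choices):
    """Feed each duplicated fadd copy, with its own quietness choice, into pow."""
    return [toy_pow_zero_exponent(toy_fadd_snan(c)) for c in choices]

# The same source-level fadd, duplicated, with a different choice per copy.
results = results_after_duplication([True, False])
```

In this model the two uses of what was a single source-level value see 1.0 and NaN respectively, which is the kind of inconsistency I'm asking about.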

https://reviews.llvm.org/D143074