[llvm] [LangRef] Clarify specification for float min/max operations (PR #172012)
via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 21 20:11:03 PST 2025
valadaptive wrote:
> I'm talking about this point:
>
> > Matt has some use case for this in mind in math library optimization (if I understood right, related to the sign bit of the result of maxnum(x, 0.0), which may be set if x == -0.0 without the ordered zero semantics).
I certainly agree that we shouldn't do such optimizations until the signed-zero behavior is actually implemented, and that they should also not be applied if the `nsz` flag is passed.
I don't see why that means we shouldn't expose signed-zero ordering *at all*.
Matt's proposed optimization is actually illustrative here. I don't know in particular what we can optimize by statically knowing the sign bit, but it applies when one operand is statically known to be +0.0.
In that case, having the nsz flag is actually a good thing. Recall that the x86 implementation uses `maxps`, which behaves like `y < x ? x : y`. Other fallback implementations, for architectures without a floating-point min/max operation, will do the same thing but with an explicit compare+select.
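As a rough illustration of that compare+select behavior, here is a small Python model. The names `select_max` and `sign_bit` are made up for this sketch; it only mirrors the `y < x ? x : y` expression quoted above and is not how any backend is actually implemented.

```python
import math

def select_max(x: float, y: float) -> float:
    """Model of `y < x ? x : y`: yields y unless `y < x` is true."""
    return x if y < x else y

def sign_bit(v: float) -> bool:
    """True if the IEEE sign bit of v is set (works for zeros and NaN)."""
    return math.copysign(1.0, v) < 0.0

# +0.0 and -0.0 compare equal, so the select always falls through to y:
# the sign of a zero result depends purely on operand order.
zero_result_a = select_max(-0.0, 0.0)   # y is +0.0
zero_result_b = select_max(0.0, -0.0)   # y is -0.0

# Comparisons with NaN are false, so the second operand is returned.
nan_first = select_max(math.nan, 1.0)
nan_second = select_max(1.0, math.nan)
```

In this model the select does not order signed zeroes by itself; which zero comes out depends only on which operand sits in the `y` position.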
In such a scenario, we can actually *refine* `maxnum nsz (x, 0.0)` to `maxnum(x, 0.0)` without any performance loss at all. The operation lowers to `0.0 < x ? x : 0.0`, and it's easy to see that the result never has its sign bit set: it is either an `x` strictly greater than zero or the constant `+0.0`. So we don't need to fix up the sign bit afterwards. The x86 backend already performs this optimization.
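The sign-bit claim for the constant-zero case can be checked with a short Python sketch. The helper `max_with_pos_zero` and the chosen input list are assumptions made for illustration; the function only models the expression `0.0 < x ? x : 0.0`, not the real x86 lowering.

```python
import math

def max_with_pos_zero(x: float) -> float:
    """Model of `0.0 < x ? x : 0.0` with a literal +0.0 operand."""
    return x if 0.0 < x else 0.0

# A handful of interesting inputs: negatives, both zeros, a subnormal,
# infinities, and NaNs with either sign bit.
inputs = [-math.inf, -1.5, -0.0, 0.0, 5e-324, 2.0, math.inf,
          math.nan, -math.nan]

results = [max_with_pos_zero(x) for x in inputs]

# copysign reads the raw sign bit, so it distinguishes -0.0 from +0.0.
sign_bits = [math.copysign(1.0, r) < 0.0 for r in results]
```

Every input that fails the `0.0 < x` test, including `-0.0` and NaN, falls through to the literal `+0.0`, so no result in this model carries a set sign bit.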
Consider instead a scenario where `maxnum` had no signed-zero ordering semantics. If we wanted to statically know the sign bit of `maxnum(x, 0.0)`, we'd have to refine it to `maximumnum(x, 0.0)`. `x` could be a signaling NaN, so on platforms that implement the IEEE 754-2008 operations, like AArch64, we would have to "canonicalize" `x` beforehand. This is an unnecessary performance cost, since the original `maxnum` *doesn't* require us to handle signaling NaN.
Basically, if we have a version of `maxnum` that orders signed zeroes properly, we can do *more* optimizations. Even if the frontends choose not to emit it and always add the `nsz` flag, we can safely drop `nsz` if we know it helps with performance.
https://github.com/llvm/llvm-project/pull/172012