[clang] [clang] Allow `ConditionalOperator` fast-math flags to be overridden by `pragma float_control` (PR #105912)

Andy Kaylor via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 8 13:07:59 PDT 2024


andykaylor wrote:

@rjmccall I understand your point, and I think you're raising a good question. Let's walk through an example that illustrates why we currently want FMF on phis and selects and see if we can agree on an alternative way to handle it. In a comment on https://github.com/llvm/llvm-project/issues/51601, I started with this:

```
double floatingAbs(double x) {
  return (x < 0) ? -x : x;
}
```
We want to optimize that to `llvm.fabs(x)`, but we can only do that if we don't care about the sign of zero. I walked through the steps of how that happens [here](https://github.com/llvm/llvm-project/issues/51601#issuecomment-981047527), but let me jump to the optimized IR just before the replacement happens because that's sufficient for the discussion about FMF on select instructions. The optimizer reduces the IR to this:

```
define dso_local double @floatingAbs(double %0) {
  %2 = fcmp fast olt double %0, 0.000000e+00
  %3 = fneg fast double %0
  %4 = select fast i1 %2, double %3, double %0
  ret double %4
}
```
We need the `nsz` flag to turn this into `llvm.fabs(x)`. The `nsz` flag is present on the `fcmp` instruction (it is implied by `fast`), but that doesn't matter because `0.0` compares as equal to `-0.0` with or without the flag. We also have `nsz` on the `fneg` instruction, but again that doesn't matter because we never select the `fneg` result for `0.0` or `-0.0`. We want this sequence to produce the absolute value of `x`, but we can only claim that it does if we don't care about the sign of zero on the `select` instruction.

If this code is called with `x == -0.0`, the `select` instruction will return `-0.0`, whereas `llvm.fabs(x)` would return `0.0`. We can only replace this with `llvm.fabs(x)` if we know we are allowed to ignore the sign of zero for the whole pattern. That leaves us two choices: either we depend on a function attribute saying that we can ignore the sign of zero for the entire function, or we must have the `nsz` flag set on all instructions involved in the pattern.
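To make the signed-zero difference concrete, here is a minimal C sketch. It assumes strict IEEE 754 semantics (compiled without `-ffast-math` or similar options); the names `ternary_abs`, `same_value_and_sign`, and `check_signed_zero_behavior` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Same ternary pattern as floatingAbs above. */
double ternary_abs(double x) {
  return (x < 0) ? -x : x;
}

/* Hypothetical helper: true when both results compare equal AND carry
   the same sign bit. Plain == cannot tell 0.0 from -0.0. */
bool same_value_and_sign(double a, double b) {
  return a == b && (signbit(a) != 0) == (signbit(b) != 0);
}

/* Hypothetical self-check. Assumes strict floating-point semantics;
   under fast-math the compiler is allowed to fold the ternary to fabs. */
void check_signed_zero_behavior(void) {
  /* -0.0 < 0 is false, so the ternary returns x unchanged: -0.0. */
  assert(signbit(ternary_abs(-0.0)) != 0);
  /* fabs clears the sign bit, so it returns +0.0. */
  assert(signbit(fabs(-0.0)) == 0);
  /* The two agree everywhere else, e.g. for an ordinary negative value. */
  assert(same_value_and_sign(ternary_abs(-2.5), fabs(-2.5)));
}
```

The only input where the two differ in this sketch is `-0.0`, which is why `nsz` is the flag that matters for the rewrite.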

As I said before, relying on the function attribute gives us correct results, but because functions can have mixed fast-math states (through either inlining or pragmas), we may lose an optimization here. To me, it seems that having fast-math flags on the `select` instruction is the easiest way to do this without potentially losing optimizations, and that indirectly implies that we'd like to have FMF on phis and loads. Currently, we're in a mixed state in that regard: we accept the loss of optimization if a load is involved but preserve optimization through phis and selects.

However, as I think about this, it occurs to me that there is another possibility. The argument above only applies to the `nsz` flag. The `nnan` and `ninf` flags can be deduced, and the rewrite flags don't have any meaning for phis, loads, and selects. Perhaps we shouldn't be looking at the `select` instruction but instead should be looking for the `nsz` flag on the uses of the select instruction. In the trivial case I cited above, the result of the select is being returned, so looking at the function attribute is correct. If the value selected were being used in the function, we would look at the uses. If all of the uses have the `nsz` flag set, we could ignore the sign of zero for the select. This would complicate the handling a bit, but it certainly seems to make more sense in terms of the semantics of the `nsz` flag.
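The use-based check could be modeled roughly as follows. This is a toy sketch in C, not LLVM code: the `use_kind` enum, the `sel_use` struct, and `can_ignore_sign_of_zero` are all hypothetical names, and treating any other kind of use conservatively is my assumption rather than something the proposal specifies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of how the select's result is used. */
enum use_kind {
  USE_FP_OP,   /* floating-point instruction that may carry nsz */
  USE_RETURN,  /* value is returned from the function */
  USE_OTHER    /* anything else (store, call, ...): assumed unsafe */
};

/* Hypothetical model of a single use of the select's result. */
struct sel_use {
  enum use_kind kind;
  bool nsz; /* only meaningful when kind == USE_FP_OP */
};

/* Returns true when every use tolerates a wrong sign on zero:
   FP uses must have nsz set, and a return falls back to the
   function-level attribute, as in the trivial example above. */
bool can_ignore_sign_of_zero(const struct sel_use *uses, size_t count,
                             bool fn_no_signed_zeros) {
  for (size_t i = 0; i < count; ++i) {
    switch (uses[i].kind) {
    case USE_FP_OP:
      if (!uses[i].nsz)
        return false;
      break;
    case USE_RETURN:
      if (!fn_no_signed_zeros)
        return false;
      break;
    default:
      return false; /* conservative assumption for unknown uses */
    }
  }
  return true;
}
```

In this model a single use without `nsz` blocks the rewrite, which is the extra handling cost mentioned above.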

Tagging @jcranmer-intel who has been working on clarifying FMF semantics.

https://github.com/llvm/llvm-project/pull/105912

