[clang] [clang] Set correct FPOptions if attribute 'optnone' presents (PR #85605)

Wed May 1 17:48:43 PDT 2024

wjristow wrote:

Hi @spavloff.

Regarding:
> With that goal in mind, having `optnone` and `-O0` be deliberately different here makes no sense.

There's no need for them to behave differently here. And in fact, we _want_ them to behave the same.  There's a subtle point about FP contraction, in that doing an FP contraction (within a statement in C/C++) isn't considered an "optimization".  It's a bit counterintuitive, but there's a good reason.

A Fused Multiply Add (FMA) will, in general, produce a slightly different bit-wise floating point result than a multiply/add sequence. For architectures that have FMA instructions, if we enabled FMA at (for example) `-O2` but not at lower optimization levels, then our `-O0` and `-O2` FP results would generally not be bit-wise identical. And we don't want that. Essentially, we have a requirement that "safe" optimizations like `-O2` (as opposed to "unsafe" optimizations like `-ffast-math`) should not change the results of correct programs. So if FMA is to be enabled at `-O2`, we must also enable it at `-O0`.
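
To make the bit-wise difference concrete, here is a minimal sketch (not taken from the PR; the function and constant names are made up for illustration). It assumes IEEE-754 doubles with round-to-nearest, and uses `std::fma` plus a `volatile` temporary so that the outcome doesn't depend on which `-ffp-contract` setting is in effect:

```cpp
#include <cmath>

// Multiply, round the product to double, then add (two roundings).
// The volatile temporary is only there to keep the compiler from fusing
// the two operations, whatever -ffp-contract setting is in effect.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Illustrative inputs: the exact product (1 + 2^-27) * (1 - 2^-27) is
// 1 - 2^-54, which is not representable as a double and rounds to 1.0.
const double kA = 1.0 + std::ldexp(1.0, -27);
const double kB = 1.0 - std::ldexp(1.0, -27);
const double kC = -1.0;
```

With these inputs, `mul_then_add(kA, kB, kC)` yields `0.0` because the product is rounded to `1.0` before the add, while `fused_mul_add(kA, kB, kC)` yields `-2^-54` because the exact product is kept until the single final rounding.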

The legal values and semantics in C/C++ for `-ffp-contract=<value>` are:

```
-ffp-contract=off       Never use FMA
-ffp-contract=on        Use FMA within a statement (default)
-ffp-contract=fast      Use FMA across multiple statements
```

See: https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-contract

(There is also one other legal setting: `-ffp-contract=fast-honor-pragmas`, but it's not relevant to the issue here.)
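
As a rough illustration of the "within a statement" versus "across multiple statements" distinction (the function names are hypothetical, and whether an FMA is actually emitted depends on the target and the flags):

```cpp
// Multiply and add in one expression: under -ffp-contract=on the compiler
// is allowed (not required) to emit a single fused multiply-add here.
double within_statement(double a, double b, double c) {
    return a * b + c;
}

// Multiply and add in separate statements: -ffp-contract=on does not allow
// fusing these, whereas -ffp-contract=fast does.
double across_statements(double a, double b, double c) {
    double product = a * b;
    return product + c;
}
```

With `-ffp-contract=off`, neither function is contracted.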

So setting `-ffp-contract=off` isn't setting it to the `-O0` value (which is `on`, the default), and setting it to the `-O0` value _is_ what we want.

To put it another way, for a target that has FMA instructions, a valid program will produce identical results for `-O0`, `-O1` and `-O2` (as desired), but with the change of this commit, applying the attribute `optnone` can now produce slightly different floating point results.
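
A small sketch of how one might observe this (the names here are made up for illustration, and the macro exists only so the snippet still compiles with compilers that don't know the attribute):

```cpp
// Clang spells the attribute [[clang::optnone]]; for other compilers this
// sketch simply expands the (hypothetical) macro to nothing.
#if defined(__clang__)
#define EXAMPLE_OPTNONE [[clang::optnone]]
#else
#define EXAMPLE_OPTNONE
#endif

// Same expression in both functions; only the attribute differs.
double plain(double a, double b, double c) {
    return a * b + c;
}

EXAMPLE_OPTNONE double with_optnone(double a, double b, double c) {
    return a * b + c;
}
```

Compiling this with something like `clang++ -O0 -S -emit-llvm` and comparing the IR of the two functions should show the difference under discussion: a combined `fmuladd` in `plain`, and a separate `fmul` and `fadd` in `with_optnone` when contraction is turned off for `optnone`.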

As you said earlier:

> The observed difference is due to the FP contraction turned off if optnone is specified. In O0 this optimization is still applied. As a result, the function with optnone contains separate fadd and fmul, while without this attribute the function contains combined operation fmuladd.

In short, `optnone` shouldn't set `-ffp-contract=off`; it should set `-ffp-contract=on`, which, as I'm sure you realize, can be done via:

```
diff --git clang/include/clang/Basic/LangOptions.h clang/include/clang/Basic/LangOptions.h
index e2a2aa71b880..a5a7b3895d58 100644
--- clang/include/clang/Basic/LangOptions.h
+++ clang/include/clang/Basic/LangOptions.h
@@ -970,7 +970,7 @@ public:

   void setDisallowOptimizations() {
     setFPPreciseEnabled(true);
-    setDisallowFPContract();
+    setAllowFPContractWithinStatement();
   }

   storage_type getAsOpaqueInt() const {
```

https://github.com/llvm/llvm-project/pull/85605