[clang] [clang] Set correct FPOptions if attribute 'optnone' presents (PR #85605)
via cfe-commits
cfe-commits at lists.llvm.org
Wed May 1 17:48:43 PDT 2024
wjristow wrote:
Hi @spavloff.
Regarding:
> With that goal in mind, having `optnone` and `-O0` be deliberately different here makes no sense.
There's no need for them to behave differently here. And in fact, we _want_ them to behave the same. There's a subtle point about FP contraction, in that doing an FP contraction (within a statement in C/C++) isn't considered an "optimization". It's a bit counterintuitive, but there's a good reason.
A Fused Multiply Add (FMA) will, in general, produce a slightly different bit-wise floating point result than a multiply/add sequence. For architectures that have FMA instructions, if we enabled FMA at (for example) `-O2` but not at lower optimizations, then our `-O0` and `-O2` FP results would generally not be bit-wise identical. And we don't want that. Essentially, we have a requirement that "safe" optimizations like `-O2` (as opposed to "unsafe" optimizations like `-ffast-math`) should not change the results of correct programs. So if FMA is to be enabled at `-O2`, we must also enable it at `-O0`.
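To make the bit-wise difference concrete, here is a minimal C++ sketch. It is not from the PR; the function names and the input value are chosen purely for illustration. It uses `std::fma` for the fused form and a `volatile` temporary to keep the compiler from contracting the separate form on its own.

```cpp
#include <cassert>
#include <cmath>

// Multiply, round to double, then add: two roundings in total.
// The volatile temporary is there only to stop the compiler from
// contracting the two operations into an FMA.
double separate_mul_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: a * b + c computed with a single rounding.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}

// Illustrative input: with a = 1 + 2^-27, a * a is exactly
// 1 + 2^-26 + 2^-54, which does not fit in a double, so the
// separate form loses the 2^-54 term before the addition.
void check_results_differ() {
  const double a = 1.0 + std::ldexp(1.0, -27);
  const double separate = separate_mul_add(a, a, -1.0);
  const double fused = fused_mul_add(a, a, -1.0);
  assert(separate == std::ldexp(1.0, -26));
  assert(fused == std::ldexp(1.0, -26) + std::ldexp(1.0, -54));
}
```

Both results are "correct" in the sense of being properly rounded for the operations performed; they simply differ in the last bits, which is why the choice has to be the same at every "safe" optimization level.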
The legal values and semantics in C/C++ for `-ffp-contract=<value>` are:
```
-ffp-contract=off   Never use FMA
-ffp-contract=on    Use FMA within a statement (default)
-ffp-contract=fast  Use FMA across multiple statements
```
See: https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-contract
(There is also one other legal setting: `-ffp-contract=fast-honor-pragmas`, but it's not relevant to the issue here.)
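As a rough illustration of the "within a statement" versus "across multiple statements" distinction, consider the following two functions (hypothetical names, written only for this sketch). Whether an FMA instruction is actually emitted still depends on the target and the compiler.

```cpp
// Multiply and add in a single expression: a candidate for contraction
// under both -ffp-contract=on and -ffp-contract=fast.
double within_statement(double a, double b, double c) {
  return a * b + c;
}

// Multiply and add split over two statements: a candidate for
// contraction only under -ffp-contract=fast.
double across_statements(double a, double b, double c) {
  double product = a * b;
  return product + c;
}
```

With `-ffp-contract=off`, neither function would be expected to use an FMA.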
So setting `-ffp-contract=off` isn't setting it to the `-O0` value, and setting it to the `-O0` value _is_ what we want.
To put it another way, for a target that has FMA instructions, a valid program will produce identical results for `-O0`, `-O1` and `-O2` (as desired), but with the change of this commit, applying the attribute `optnone` can now produce slightly different floating point results.
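A sketch of the kind of function pair where this would show up (the macro and function names are made up for illustration; `optnone` is a Clang attribute, so the macro expands to nothing on other compilers):

```cpp
#include <cmath>

// SKETCH_OPTNONE is a hypothetical helper macro for this example only.
#if defined(__clang__)
#define SKETCH_OPTNONE __attribute__((optnone))
#else
#define SKETCH_OPTNONE
#endif

// Compiled with the translation unit's normal FP contraction setting.
double mul_add_plain(double a, double b, double c) {
  return a * b + c;
}

// Same source expression, but marked optnone. With the change in this
// commit, contraction is turned off here, so on an FMA-capable target
// this body may round twice while the plain version rounds once.
SKETCH_OPTNONE double mul_add_optnone(double a, double b, double c) {
  return a * b + c;
}
```

On a target with FMA instructions, calling both with the same arguments could then return values that differ in the low bits, even though the source expressions are identical.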
As you said earlier:
> The observed difference is due to the FP contraction turned off if optnone is specified. In O0 this optimization is still applied. As a result, the function with optnone contains separate fadd and fmul, while without this attribute the function contains combined operation fmuladd.
In short, `optnone` shouldn't set `-ffp-contract=off`, it should set `-ffp-contract=on`. Which, as I'm sure you realize, can be done via:
```
diff --git clang/include/clang/Basic/LangOptions.h clang/include/clang/Basic/LangOptions.h
index e2a2aa71b880..a5a7b3895d58 100644
--- clang/include/clang/Basic/LangOptions.h
+++ clang/include/clang/Basic/LangOptions.h
@@ -970,7 +970,7 @@ public:
   void setDisallowOptimizations() {
     setFPPreciseEnabled(true);
-    setDisallowFPContract();
+    setAllowFPContractWithinStatement();
   }
   storage_type getAsOpaqueInt() const {
```
https://github.com/llvm/llvm-project/pull/85605