[PATCH] D136176: Implement support for option 'fexcess-precision'.

Tue Nov 15 12:32:51 PST 2022

andrew.w.kaylor added a comment.

Can you update the clang user's manual to describe the behavior of this option? The excess precision issue is also mentioned in the clang language extensions document (https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point).

================
Comment at: clang/include/clang/Basic/LangOptions.def:320
 BENIGN_ENUM_LANGOPT(FPEvalMethod, FPEvalMethodKind, 2, FEM_UnsetOnCommandLine, "FP type used for floating point arithmetic")
+BENIGN_ENUM_LANGOPT(FPPrecisionMode, FPExcessPrecisionModeKind, 2, FPP_Standard, "FP precision used for floating point arithmetic")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")
----------------
A lot of people mix up precision and accuracy, so I think this description is likely to be ambiguous. I'd suggest "Intermediate truncation behavior for floating point arithmetic"

================
Comment at: clang/include/clang/Basic/LangOptions.h:300
+    FPP_Fast,
+    FPP_None
+  };
----------------
Is FPP_None somehow the same as fexcess-precision=16? If not, what does it mean?

================
Comment at: clang/include/clang/Driver/Options.td:1564
+def fexcess_precision_EQ : Joined<["-"], "fexcess-precision=">, Group<f_Group>, Flags<[CC1Option]>,
+  HelpText<"Specifies the precision in which this floating-point operations will be calculated.">,
+  Values<"standard,fast,16">, NormalizedValuesScope<"LangOptions">,
----------------
Again, I think this is ambiguous. My understanding is that the option is more about the rules for when intermediate values are truncated to source precision than when what precision is prescribed for the calculation. For targets that have native support for half precision types, the calculations should always be done at source precision and this option will have no effect. It's only when the native ISA doesn't support the source precision that this is needed and even then we will always perform calculations at the nearest precision to source precision that is available, right?

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2995
+      StringRef Val = A->getValue();
+       if (TC.getTriple().getArch() == llvm::Triple::x86 && Val.equals("16"))
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
----------------
Why is 16 only supported for x86? Is it only here for gcc compatibility?

================
Comment at: clang/test/CodeGen/X86/fexcess-precise.c:89
+// CHECK-EXT-NEXT:    [[EXT1:%.*]] = fpext half [[TMP1]] to float
+// CHECK-EXT-NEXT:    [[MUL:%.*]] = fmul float [[EXT]], [[EXT1]]
+// CHECK-EXT-NEXT:    [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
----------------
This is not what I'd expect to see. I'd expect the operations to use the half type with explicit truncation inserted where needed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136176/new/

https://reviews.llvm.org/D136176