[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang's optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]

Abe Skolnik via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 9 14:17:56 PDT 2016


Dear all,

In the process of investigating a performance difference between Clang & GCC when both compile 
the same non-toolchain program while using the "same"* compiler flags, I have found something 
that may be worth changing in Clang, developed a patch, and confirmed that the patch has its 
intended effect.

*: "same" in quotes b/c the essence of the problem is that the _meaning_ of "-O3" on Clang 
differs from that of "-O3" on GCC in at least one way.

The specific problem here relates to the default settings for FP contraction, e.g. fused 
multiply-add.  At -O2 and higher, GCC defaults FP contraction to "fast", i.e. always on.  I'm 
not suggesting that Clang/LLVM/both need to do the same, since Clang+LLVM has good support for 
"#pragma STDC FP_CONTRACT".
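
To make concrete why contraction is visible to programs at all, here is a minimal sketch in C.  It is only an illustration, not part of the patch: the function names are hypothetical, and the "fused" variant does not call a real FMA instruction.  Instead it emulates a single final rounding by doing the arithmetic in double, which works because the product of two floats is always exact in double [the final conversion back to float can in general round a second time, but it is exact for the inputs mentioned below].  The sketch assumes IEEE-754 binary32/binary64 with round-to-nearest-even.

```c
/* Unfused: the product is rounded to float before the addition.
 * The volatile temporary forces that intermediate rounding and keeps the
 * compiler from contracting the two operations into one. */
float mul_then_add(float a, float b, float c)
{
    volatile float product = a * b;   /* first rounding happens here */
    return product + c;               /* second rounding happens here */
}

/* Emulated fused multiply-add [hypothetical helper, not a real FMA]:
 * a float-by-float product is exact in double, so only the final result
 * is rounded to float. */
float fused_emulated(float a, float b, float c)
{
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = b = 0x1.001p0f [i.e. 1 + 2^-12] and c = -0x1.002p0f [i.e. -(1 + 2^-11)], the exact product is 1 + 2^-11 + 2^-24.  The unfused version rounds that product to 1 + 2^-11 and returns 0, while the fused version keeps the low bit and returns 2^-24.  Differences of this kind are why the default matters.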

If we keep Clang's default for FP contraction at "on" [which really means "according to the 
pragma"] but change the default value of the _pragma_ [currently off] to on at -O3, then Clang 
will be more competitive with GCC at high optimization settings without resorting to the 
more-brutish "fast by default" at plain -O3 [as opposed to "-Ofast", "-O3 -ffast-math", etc.].

Since I don't know what Objective-C [and Objective-C++] have to say about FP operations, I have 
made my patch very selective based on language.  Also, I noticed that the CUDA front-end seems 
to already have its own defaults for FP contraction, so there's no need to change this for 
every language.
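
The intended rule can be summarized by the following stand-alone sketch in C.  It models only the intent described above; the enum and function names are hypothetical stand-ins and are not Clang's real types or API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's settings [not Clang's types]. */
enum fp_contract_mode { FPC_OFF, FPC_ON, FPC_FAST };
enum source_lang { LANG_C99, LANG_C11, LANG_CXX, LANG_OTHER };

/* Returns the proposed initial state of "#pragma STDC FP_CONTRACT":
 * ON only for C99/C11/C++, only when the contraction mode is the
 * pragma-controlled "on", and only at -O3 or higher. */
bool default_pragma_state(enum source_lang lang,
                          enum fp_contract_mode mode,
                          int opt_level)
{
    bool covered_language =
        (lang == LANG_C99 || lang == LANG_C11 || lang == LANG_CXX);
    return covered_language && mode == FPC_ON && opt_level >= 3;
}
```

Under this rule an explicit "fast" or "off" mode is left alone, and anything below -O3 keeps the current default of off.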

I needed to change one test case because it made an assumption that FP contraction is off by 
default when compiling with "-O3" but without any additional optimization-related flags.





Patch relative to upstream code with Git ID b0768e805d1d33d730e5bd44ba578df043dfbc66
------------------------------------------------------------------------------------

diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 619ea9c..d02d873 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2437,6 +2437,13 @@ bool CompilerInvocation::CreateFromArgs(CompilerInvocation &Res,
    if (Arch == llvm::Triple::spir || Arch == llvm::Triple::spir64) {
      Res.getDiagnosticOpts().Warnings.push_back("spir-compat");
    }
+
+  // If there will ever be e.g. "LangOpts.C", replace "LangOpts.C11 || LangOpts.C99" with "LangOpts.C" on the next line.
+  if (    (LangOpts.C11 || LangOpts.C99 || LangOpts.CPlusPlus)                      // ...
+  /*...*/ && ( CodeGenOptions::FPC_On == Res.getCodeGenOpts().getFPContractMode() ) // ... // just being careful
+  /*...*/ && (Res.getCodeGenOpts().OptimizationLevel >= 3) )
+    LangOpts.DefaultFPContract = 1;
+
    return Success;
  }

diff --git a/clang/test/CodeGen/fp-contract-pragma.cpp b/clang/test/CodeGen/fp-contract-pragma.cpp
index 1c5921a..0949272 100644
--- a/clang/test/CodeGen/fp-contract-pragma.cpp
+++ b/clang/test/CodeGen/fp-contract-pragma.cpp
@@ -13,6 +13,7 @@ float fp_contract_2(float a, float b, float c) {
  // CHECK: _Z13fp_contract_2fff
  // CHECK: %[[M:.+]] = fmul float %a, %b
  // CHECK-NEXT: fadd float %[[M]], %c
+  #pragma STDC FP_CONTRACT OFF
    {
      #pragma STDC FP_CONTRACT ON
    }





Please give me any and all feedback you may have on this suggested change and this proposed patch.

Regards,

Abe

