[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang's optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
Abe Skolnik via llvm-dev
llvm-dev at lists.llvm.org
Fri Sep 9 14:17:56 PDT 2016
Dear all,
In the process of investigating a performance difference between Clang & GCC when both compile
the same non-toolchain program while using the "same"* compiler flags, I have found something
that may be worth changing in Clang, developed a patch, and confirmed that the patch has its
intended effect.
*: "same" in quotes because the essence of the problem is that the _meaning_ of "-O3" on Clang
differs from that of "-O3" on GCC in at least one way.
The specific problem here relates to the default settings for FP contraction, e.g. fused
multiply-add. At -O2 and higher, GCC defaults FP contraction to "fast", i.e. always on. I'm
not suggesting that Clang/LLVM/both need to do the same, since Clang+LLVM has good support for
"#pragma STDC FP_CONTRACT".
If we keep Clang's default for FP contraction at "on" [which really means "according to the
pragma"] but change the default value of the _pragma_ [currently off] to on at -O3, then Clang
will be more competitive with GCC at high optimization settings without resorting to the
more-brutish "fast by default" at plain -O3 [as opposed to "-Ofast", "-O3 -ffast-math", etc.].
Since I don't know what Objective-C [and Objective-C++] have to say about FP operations, I have
made my patch very selective based on language. Also, I noticed that the CUDA front-end seems
to already have its own defaults for FP contraction, so there's no need to change this for
every language.
I needed to change one test case because it made an assumption that FP contraction is off by
default when compiling with "-O3" but without any additional optimization-related flags.
Patch relative to upstream code with Git ID b0768e805d1d33d730e5bd44ba578df043dfbc66
------------------------------------------------------------------------------------
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 619ea9c..d02d873 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2437,6 +2437,13 @@ bool CompilerInvocation::CreateFromArgs(CompilerInvocation &Res,
if (Arch == llvm::Triple::spir || Arch == llvm::Triple::spir64) {
Res.getDiagnosticOpts().Warnings.push_back("spir-compat");
}
+
+ // If there will ever be e.g. "LangOpts.C", replace "LangOpts.C11 || LangOpts.C99" with "LangOpts.C" on the next line.
+ if ( (LangOpts.C11 || LangOpts.C99 || LangOpts.CPlusPlus) // ...
+ /*...*/ && ( CodeGenOptions::FPC_On == Res.getCodeGenOpts().getFPContractMode() ) // ... // just being careful
+ /*...*/ && (Res.getCodeGenOpts().OptimizationLevel >= 3) )
+ LangOpts.DefaultFPContract = 1;
+
return Success;
}
diff --git a/clang/test/CodeGen/fp-contract-pragma.cpp b/clang/test/CodeGen/fp-contract-pragma.cpp
index 1c5921a..0949272 100644
--- a/clang/test/CodeGen/fp-contract-pragma.cpp
+++ b/clang/test/CodeGen/fp-contract-pragma.cpp
@@ -13,6 +13,7 @@ float fp_contract_2(float a, float b, float c) {
// CHECK: _Z13fp_contract_2fff
// CHECK: %[[M:.+]] = fmul float %a, %b
// CHECK-NEXT: fadd float %[[M]], %c
+ #pragma STDC FP_CONTRACT OFF
{
#pragma STDC FP_CONTRACT ON
}
Please give me any and all feedback you may have on this suggested change and this proposed patch.
Regards,
Abe