[clang] Honor pragmas with -ffp-contract=fast, depecate fast-honor-pragmas (PR #105746)
Andy Kaylor via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 27 18:06:10 PDT 2024
https://github.com/andykaylor updated https://github.com/llvm/llvm-project/pull/105746
>From 973e7fb1a09f5479f88616e19642a71edc977b4a Mon Sep 17 00:00:00 2001
From: Andy Kaylor <andrew.kaylor at intel.com>
Date: Thu, 22 Aug 2024 15:35:07 -0700
Subject: [PATCH 1/2] Honor pragmas with -ffp-contract=fast, depecate
fast-honor-pragmas
Change the handling of -ffp-contract=fast such that the contract
semantics described in the IR are always respected. The front end was
already generating IR that enabled or disabled contraction as requested
by relevant pragmas, but it was setting a TargetOption (AllowFPOpFusion)
that causes the LLVM code generator to disregard the IR restrictions.
This makes -ffp-contract=fast-honor-pragmas redundant, so it is now
reported as deprecated.
For more discussion, see: https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797
---
clang/docs/LanguageExtensions.rst | 4 +--
clang/docs/ReleaseNotes.rst | 4 +++
clang/docs/UsersManual.rst | 6 ++---
clang/include/clang/Basic/LangOptions.h | 10 +------
clang/include/clang/Driver/Options.td | 6 ++---
clang/lib/Basic/LangOptions.cpp | 14 ++--------
clang/lib/CodeGen/BackendUtil.cpp | 20 +++-----------
clang/lib/CodeGen/CGCall.cpp | 4 +--
clang/lib/Driver/ToolChains/Clang.cpp | 7 +++++
clang/lib/Frontend/CompilerInvocation.cpp | 6 ++---
clang/lib/Sema/SemaAttr.cpp | 1 -
.../ffp-contract-fast-honor-pramga-option.cpp | 2 +-
.../ffp-contract-fhp-pragma-override.cpp | 2 +-
clang/test/CodeGen/fp-function-attrs.cpp | 15 ++++++++++-
clang/test/CodeGenCUDA/fp-contract.cu | 27 ++-----------------
clang/test/Driver/fp-contract.c | 7 ++---
16 files changed, 49 insertions(+), 86 deletions(-)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 62903fc3744cad..cef239346f58c5 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -5037,9 +5037,7 @@ statements in C).
The pragma can also be used with ``off`` which turns FP contraction off for a
section of the code. This can be useful when fast contraction is otherwise
-enabled for the translation unit with the ``-ffp-contract=fast-honor-pragmas`` flag.
-Note that ``-ffp-contract=fast`` will override pragmas to fuse multiply and
-addition across statements regardless of any controlling pragmas.
+enabled for the translation unit with the ``-ffp-contract=fast`` flag.
``#pragma clang fp exceptions`` specifies floating point exception behavior. It
may take one of the values: ``ignore``, ``maytrap`` or ``strict``. Meaning of
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 12a924acc14331..2ab1938b12f006 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -190,6 +190,10 @@ Modified Compiler Flags
the ``promoted`` algorithm for complex division when possible rather than the
less basic (limited range) algorithm.
+- The ``-ffp-contract`` option now honors pragmas by default when the ``fast``
+ argument is used. The ``fast-honor-pragmas`` option is now deprecated and acts
+ as an alias for ``fast``.
+
Removed Compiler Flags
-------------------------
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index d19b77ae40b0d7..0452cc0f774311 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1444,7 +1444,7 @@ describes the various floating point semantic modes and the corresponding option
"ffp-exception-behavior", "{ignore, strict, maytrap}",
"fenv_access", "{off, on}", "(none)"
"frounding-math", "{dynamic, tonearest, downward, upward, towardzero}"
- "ffp-contract", "{on, off, fast, fast-honor-pragmas}"
+ "ffp-contract", "{on, off, fast}"
"fdenormal-fp-math", "{IEEE, PreserveSign, PositiveZero}"
"fdenormal-fp-math-fp32", "{IEEE, PreserveSign, PositiveZero}"
"fmath-errno", "{on, off}"
@@ -1646,10 +1646,10 @@ for more details.
Valid values are:
- * ``fast`` (fuse across statements disregarding pragmas, default for CUDA)
+ * ``fast`` (fuse across statements unless dictated by pragmas, default for CUDA/HIP)
* ``on`` (fuse in the same statement unless dictated by pragmas, default for languages other than CUDA/HIP)
* ``off`` (never fuse)
- * ``fast-honor-pragmas`` (fuse across statements unless dictated by pragmas, default for HIP)
+ * ``fast-honor-pragmas`` (deprecated, aliases fast)
.. option:: -f[no-]honor-infinities
diff --git a/clang/include/clang/Basic/LangOptions.h b/clang/include/clang/Basic/LangOptions.h
index eb4cb4b5a7e93f..2fcc211e8249d7 100644
--- a/clang/include/clang/Basic/LangOptions.h
+++ b/clang/include/clang/Basic/LangOptions.h
@@ -265,11 +265,8 @@ class LangOptionsBase {
// Enable the floating point pragma
FPM_On,
- // Aggressively fuse FP ops (E.g. FMA) disregarding pragmas.
- FPM_Fast,
-
// Aggressively fuse FP ops and honor pragmas.
- FPM_FastHonorPragmas
+ FPM_Fast
};
/// Possible floating point exception behavior.
@@ -831,12 +828,7 @@ class FPOptions {
}
explicit FPOptions(const LangOptions &LO) {
Value = 0;
- // The language fp contract option FPM_FastHonorPragmas has the same effect
- // as FPM_Fast in frontend. For simplicity, use FPM_Fast uniformly in
- // frontend.
auto LangOptContractMode = LO.getDefaultFPContractMode();
- if (LangOptContractMode == LangOptions::FPM_FastHonorPragmas)
- LangOptContractMode = LangOptions::FPM_Fast;
setFPContractMode(LangOptContractMode);
setRoundingMath(LO.RoundingMath);
setConstRoundingMode(LangOptions::RoundingMode::Dynamic);
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 111608d30ff827..c50e458c06ab1b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2658,11 +2658,11 @@ def fno_trapping_math : Flag<["-"], "fno-trapping-math">, Group<f_Group>;
def ffp_contract : Joined<["-"], "ffp-contract=">, Group<f_Group>,
Visibility<[ClangOption, CC1Option, FC1Option, FlangOption]>,
DocBrief<"Form fused FP ops (e.g. FMAs):"
- " fast (fuses across statements disregarding pragmas)"
+ " fast (fuses across statements unless dictated by pragmas)"
" | on (only fuses in the same statement unless dictated by pragmas)"
" | off (never fuses)"
- " | fast-honor-pragmas (fuses across statements unless dictated by pragmas)."
- " Default is 'fast' for CUDA, 'fast-honor-pragmas' for HIP, and 'on' otherwise.">,
+ " | fast-honor-pragmas (deprecated, aliases fast)."
+ " Default is 'fast' for CUDA or HIP, and 'on' otherwise.">,
HelpText<"Form fused FP ops (e.g. FMAs)">,
Values<"fast,on,off,fast-honor-pragmas">;
diff --git a/clang/lib/Basic/LangOptions.cpp b/clang/lib/Basic/LangOptions.cpp
index 9331a63d91b173..b8c35b94130f2a 100644
--- a/clang/lib/Basic/LangOptions.cpp
+++ b/clang/lib/Basic/LangOptions.cpp
@@ -182,18 +182,8 @@ void LangOptions::setLangDefaults(LangOptions &Opts, Language Lang,
Opts.HIP = Lang == Language::HIP;
Opts.CUDA = Lang == Language::CUDA || Opts.HIP;
- if (Opts.HIP) {
- // HIP toolchain does not support 'Fast' FPOpFusion in backends since it
- // fuses multiplication/addition instructions without contract flag from
- // device library functions in LLVM bitcode, which causes accuracy loss in
- // certain math functions, e.g. tan(-1e20) becomes -0.933 instead of 0.8446.
- // For device library functions in bitcode to work, 'Strict' or 'Standard'
- // FPOpFusion options in backends is needed. Therefore 'fast-honor-pragmas'
- // FP contract option is used to allow fuse across statements in frontend
- // whereas respecting contract flag in backend.
- Opts.setDefaultFPContractMode(LangOptions::FPM_FastHonorPragmas);
- } else if (Opts.CUDA) {
- if (T.isSPIRV()) {
+ if (Opts.HIP || Opts.CUDA) {
+ if (Opts.CUDA && T.isSPIRV()) {
// Emit OpenCL version metadata in LLVM IR when targeting SPIR-V.
Opts.OpenCLVersion = 200;
}
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index fdd89edd72e109..e880347aed9f5e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -348,20 +348,8 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
.Default(llvm::FloatABI::Default);
// Set FP fusion mode.
- switch (LangOpts.getDefaultFPContractMode()) {
- case LangOptions::FPM_Off:
- // Preserve any contraction performed by the front-end. (Strict performs
- // splitting of the muladd intrinsic in the backend.)
- Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
- break;
- case LangOptions::FPM_On:
- case LangOptions::FPM_FastHonorPragmas:
- Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
- break;
- case LangOptions::FPM_Fast:
- Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
- break;
- }
+ // All allowed fusion is indicated in the IR.
+ Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
Options.BinutilsVersion =
llvm::TargetMachine::parseBinutilsVersion(CodeGenOpts.BinutilsVersion);
@@ -386,9 +374,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
Options.UnsafeFPMath = LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
(LangOpts.getDefaultFPContractMode() ==
- LangOptions::FPModeKind::FPM_Fast ||
- LangOpts.getDefaultFPContractMode() ==
- LangOptions::FPModeKind::FPM_FastHonorPragmas);
+ LangOptions::FPModeKind::FPM_Fast);
Options.ApproxFuncFPMath = LangOpts.ApproxFunc;
Options.BBAddrMap = CodeGenOpts.BBAddrMap;
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index ca2c79b51ac96b..06e633f88c6001 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1919,9 +1919,7 @@ static void getTrivialDefaultFunctionAttributes(
if (LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
(LangOpts.getDefaultFPContractMode() ==
- LangOptions::FPModeKind::FPM_Fast ||
- LangOpts.getDefaultFPContractMode() ==
- LangOptions::FPModeKind::FPM_FastHonorPragmas))
+ LangOptions::FPModeKind::FPM_Fast))
FuncAttrs.addAttribute("unsafe-fp-math", "true");
if (CodeGenOpts.SoftFloat)
FuncAttrs.addAttribute("use-soft-float", "true");
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 9f1d57f43b6565..cb63f0083884e0 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -3178,6 +3178,13 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
StringRef Val = A->getValue();
if (Val == "fast" || Val == "on" || Val == "off" ||
Val == "fast-honor-pragmas") {
+ // fast-honor-pragmas is deprecated -- replace it with fast
+ if (Val == "fast-honor-pragmas") {
+ D.Diag(diag::warn_drv_deprecated_arg)
+ << A->getAsString(Args) << /*hasReplacement=*/true
+ << "-ffp-contract=fast";
+ Val = "fast";
+ }
if (Val != FPContract && LastFpContractOverrideOption != "") {
D.Diag(clang::diag::warn_drv_overriding_option)
<< LastFpContractOverrideOption
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index f510d3067d4d58..32ccfb082ee5af 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3760,8 +3760,6 @@ void CompilerInvocationBase::GenerateLangArgs(const LangOptions &Opts,
GenerateArg(Consumer, OPT_ffp_contract, "on");
else if (Opts.DefaultFPContractMode == LangOptions::FPM_Off)
GenerateArg(Consumer, OPT_ffp_contract, "off");
- else if (Opts.DefaultFPContractMode == LangOptions::FPM_FastHonorPragmas)
- GenerateArg(Consumer, OPT_ffp_contract, "fast-honor-pragmas");
for (StringRef Sanitizer : serializeSanitizerKinds(Opts.Sanitize))
GenerateArg(Consumer, OPT_fsanitize_EQ, Sanitizer);
@@ -4261,8 +4259,8 @@ bool CompilerInvocation::ParseLangArgs(LangOptions &Opts, ArgList &Args,
Opts.setDefaultFPContractMode(LangOptions::FPM_On);
else if (Val == "off")
Opts.setDefaultFPContractMode(LangOptions::FPM_Off);
- else if (Val == "fast-honor-pragmas")
- Opts.setDefaultFPContractMode(LangOptions::FPM_FastHonorPragmas);
+ else if (Val == "fast-honor-pragmas") // Deprecated
+ Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);
else
Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
}
diff --git a/clang/lib/Sema/SemaAttr.cpp b/clang/lib/Sema/SemaAttr.cpp
index a1724820472b59..70783a7727beb6 100644
--- a/clang/lib/Sema/SemaAttr.cpp
+++ b/clang/lib/Sema/SemaAttr.cpp
@@ -1269,7 +1269,6 @@ void Sema::ActOnPragmaFPContract(SourceLocation Loc,
NewFPFeatures.setAllowFPContractWithinStatement();
break;
case LangOptions::FPM_Fast:
- case LangOptions::FPM_FastHonorPragmas:
NewFPFeatures.setAllowFPContractAcrossStatement();
break;
case LangOptions::FPM_Off:
diff --git a/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp b/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
index fef4da1edf1fc9..c3d8909c33bd69 100644
--- a/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
+++ b/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -O3 -ffp-contract=fast-honor-pragmas -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
float fp_contract_1(float a, float b, float c) {
// CHECK-LABEL: fp_contract_1fff(
diff --git a/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp b/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
index ff35c9204c79cd..fe4cf21861f004 100644
--- a/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
+++ b/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -O3 -ffp-contract=fast-honor-pragmas -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
float fp_contract_on_1(float a, float b, float c) {
// CHECK-LABEL: fp_contract_on_1fff(
diff --git a/clang/test/CodeGen/fp-function-attrs.cpp b/clang/test/CodeGen/fp-function-attrs.cpp
index e92c26cdb75f7f..eb0801dd6bea61 100644
--- a/clang/test/CodeGen/fp-function-attrs.cpp
+++ b/clang/test/CodeGen/fp-function-attrs.cpp
@@ -1,5 +1,4 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -ffast-math -ffp-contract=fast -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -ffast-math -ffp-contract=fast-honor-pragmas -emit-llvm -o - %s | FileCheck %s
float test_default(float a, float b, float c) {
float tmp = a;
@@ -53,6 +52,20 @@ float test_contract_on_pragma(float a, float b, float c) {
// CHECK: fmul fast float {{%.+}}, {{%.+}}
// CHECK: fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, {{%.+}}
+float test_contract_off_pragma(float a, float b, float c) {
+ float tmp = a * b;
+ {
+ #pragma clang fp contract(off)
+ tmp += c;
+ }
+ return tmp;
+}
+
+// CHECK: define{{.*}} float @_Z24test_contract_off_pragmafff(float noundef nofpclass(nan inf) %a, float noundef nofpclass(nan inf) %b, float noundef nofpclass(nan inf) %c) [[NO_UNSAFE_ATTRS:#[0-9]+]]
+// CHECK: fmul fast float {{%.+}}, {{%.+}}
+// CHECK: fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, {{%.+}}
+
+
// CHECK: attributes [[FAST_ATTRS]] = { {{.*}}"no-infs-fp-math"="true" {{.*}}"no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" {{.*}}"unsafe-fp-math"="true"{{.*}} }
// CHECK: attributes [[PRECISE_ATTRS]] = { {{.*}}"no-infs-fp-math"="false" {{.*}}"no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" {{.*}}"unsafe-fp-math"="false"{{.*}} }
// CHECK: attributes [[NO_UNSAFE_ATTRS]] = { {{.*}}"no-infs-fp-math"="true" {{.*}}"no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" {{.*}}"unsafe-fp-math"="false"{{.*}} }
diff --git a/clang/test/CodeGenCUDA/fp-contract.cu b/clang/test/CodeGenCUDA/fp-contract.cu
index 60824ba59ddfb0..58558f956d06de 100644
--- a/clang/test/CodeGenCUDA/fp-contract.cu
+++ b/clang/test/CodeGenCUDA/fp-contract.cu
@@ -1,6 +1,6 @@
// REQUIRES: x86-registered-target, nvptx-registered-target, amdgpu-registered-target
-// By default CUDA uses -ffp-contract=fast, HIP uses -ffp-contract=fast-honor-pragmas.
+// By default CUDA and HIP use -ffp-contract=fast.
// we should fuse multiply/add into fma instruction.
// In IR, fmul/fadd instructions with contract flag are emitted.
// In backend
@@ -68,35 +68,12 @@
// RUN: -O3 -target-cpu gfx906 -o - -x ir %t.ll \
// RUN: | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
-// Explicit -ffp-contract=fast-honor-pragmas
-// In IR, fmul/fadd instructions with contract flag are emitted.
-// In backend
-// nvptx/amdgcn - assumes standard fp fuse option, which only
-// fuses mult/add insts with contract flag or
-// llvm.fmuladd intrinsics.
-
-// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
-// RUN: -ffp-contract=fast-honor-pragmas -disable-llvm-passes -o - %s \
-// RUN: | FileCheck -check-prefixes=COMMON,NV-ON %s
-// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
-// RUN: -target-cpu gfx906 -disable-llvm-passes -o - -x hip %s \
-// RUN: -ffp-contract=fast-honor-pragmas \
-// RUN: | FileCheck -check-prefixes=COMMON,AMD-ON %s
-// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
-// RUN: -O3 -o - %s \
-// RUN: -ffp-contract=fast-honor-pragmas \
-// RUN: | FileCheck -check-prefixes=COMMON,NV-OPT-FASTSTD %s
-// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
-// RUN: -O3 -target-cpu gfx906 -o - -x hip %s \
-// RUN: -ffp-contract=fast-honor-pragmas \
-// RUN: | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
-
// Check separate compile/backend steps corresponding to -save-temps.
// When input is IR, -ffp-contract has no effect. Backend uses default
// default FP fuse option.
// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN: -ffp-contract=fast-honor-pragmas \
+// RUN: -ffp-contract=fast \
// RUN: -O3 -disable-llvm-passes -target-cpu gfx906 -o %t.ll -x hip %s
// RUN: cat %t.ll | FileCheck -check-prefixes=COMMON,AMD-OPT-FAST-IR %s
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -S \
diff --git a/clang/test/Driver/fp-contract.c b/clang/test/Driver/fp-contract.c
index cab63683ee8132..06241caa2fbd5f 100644
--- a/clang/test/Driver/fp-contract.c
+++ b/clang/test/Driver/fp-contract.c
@@ -6,7 +6,6 @@
// before the drive options that are checked below the run lines.
// WARN_FM_OFF: warning: overriding '-ffast-math' option with '-ffp-contract=off'
// WARN_FM_ON: warning: overriding '-ffast-math' option with '-ffp-contract=on'
-// WARN_FM_FHP: warning: overriding '-ffast-math' option with '-ffp-contract=fast-honor-pragmas'
// WARN_UM_OFF: warning: overriding '-funsafe-math-optimizations' option with '-ffp-contract=off'
// WARN_UM_ON: warning: overriding '-funsafe-math-optimizations' option with '-ffp-contract=on'
@@ -30,8 +29,10 @@
// RUN: | FileCheck --check-prefix=CHECK-FPC-FAST %s
// RUN: %clang -### -ffast-math -ffp-contract=fast-honor-pragmas -c %s 2>&1 \
-// RUN: | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FM_FHP %s
-// CHECK-FPC-FAST-HONOR: "-ffp-contract=fast-honor-pragmas"
+// RUN: | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FHP_DEPRECATED %s
+// WARN_FHP_DEPRECATED: clang: warning: argument '-ffp-contract=fast-honor-pragmas' is deprecated, use '-ffp-contract=fast' instead [-Wdeprecated]
+// CHECK-FPC-FAST-HONOR: "-ffp-contract=fast"
+// CHECK-FPC-FAST-HONOR-NOT: "-honor-pragmas"
// RUN: %clang -### -Werror -ffp-contract=fast -ffast-math -c %s 2>&1 \
// RUN: | FileCheck --check-prefix=CHECK-FPC-FAST %s
>From 9ad8c16ae36a8a82a9d34fe90dd81e7fe2dffa82 Mon Sep 17 00:00:00 2001
From: Andy Kaylor <andrew.kaylor at intel.com>
Date: Tue, 27 Aug 2024 18:04:25 -0700
Subject: [PATCH 2/2] Update fp-contract.cu test
---
clang/test/CodeGenCUDA/fp-contract.cu | 29 +++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/clang/test/CodeGenCUDA/fp-contract.cu b/clang/test/CodeGenCUDA/fp-contract.cu
index 58558f956d06de..8ca43282e58a10 100644
--- a/clang/test/CodeGenCUDA/fp-contract.cu
+++ b/clang/test/CodeGenCUDA/fp-contract.cu
@@ -68,6 +68,29 @@
// RUN: -O3 -target-cpu gfx906 -o - -x ir %t.ll \
// RUN: | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
+// Explicit -ffp-contract=fast (was fast-honor-pragmas)
+// In IR, fmul/fadd instructions with contract flag are emitted.
+// In backend
+// nvptx/amdgcn - assumes standard fp fuse option, which only
+// fuses mult/add insts with contract flag or
+// llvm.fmuladd intrinsics.
+
+// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
+// RUN: -ffp-contract=fast -disable-llvm-passes -o - %s \
+// RUN: | FileCheck -check-prefixes=COMMON,NV-ON %s
+// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
+// RUN: -target-cpu gfx906 -disable-llvm-passes -o - -x hip %s \
+// RUN: -ffp-contract=fast \
+// RUN: | FileCheck -check-prefixes=COMMON,AMD-ON %s
+// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
+// RUN: -O3 -o - %s \
+// RUN: -ffp-contract=fast \
+// RUN: | FileCheck -check-prefixes=COMMON,NV-OPT-FASTSTD %s
+// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
+// RUN: -O3 -target-cpu gfx906 -o - -x hip %s \
+// RUN: -ffp-contract=fast \
+// RUN: | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
+
// Check separate compile/backend steps corresponding to -save-temps.
// When input is IR, -ffp-contract has no effect. Backend uses default
// default FP fuse option.
@@ -243,7 +266,8 @@ __host__ __device__ float func2(float a, float b, float c) {
return t + a;
}
// COMMON-LABEL: _Z5func3fff
-// NV-OPT-FAST: fma.rn.f32
+// NV-OPT-FAST: mul.rn.f32
+// NV-OPT-FAST: add.rn.f32
// NV-OPT-FAST-NEXT: st.param.f32
// NV-OPT-FASTSTD: mul.rn.f32
// NV-OPT-FASTSTD: add.rn.f32
@@ -262,7 +286,8 @@ __host__ __device__ float func2(float a, float b, float c) {
// AMD-OPT-OFF-IR: fmul float
// AMD-OPT-OFF-IR: fadd float
-// AMD-OPT-FAST: v_fmac_f32_e32
+// AMD-OPT-FAST: v_mul_f32_e32
+// AMD-OPT-FAST-NEXT: v_add_f32_e32
// AMD-OPT-FAST-NEXT: s_setpc_b64
// AMD-OPT-FASTSTD: v_mul_f32_e32
// AMD-OPT-FASTSTD-NEXT: v_add_f32_e32
More information about the cfe-commits
mailing list