[llvm] [NVPTX] Check 'contract' fast-math flag in addition to global options (PR #131372)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 14 11:47:01 PDT 2025
================
@@ -11,25 +12,79 @@ target triple = "nvptx64-unknown-cuda"
;; is free to fuse with a multiply if it is able. If fusion is not allowed,
;; we do not form fma.rn at the PTX level and explicitly generate add.rn
;; for all adds to prevent ptxas from fusion the ops.
-
-;; FAST-LABEL: @t0
-;; DEFAULT-LABEL: @t0
define float @t0(float %a, float %b, float %c) {
-;; FAST: fma.rn.f32
-;; DEFAULT: mul.rn.f32
-;; DEFAULT: add.rn.f32
+; FAST-LABEL: t0(
+; FAST: {
+; FAST-NEXT: .reg .f32 %f<5>;
+; FAST-EMPTY:
+; FAST-NEXT: // %bb.0:
+; FAST-NEXT: ld.param.f32 %f1, [t0_param_0];
+; FAST-NEXT: ld.param.f32 %f2, [t0_param_1];
+; FAST-NEXT: ld.param.f32 %f3, [t0_param_2];
+; FAST-NEXT: fma.rn.f32 %f4, %f1, %f2, %f3;
+; FAST-NEXT: st.param.f32 [func_retval0], %f4;
+; FAST-NEXT: ret;
+;
+; DEFAULT-LABEL: t0(
----------------
AlexMaclean wrote:
Sounds good, I've added this case as well.
https://github.com/llvm/llvm-project/pull/131372