[llvm] [NVPTX] Check 'contract' fast-math flag in addition to global options (PR #131372)

Andy Kaylor via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 14 11:22:03 PDT 2025


================
@@ -11,25 +12,79 @@ target triple = "nvptx64-unknown-cuda"
 ;; is free to fuse with a multiply if it is able.  If fusion is not allowed,
 ;; we do not form fma.rn at the PTX level and explicitly generate add.rn
 ;; for all adds to prevent ptxas from fusion the ops.
-
-;; FAST-LABEL: @t0
-;; DEFAULT-LABEL: @t0
 define float @t0(float %a, float %b, float %c) {
-;; FAST: fma.rn.f32
-;; DEFAULT: mul.rn.f32
-;; DEFAULT: add.rn.f32
+; FAST-LABEL: t0(
+; FAST:       {
+; FAST-NEXT:    .reg .f32 %f<5>;
+; FAST-EMPTY:
+; FAST-NEXT:  // %bb.0:
+; FAST-NEXT:    ld.param.f32 %f1, [t0_param_0];
+; FAST-NEXT:    ld.param.f32 %f2, [t0_param_1];
+; FAST-NEXT:    ld.param.f32 %f3, [t0_param_2];
+; FAST-NEXT:    fma.rn.f32 %f4, %f1, %f2, %f3;
+; FAST-NEXT:    st.param.f32 [func_retval0], %f4;
+; FAST-NEXT:    ret;
+;
+; DEFAULT-LABEL: t0(
----------------
andykaylor wrote:

Is it worth adding a variation of this test that has `contract` set on the operations to verify that FMA is formed in that case without `-fp-contract=fast`, or is that covered by other tests?
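
For illustration, such a variation might look roughly like the sketch below. The function name `t0_contract` and the value names are hypothetical and not taken from the PR. The comment about the expected PTX describes the behavior the question asks to verify, not a confirmed result.

```llvm
; Hypothetical variation of @t0: both operations carry the per-instruction
; 'contract' fast-math flag, and the test would be run without
; -fp-contract=fast.
define float @t0_contract(float %a, float %b, float %c) {
  ; 'contract' permits fusing this multiply with the add that consumes it.
  %v0 = fmul contract float %a, %b
  %v1 = fadd contract float %v0, %c
  ; If the flag is honored, the PTX should presumably contain a single
  ; fma.rn.f32 rather than a mul.rn.f32 / add.rn.f32 pair.
  ret float %v1
}
```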

https://github.com/llvm/llvm-project/pull/131372

