[llvm] [AMDGPU] Handle hazard in v_scalef32_sr_fp4_* conversions (PR #118589)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 4 05:50:49 PST 2024


================
@@ -636,6 +670,24 @@ body:             |
     S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
 ...
 
+---
+# GCN-LABEL: test_cvt_scale_cvt_scalef32_sr_pk_fp4_f16_opsel0_hazard
+# GCN:      V_CVT_SCALEF32_PK_FP4_F16_e64
+# GCN:      S_NOP 0
+# GCN:      V_CVT_SCALEF32_SR_PK_FP4_F16_e64
+# GCN:      S_NOP 0
+# GCN:      S_SETPC_B64_return
+name:            test_cvt_scale_cvt_scalef32_sr_pk_fp4_f16_opsel0_hazard
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    S_WAITCNT 0
+    renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 8, $vgpr0, 0, $vgpr1, 4, killed $vgpr2, 0, implicit $mode, implicit $exec
+    early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 0, killed $vgpr0, 0, killed $vgpr3, 0, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    $vgpr0 = V_MOV_B32_e32 killed $vgpr4, implicit $exec, implicit $exec
+    S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+...
+
----------------
arsenm wrote:

Should comprehensively test all the FP4 opcodes. I think I count 6? Also negative test the FP4-as-source case

https://github.com/llvm/llvm-project/pull/118589


More information about the llvm-commits mailing list