[llvm] [AMDGPU] Handle hazard in v_scalef32_sr_fp4_* conversions (PR #118589)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 4 05:50:49 PST 2024
================
@@ -636,6 +670,24 @@ body: |
     S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
...
+---
+# GCN-LABEL: test_cvt_scale_cvt_scalef32_sr_pk_fp4_f16_opsel0_hazard
+# GCN: V_CVT_SCALEF32_PK_FP4_F16_e64
+# GCN: S_NOP 0
+# GCN: V_CVT_SCALEF32_SR_PK_FP4_F16_e64
+# GCN: S_NOP 0
+# GCN: S_SETPC_B64_return
+name: test_cvt_scale_cvt_scalef32_sr_pk_fp4_f16_opsel0_hazard
+body: |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    S_WAITCNT 0
+    renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 8, $vgpr0, 0, $vgpr1, 4, killed $vgpr2, 0, implicit $mode, implicit $exec
+    early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 0, killed $vgpr0, 0, killed $vgpr3, 0, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    $vgpr0 = V_MOV_B32_e32 killed $vgpr4, implicit $exec, implicit $exec
+    S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+...
+
----------------
arsenm wrote:
This should comprehensively test all the FP4 opcodes; I think I count 6? Also add a negative test for the FP4-as-source case.
https://github.com/llvm/llvm-project/pull/118589