[llvm] [AMDGPU] Add hazard tests for cvt scale of fp4. (PR #118813)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 5 07:19:52 PST 2024


================
@@ -797,6 +1124,47 @@ body:             |
     S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
 ...
 
+---
+name:            test_cvt_scale_cvt_scalef32_fp4_f16_opsel0_neg_fp4_as_src_hazard
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-LABEL: name: test_cvt_scale_cvt_scalef32_fp4_f16_opsel0_neg_fp4_as_src_hazard
+    ; GCN: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: S_WAITCNT 0
+    ; GCN-NEXT: renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 0, $vgpr0, 0, $vgpr1, 0, killed $vgpr2, 0, implicit $mode, implicit $exec
+    ; GCN-NEXT: early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 8, killed $vgpr0, 0, killed $vgpr3, 4, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    ; GCN-NEXT: $vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec, implicit $exec
+    ; GCN-NEXT: S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+    S_WAITCNT 0
+    renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 0, $vgpr0, 0, $vgpr1, 0, killed $vgpr2, 0, implicit $mode, implicit $exec
+    early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 8, killed $vgpr0, 0, killed $vgpr3, 4, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    $vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec, implicit $exec
+    S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+...
+
+---
+name:            test_cvt_scale_cvt_scalef32_fp4_f16_opsel3_neg_fp4_as_src_hazard
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-LABEL: name: test_cvt_scale_cvt_scalef32_fp4_f16_opsel3_neg_fp4_as_src_hazard
+    ; GCN: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: S_WAITCNT 0
+    ; GCN-NEXT: renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 8, $vgpr0, 0, $vgpr1, 4, killed $vgpr2, 0, implicit $mode, implicit $exec
+    ; GCN-NEXT: S_NOP 0
+    ; GCN-NEXT: early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 8, killed $vgpr0, 0, killed $vgpr3, 4, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    ; GCN-NEXT: $vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec, implicit $exec
+    ; GCN-NEXT: S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+    S_WAITCNT 0
+    renamable $vgpr2 = V_CVT_SCALEF32_PK_FP4_F16_e64 8, $vgpr0, 0, $vgpr1, 4, killed $vgpr2, 0, implicit $mode, implicit $exec
+    early-clobber renamable $vgpr4 = V_CVT_SCALEF32_SR_PK_FP4_F16_e64 8, killed $vgpr0, 0, killed $vgpr3, 4, killed $vgpr1, killed $vgpr2, 0, implicit $mode, implicit $exec
+    $vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec, implicit $exec
+    S_SETPC_B64_return undef $sgpr30_sgpr31, implicit killed $vgpr0
+...
+
 ---
----------------
arsenm wrote:

A few negative tests for the fp4 sources?

https://github.com/llvm/llvm-project/pull/118813


More information about the llvm-commits mailing list