[llvm] [AMDGPU] Remove s_delay_alu for VALU->SGPR->SALU (PR #127212)

via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 20 04:50:00 PST 2025


================
@@ -559,3 +560,79 @@ body: |
     $vgpr0 = V_WRITELANE_B32 $sgpr0, 3, $vgpr0
     $vgpr0 = V_ADD_U32_e32 $vgpr0, $vgpr0, implicit $exec
 ...
+
+# Check if s_delay_alu is added
+---
+name: redundant_delay_alu_1
+body: |
+  bb.0:
+    ; CHECK-LABEL: redundant_delay_alu_1:
+    ; CHECK:       ; %bb.0:
+    ; CHECK-NEXT:    v_cmp_eq_u32_e64 s[0:1], s0, s1
+    ; CHECK-NEXT:    v_mul_f32_e64 v0, v0, v0
+    ; CHECK-NEXT:    s_or_b32 s0, s0, s1
+    ; CHECK-NEXT:    v_mul_f32_e64 v0, v0, v0
+    $sgpr0_sgpr1 = V_CMP_EQ_U32_e64 $sgpr0, $sgpr1, implicit $exec
+    $vgpr0 = V_MUL_F32_e64 0, $vgpr0, 0, $vgpr0, 0, 0, implicit $mode, implicit $exec
+    $sgpr0= S_OR_B32 $sgpr0, $sgpr1, implicit-def $scc
+    $vgpr0 = V_MUL_F32_e64 0, $vgpr0, 0, $vgpr0, 0, 0, implicit $mode, implicit $exec
+...
+
+# Check if s_delay_alu is added
----------------
mihajlovicana wrote:

I am confused about why we need s_delay_alu here. When I look at the state after each instruction, it shows that the second sgpr write waits for 3 cycles, which reduces the remaining cycles of the first vgpr write by 3, leaving it at 1. After the second sgpr write is issued, the remaining cycles for that write drop to 0, and it is removed from the list.
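
To make the arithmetic above concrete, here is a minimal toy model of the countdown. It is not the LLVM pass: the `advance` helper and the `first_write` key are hypothetical names, the starting value of 4 is only inferred from "reduced by 3, leaving 1", and charging 1 cycle for issuing an instruction is an assumption.

```python
# Toy model only (not LLVM code): remaining cycles for outstanding writes.

def advance(pending, cycles):
    """Subtract `cycles` from every entry and drop entries that reach 0."""
    return {
        reg: left - cycles
        for reg, left in pending.items()
        if left - cycles > 0
    }

# Assumed starting point: the earlier write still has 4 cycles outstanding.
pending = {"first_write": 4}

# The later write waits for 3 cycles, so the earlier write drops to 1.
after_wait = advance(pending, 3)

# Issuing the later write is assumed to cost 1 more cycle, so the earlier
# write reaches 0 and is removed from the list.
after_issue = advance(after_wait, 1)

print(after_wait, after_issue)
```

If the model matches the real tracking, nothing is left outstanding by the time the next dependent instruction is considered, which is the basis for asking whether the delay is needed.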

https://github.com/llvm/llvm-project/pull/127212

