[llvm] [AMDGPU] Add test for GCNRegPressure tracker bug (PR #73786)

Valery Pykhtin via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 29 04:44:06 PST 2023


================
@@ -531,3 +531,126 @@ body:             |
     %1:vgpr_32 = V_MOV_B32_e32 %0, implicit $exec
     S_NOP 0, implicit %1
 ...
+---
+name: movrel
+tracksRegLiveness: true
+body: |
+  ; RPU-LABEL: name: movrel
+  ; RPU: bb.0:
+  ; RPU-NEXT:   Live-in:
+  ; RPU-NEXT:   SGPR  VGPR
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      $sgpr0 = COPY $sgpr1
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      $sgpr2_sgpr3 = S_GETPC_B64
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      $sgpr1 = COPY killed $sgpr3
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      $sgpr0_sgpr1_sgpr2_sgpr3 = S_LOAD_DWORDX4_IMM $sgpr0_sgpr1, 0, 0
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      $sgpr0 = S_BUFFER_LOAD_DWORD_IMM $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      S_CBRANCH_SCC1 %bb.2, implicit $scc
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     0      S_BRANCH %bb.1
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   Live-out:
+  ; RPU-NEXT: bb.1:
+  ; RPU-NEXT:   Live-in:
+  ; RPU-NEXT:   SGPR  VGPR
+  ; RPU-NEXT:   0     0
+  ; RPU-NEXT:   0     1      undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      $m0 = S_MOV_B32 killed $sgpr0
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      %0:vreg_512 = V_INDIRECT_REG_WRITE_MOVREL_B32_V16 %0:vreg_512(tied-def 0), 42, 3, implicit $m0, implicit $exec
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   Live-out: %0:0000000000000C00
+  ; RPU-NEXT: bb.2:
+  ; RPU-NEXT:   Live-in:  %0:0000000000000C00
+  ; RPU-NEXT:   SGPR  VGPR
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      %1:vgpr_32 = V_CVT_F32_UBYTE0_e64 %0.sub5:vreg_512, 0, 0, implicit $exec
----------------
vpykhtin wrote:

RPD accounts for the whole _vreg_512_ "at" the instruction level, though. If some of the lanes are reused after the instruction, they should already have been decremented from the pressure. It looks like the RPD approach is more correct here.
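
To put rough numbers on this, here is a minimal C++ sketch (not code from the PR or from LLVM). It assumes that each 32-bit VGPR subregister owns two adjacent lane-mask bits, which is consistent with the `%0:0000000000000C00` live-out mask in the test, where only `sub5` is live. The helper names `vgprsForLaneMask` and `fullLaneMask` are hypothetical and exist only for illustration.

```cpp
#include <bitset>
#include <cstdint>

// Assumption: every 32-bit VGPR subregister covers two lane-mask bits
// (low and high 16-bit halves), so sub5 maps to bits 10 and 11 (0xC00).
constexpr unsigned LaneBitsPerVGPR = 2;

// Hypothetical helper: number of 32-bit VGPRs touched by a lane mask.
unsigned vgprsForLaneMask(std::uint64_t LaneMask) {
  unsigned Bits = static_cast<unsigned>(std::bitset<64>(LaneMask).count());
  // Round up so that a half-covered register still counts as one VGPR.
  return (Bits + LaneBitsPerVGPR - 1) / LaneBitsPerVGPR;
}

// Hypothetical helper: lane mask covering a tuple of NumVGPRs registers.
std::uint64_t fullLaneMask(unsigned NumVGPRs) {
  unsigned Bits = NumVGPRs * LaneBitsPerVGPR;
  return Bits >= 64 ? ~0ULL : ((1ULL << Bits) - 1);
}
```

Under these assumptions, `vgprsForLaneMask(0xC00)` counts only the `sub5` lanes, while `vgprsForLaneMask(fullLaneMask(16))` counts the whole _vreg_512_ that the movrel instruction reads and writes, so the two accounting approaches can differ by up to 15 VGPRs at that instruction.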

https://github.com/llvm/llvm-project/pull/73786

