[PATCH] D119749: [AMDGPU] Fix AGPR offset for waitcnt

Joe Nash via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 11:29:40 PST 2022


Joe_Nash updated this revision to Diff 408527.
Joe_Nash added a comment.

pre-commit test separately and update summary


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119749/new/

https://reviews.llvm.org/D119749

Files:
  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
  llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir


Index: llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
+++ llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
@@ -335,7 +335,6 @@
 
 ---
 # agpr should be disjoint and tracked separately from vgpr
-# vgpr226 and agpr0 erroneously share waitcnt storage index, so a waitcnt is inserted before store of agpr0 when it is not needed
 
 name: high_register_collision
 
@@ -347,7 +346,6 @@
     ; GCN-NEXT: $vgpr226 = FLAT_LOAD_DWORD $vgpr6_vgpr7, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: $vgpr4_vgpr5 = V_LSHLREV_B64_e64 4, $vgpr8_vgpr9, implicit $exec
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr1, 0, 0, implicit $exec, implicit $flat_scr
-    ; GCN-NEXT: S_WAITCNT 112
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr0, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: S_ENDPGM 0
     $agpr0 = V_ACCVGPR_MOV_B32 $agpr1, implicit $exec
Index: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -119,7 +119,7 @@
 // special tokens like SCMEM_LDS (needed for buffer load to LDS).
 enum RegisterMapping {
   SQ_MAX_PGM_VGPRS = 512, // Maximum programmable VGPRs across all targets.
-  AGPR_OFFSET = 226, // Maximum programmable ArchVGPRs across all targets.
+  AGPR_OFFSET = 256,      // Maximum programmable ArchVGPRs across all targets.
   SQ_MAX_PGM_SGPRS = 256, // Maximum programmable SGPRs across all targets.
   NUM_EXTRA_VGPRS = 1,    // A reserved slot for DS.
   EXTRA_VGPR_LDS = 0,     // This is a placeholder the Shader algorithm uses.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D119749.408527.patch
Type: text/x-patch
Size: 1741 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220214/fa9e8f26/attachment.bin>


More information about the llvm-commits mailing list