[PATCH] D119749: [AMDGPU] Fix AGPR offset for waitcnt

Joe Nash via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 11:22:04 PST 2022


Joe_Nash created this revision.
Joe_Nash added reviewers: rampitec, foad, bsaleil, kosarev.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
Joe_Nash requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

[AMDGPU] Pre-commit test for wait between agpr & vgpr

Because AGPR_OFFSET was mistyped as 226 instead of 256, the
SIInsertWaitcnts pass treats several register pairs as aliased for
waitcnt purposes, including vgpr226 and agpr0, vgpr227 and agpr1, and so on.

This is a test of the behavior.
NFC.

[AMDGPU] Fix AGPR offset for waitcnt

An enum value stores the offset between AGPR ranges and VGPR
ranges in the internal storage of SIInsertWaitcnts. It was 226 when
it should be 256, so part of the two ranges overlapped. The overlap
makes the affected registers 'alias' each other, which can insert
waitcnts that are not required.
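
The overlap can be illustrated with a small standalone sketch. It is
not the code in SIInsertWaitcnts.cpp: the names slotIndex and
countCollisions are hypothetical, and the model assumes 256 ArchVGPRs
and 256 AGPRs, with a VGPR stored at its hardware index and an AGPR
stored at its hardware index plus the offset.

```cpp
// Hypothetical model of the slot numbering described above; these names are
// illustrative and are not the ones used in SIInsertWaitcnts.cpp.
constexpr unsigned NumArchVGPRs = 256; // assumed: vgpr0..vgpr255
constexpr unsigned NumAGPRs = 256;     // assumed: agpr0..agpr255

// A VGPR keeps its hardware index; an AGPR is shifted up by AGPROffset.
constexpr unsigned slotIndex(unsigned HwIndex, bool IsAGPR,
                             unsigned AGPROffset) {
  return IsAGPR ? HwIndex + AGPROffset : HwIndex;
}

// Count the AGPRs whose slot lands inside the VGPR range [0, NumArchVGPRs).
constexpr unsigned countCollisions(unsigned AGPROffset) {
  unsigned Collisions = 0;
  for (unsigned A = 0; A < NumAGPRs; ++A)
    if (slotIndex(A, /*IsAGPR=*/true, AGPROffset) < NumArchVGPRs)
      ++Collisions;
  return Collisions;
}

// With an offset of 226, agpr0 shares a slot with vgpr226 and agpr1 with
// vgpr227, matching the pairs named in the pre-commit test description.
static_assert(slotIndex(0, true, 226) == slotIndex(226, false, 226),
              "agpr0 aliases vgpr226 when the offset is 226");
static_assert(slotIndex(1, true, 226) == slotIndex(227, false, 226),
              "agpr1 aliases vgpr227 when the offset is 226");

// With an offset of 256, no AGPR slot falls inside the VGPR range.
static_assert(countCollisions(256) == 0,
              "no aliasing when the offset is 256");
```

Under these assumptions, an offset of 256 also keeps the highest AGPR
slot at 511, which fits within SQ_MAX_PGM_VGPRS = 512.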


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D119749

Files:
  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
  llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir


Index: llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
+++ llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
@@ -335,7 +335,6 @@
 
 ---
 # agpr should be disjoint and tracked separately from vgpr
-# vgpr226 and agpr0 erroneously share waitcnt storage index, so a waitcnt is inserted before store of agpr0 when it is not needed
 
 name: high_register_collision
 
@@ -347,7 +346,6 @@
     ; GCN-NEXT: $vgpr226 = FLAT_LOAD_DWORD $vgpr6_vgpr7, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: $vgpr4_vgpr5 = V_LSHLREV_B64_e64 4, $vgpr8_vgpr9, implicit $exec
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr1, 0, 0, implicit $exec, implicit $flat_scr
-    ; GCN-NEXT: S_WAITCNT 112
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr0, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: S_ENDPGM 0
     $agpr0 = V_ACCVGPR_MOV_B32 $agpr1, implicit $exec
Index: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -119,7 +119,7 @@
 // special tokens like SCMEM_LDS (needed for buffer load to LDS).
 enum RegisterMapping {
   SQ_MAX_PGM_VGPRS = 512, // Maximum programmable VGPRs across all targets.
-  AGPR_OFFSET = 226, // Maximum programmable ArchVGPRs across all targets.
+  AGPR_OFFSET = 256,      // Maximum programmable ArchVGPRs across all targets.
   SQ_MAX_PGM_SGPRS = 256, // Maximum programmable SGPRs across all targets.
   NUM_EXTRA_VGPRS = 1,    // A reserved slot for DS.
   EXTRA_VGPR_LDS = 0,     // This is a placeholder the Shader algorithm uses.

