[llvm] c87c61c - [AMDGPU] Fix AGPR offset for waitcnt

Joe Nash via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 12:38:20 PST 2022


Author: Joe Nash
Date: 2022-02-14T15:16:21-05:00
New Revision: c87c61c52cad32576597fb6de764863f21b2ee7e

URL: https://github.com/llvm/llvm-project/commit/c87c61c52cad32576597fb6de764863f21b2ee7e
DIFF: https://github.com/llvm/llvm-project/commit/c87c61c52cad32576597fb6de764863f21b2ee7e.diff

LOG: [AMDGPU] Fix AGPR offset for waitcnt

An enum value stores the offset between AGPR ranges and VGPR
ranges in the internal storage of SIInsertWaitcnts. It said 226 when
it should say 256, causing some portion of the ranges to overlap. That
in turn causes 'aliasing' between the registers, potentially inserting
waitcnts that are not required.
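
The overlap described above can be illustrated with a small standalone
sketch. It is a simplified model, not the actual SIInsertWaitcnts code: the
helper names (vgprSlot, agprSlot, overlappingSlots) are hypothetical, and it
assumes 256 AGPRs, inferred from SQ_MAX_PGM_VGPRS (512) minus the 256
ArchVGPRs.

```cpp
#include <cassert>

// Simplified model of the waitcnt storage indexing; NOT the real LLVM code.
// All names below are hypothetical and exist only for illustration.
constexpr unsigned kNumArchVgprs = 256;    // per the AGPR_OFFSET comment
constexpr unsigned kNumAgprs = 256;        // assumption: 512 total - 256 ArchVGPRs
constexpr unsigned kBuggyAgprOffset = 226; // value before this commit
constexpr unsigned kFixedAgprOffset = 256; // value after this commit

// Storage slot tracking vgpr<N>.
constexpr unsigned vgprSlot(unsigned N) { return N; }

// Storage slot tracking agpr<N> for a given AGPR offset.
constexpr unsigned agprSlot(unsigned N, unsigned Offset) { return N + Offset; }

// Number of slots shared by the VGPR range [0, kNumArchVgprs) and the AGPR
// range that starts at Offset.
constexpr unsigned overlappingSlots(unsigned Offset) {
  return Offset < kNumArchVgprs ? kNumArchVgprs - Offset : 0;
}

// With the old offset, vgpr226 and agpr0 land on the same slot.
static_assert(vgprSlot(226) == agprSlot(0, kBuggyAgprOffset),
              "old offset aliases vgpr226 with agpr0");
// With the new offset, the two ranges are disjoint.
static_assert(overlappingSlots(kFixedAgprOffset) == 0,
              "new offset keeps the ranges disjoint");
```

Under these assumptions the old value makes the top 30 VGPR slots
(vgpr226..vgpr255) coincide with agpr0..agpr29, which is the collision the
high_register_collision test exercises.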

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D119749

Added: 
    

Modified: 
    llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
    llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index f8a10bc8ef6f..8508a3bfc5c2 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -119,7 +119,7 @@ static const unsigned WaitEventMaskForInst[NUM_INST_CNTS] = {
 // special tokens like SCMEM_LDS (needed for buffer load to LDS).
 enum RegisterMapping {
   SQ_MAX_PGM_VGPRS = 512, // Maximum programmable VGPRs across all targets.
-  AGPR_OFFSET = 226, // Maximum programmable ArchVGPRs across all targets.
+  AGPR_OFFSET = 256,      // Maximum programmable ArchVGPRs across all targets.
   SQ_MAX_PGM_SGPRS = 256, // Maximum programmable SGPRs across all targets.
   NUM_EXTRA_VGPRS = 1,    // A reserved slot for DS.
   EXTRA_VGPR_LDS = 0,     // This is a placeholder the Shader algorithm uses.

diff  --git a/llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir b/llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
index 83b6d5b749e1..9841a8cd0b10 100644
--- a/llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
+++ b/llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir
@@ -335,7 +335,6 @@ body: |
 
 ---
 # agpr should be disjoint and tracked separately from vgpr
-# vgpr226 and agpr0 erroneously share waitcnt storage index, so a waitcnt is inserted before store of agpr0 when it is not needed
 
 name: high_register_collision
 
@@ -347,7 +346,6 @@ body: |
     ; GCN-NEXT: $vgpr226 = FLAT_LOAD_DWORD $vgpr6_vgpr7, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: $vgpr4_vgpr5 = V_LSHLREV_B64_e64 4, $vgpr8_vgpr9, implicit $exec
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr1, 0, 0, implicit $exec, implicit $flat_scr
-    ; GCN-NEXT: S_WAITCNT 112
     ; GCN-NEXT: FLAT_STORE_DWORD $vgpr4_vgpr5, $agpr0, 0, 0, implicit $exec, implicit $flat_scr
     ; GCN-NEXT: S_ENDPGM 0
     $agpr0 = V_ACCVGPR_MOV_B32 $agpr1, implicit $exec
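
For readers unfamiliar with the immediate in the removed S_WAITCNT 112 line,
the sketch below decodes it. It assumes the gfx9-style field layout (vmcnt in
bits [3:0] and [15:14], expcnt in bits [6:4], lgkmcnt in bits [11:8]); the
commit does not state the test's target, and the struct and function names
are hypothetical.

```cpp
#include <cassert>

// Assumed gfx9-style S_WAITCNT immediate layout (not stated in the commit):
//   vmcnt   = bits [3:0] (low) and bits [15:14] (high)
//   expcnt  = bits [6:4]
//   lgkmcnt = bits [11:8]
struct DecodedWaitcnt {
  unsigned VmCnt;
  unsigned ExpCnt;
  unsigned LgkmCnt;
};

// Hypothetical helper: split the immediate into its three counters.
constexpr DecodedWaitcnt decodeWaitcntGfx9(unsigned Imm) {
  return DecodedWaitcnt{(Imm & 0xF) | (((Imm >> 14) & 0x3) << 4),
                        (Imm >> 4) & 0x7,
                        (Imm >> 8) & 0xF};
}

// 112 == 0x70: only the expcnt field is non-zero.
static_assert(decodeWaitcntGfx9(112).VmCnt == 0, "vmcnt(0)");
static_assert(decodeWaitcntGfx9(112).ExpCnt == 7, "expcnt at its maximum");
static_assert(decodeWaitcntGfx9(112).LgkmCnt == 0, "lgkmcnt(0)");
```

Under that assumed layout, 112 reads as a wait for vmcnt and lgkmcnt to reach
zero, with expcnt left at its maximum (no wait). That is the full stall on the
pending FLAT_LOAD_DWORD which the store of $agpr0 no longer needs once
vgpr226 and agpr0 stop sharing a slot.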


        

