[llvm] c246b7b - [AMDGPU] Only count global-to-global as indirect accesses

Fri Apr 1 05:51:05 PDT 2022

Author: Jay Foad
Date: 2022-04-01T13:48:13+01:00
New Revision: c246b7bd4a5191d48f68ce12b50e03bfadd2a0b5

URL: https://github.com/llvm/llvm-project/commit/c246b7bd4a5191d48f68ce12b50e03bfadd2a0b5
DIFF: https://github.com/llvm/llvm-project/commit/c246b7bd4a5191d48f68ce12b50e03bfadd2a0b5.diff

LOG: [AMDGPU] Only count global-to-global as indirect accesses

Previously, any load (global, local or constant) feeding into a
global load or store was counted as an indirect access. This
patch counts only global loads feeding into a global load or store.
The rationale is that the latency of global loads is generally
much larger than that of local or constant loads.

As a side effect, this makes it easier to write small kernel test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.
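
The change in classification can be sketched as a standalone model.
This is not the code in AMDGPUPerfHintAnalysis.cpp: the AddrSpace enum
and the two function names below are hypothetical and stand in for the
isGlobalAddr/isLocalAddr/isConstantAddr checks on the pointer operand
of the load that feeds a global load or store.

```cpp
// Hypothetical model of the address space of the load that feeds the
// address of a global load or store. Not an LLVM type.
enum class AddrSpace { Global, Local, Constant };

// Old rule (assumed model): a feeding load from any of the three
// address spaces marks the access as indirect.
constexpr bool isIndirectBefore(AddrSpace FeedingLoad) {
  return FeedingLoad == AddrSpace::Global ||
         FeedingLoad == AddrSpace::Local ||
         FeedingLoad == AddrSpace::Constant;
}

// New rule (assumed model): only a feeding global load marks the
// access as indirect.
constexpr bool isIndirectAfter(AddrSpace FeedingLoad) {
  return FeedingLoad == AddrSpace::Global;
}

// A kernel argument fetched with an SMEM load is modelled here as a
// Constant feeding load: indirect under the old rule, not the new one.
static_assert(isIndirectBefore(AddrSpace::Constant), "old rule");
static_assert(!isIndirectAfter(AddrSpace::Constant), "new rule");
```

Under this model, global-to-global chains are the only case that is
still counted, which matches the single-condition check in the diff
below.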

Differential Revision: https://reviews.llvm.org/D122804

Added: 
    

Modified: 
    llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
    llvm/test/CodeGen/AMDGPU/perfhint.ll
    llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
index de97b76b1e093..b994b53c21db0 100644

--- a/llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
@@ -153,7 +153,7 @@ bool AMDGPUPerfHint::isIndirectAccess(const Instruction *Inst) const {
 
     if (auto LD = dyn_cast<LoadInst>(V)) {
       auto M = LD->getPointerOperand();
-      if (isGlobalAddr(M) || isLocalAddr(M) || isConstantAddr(M)) {
+      if (isGlobalAddr(M)) {
         LLVM_DEBUG(dbgs() << "    is IA\n");
         return true;
       }

diff  --git a/llvm/test/CodeGen/AMDGPU/perfhint.ll b/llvm/test/CodeGen/AMDGPU/perfhint.ll
index 2fe01e8f55aeb..296eeabf0ae5d 100644
--- a/llvm/test/CodeGen/AMDGPU/perfhint.ll
+++ b/llvm/test/CodeGen/AMDGPU/perfhint.ll
@@ -75,10 +75,9 @@ bb:
   ret void
 }
 
-; FIXME: This test was intended to be WaveLimiterHint : 0
 ; GCN-LABEL: {{^}}test_indirect_through_phi:
 ; GCN: MemoryBound: 0
-; GCN: WaveLimiterHint : 1
+; GCN: WaveLimiterHint : 0
 define amdgpu_kernel void @test_indirect_through_phi(float addrspace(1)* %arg) {
 bb:
   %load = load float, float addrspace(1)* %arg, align 8

diff  --git a/llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll b/llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
index d8dac0b1d36bd..e209f9e6196d5 100644
--- a/llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
+++ b/llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
@@ -6,8 +6,8 @@
 ; SI-MINREG: NumSgprs: {{[1-9]$}}
 ; SI-MINREG: NumVgprs: {{[1-9]$}}
 
-; SI-MAXOCC: NumSgprs: {{[0-4][0-9]$}}
-; SI-MAXOCC: NumVgprs: {{[0-4][0-9]$}}
+; SI-MAXOCC: NumSgprs: {{[1-4]?[0-9]$}}
+; SI-MAXOCC: NumVgprs: {{[1-4]?[0-9]$}}
 
 ; stores may alias loads
 ; VI: NumSgprs: {{[0-9]$}}