[PATCH] D122804: [AMDGPU] Only count global-to-global as indirect accesses

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 31 05:55:57 PDT 2022


foad created this revision.
Herald added subscribers: hsmhsm, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
Herald added a project: All.
foad requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.

As a side effect this makes it easier to write small kernels test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122804

Files:
  llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/perfhint.ll
  llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll


Index: llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
+++ llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit2.ll
@@ -6,8 +6,8 @@
 ; SI-MINREG: NumSgprs: {{[1-9]$}}
 ; SI-MINREG: NumVgprs: {{[1-9]$}}
 
-; SI-MAXOCC: NumSgprs: {{[0-4][0-9]$}}
-; SI-MAXOCC: NumVgprs: {{[0-4][0-9]$}}
+; SI-MAXOCC: NumSgprs: {{[1-4]?[0-9]$}}
+; SI-MAXOCC: NumVgprs: {{[1-4]?[0-9]$}}
 
 ; stores may alias loads
 ; VI: NumSgprs: {{[0-9]$}}
Index: llvm/test/CodeGen/AMDGPU/perfhint.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/perfhint.ll
+++ llvm/test/CodeGen/AMDGPU/perfhint.ll
@@ -75,10 +75,9 @@
   ret void
 }
 
-; FIXME: This test was intended to be WaveLimiterHint : 0
 ; GCN-LABEL: {{^}}test_indirect_through_phi:
 ; GCN: MemoryBound: 0
-; GCN: WaveLimiterHint : 1
+; GCN: WaveLimiterHint : 0
 define amdgpu_kernel void @test_indirect_through_phi(float addrspace(1)* %arg) {
 bb:
   %load = load float, float addrspace(1)* %arg, align 8
Index: llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp
@@ -153,7 +153,7 @@
 
     if (auto LD = dyn_cast<LoadInst>(V)) {
       auto M = LD->getPointerOperand();
-      if (isGlobalAddr(M) || isLocalAddr(M) || isConstantAddr(M)) {
+      if (isGlobalAddr(M)) {
         LLVM_DEBUG(dbgs() << "    is IA\n");
         return true;
       }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D122804.419413.patch
Type: text/x-patch
Size: 1631 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220331/fe75501a/attachment.bin>


More information about the llvm-commits mailing list