[PATCH] D126025: AMDGPU: allow reordering of functions in AMDGPUResourceUsageAnalysis

Jacob Weightman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 19 15:19:13 PDT 2022


jweightman created this revision.
jweightman added a reviewer: AMDGPU.
jweightman added a project: AMDGPU.
Herald added subscribers: kosarev, jsilvanus, hsmhsm, foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
Herald added a project: All.
jweightman requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

AMDGPUResourceUsageAnalysis was previously a CGSCC pass, and assumed
that a function's callees were always analyzed prior to their callers.
Since it was refactored into a module pass, this assumption no longer
always holds. As a result, direct calls are erroneously identified as
indirect and private segment space is reserved for them, which
significantly increases kernel launch latency.

This patch changes the order in which the module's functions are analyzed
from the order in which they occur in the module to a post-order traversal
of the call graph. Perhaps Clang always generates the module's functions
in such an order, but this is not the case for the Cray Fortran compiler.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D126025

Files:
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/hsa-metadata-resource-usage-function-ordering.ll


Index: llvm/test/CodeGen/AMDGPU/hsa-metadata-resource-usage-function-ordering.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/hsa-metadata-resource-usage-function-ordering.ll
@@ -0,0 +1,28 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 --amdhsa-code-object-version=4 -enable-misched=0 -filetype=obj -o - < %s | llvm-readelf --notes - | FileCheck %s
+; RUN: llc -mattr=-xnack -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=4 -mcpu=gfx803 -enable-misched=0 -filetype=obj -o - < %s | llvm-readelf --notes - | FileCheck %s
+; RUN: llc -mattr=-xnack -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=4 -mcpu=gfx900 -enable-misched=0 -filetype=obj -o - < %s | llvm-readelf --notes - | FileCheck %s
+; RUN: llc -mattr=-xnack -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=4 -mcpu=gfx1010 -enable-misched=0 -filetype=obj -o - < %s | llvm-readelf --notes - | FileCheck %s
+
+; CHECK: amdhsa.kernels:
+
+; test a kernel that occurs before its callee in the module
+; CHECK:   - .args:
+; CHECK:     .private_segment_fixed_size: 0
+define amdgpu_kernel void @test1() {
+  call void @f()
+  ret void
+}
+
+define void @f() #0 {
+  ret void
+}
+
+; test a kernel that occurs after its callee in the module
+; CHECK:   - .args:
+; CHECK:     .private_segment_fixed_size: 0
+define amdgpu_kernel void @test2() {
+  call void @f()
+  ret void
+}
+
+attributes #0 = { norecurse }
\ No newline at end of file
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -27,6 +27,7 @@
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "SIMachineFunctionInfo.h"
+#include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
@@ -105,15 +106,19 @@
   const TargetMachine &TM = TPC->getTM<TargetMachine>();
   bool HasIndirectCall = false;
 
-  for (Function &F : M) {
-    if (F.isDeclaration())
+  CallGraph CG = CallGraph(M);
+  auto End = po_end(&CG);
+
+  for (auto IT = po_begin(&CG); IT != End; ++IT) {
+    Function *F = IT->getFunction();
+    if (!F || F->isDeclaration())
       continue;
 
-    MachineFunction *MF = MMI.getMachineFunction(F);
+    MachineFunction *MF = MMI.getMachineFunction(*F);
     assert(MF && "function must have been generated already");
 
     auto CI = CallGraphResourceInfo.insert(
-        std::make_pair(&F, SIFunctionResourceInfo()));
+        std::make_pair(F, SIFunctionResourceInfo()));
     SIFunctionResourceInfo &Info = CI.first->second;
     assert(CI.second && "should only be called once per function");
     Info = analyzeResourceUsage(*MF, TM);

