[llvm] [AMDGPU] Add support for preloading implicit kernel arguments (PR #83817)

Austin Kerbow via llvm-commits llvm-commits at lists.llvm.org
Sun May 5 22:56:13 PDT 2024


================
@@ -64,6 +64,86 @@ class PreloadKernelArgInfo {
     NumFreeUserSGPRs -= (NumPreloadSGPRs + PaddingSGPRs);
     return true;
   }
+
+  // Try to allocate SGPRs to preload implicit kernel arguments.
+  void tryAllocImplicitArgPreloadSGPRs(unsigned ImplicitArgsBaseOffset,
+                                       IRBuilder<> &Builder) {
+    IntrinsicInst *ImplicitArgPtr = nullptr;
+    for (Function::iterator B = F.begin(), BE = F.end(); B != BE; ++B) {
+      for (BasicBlock::iterator I = B->begin(), IE = B->end(); I != IE; ++I) {
+        if (IntrinsicInst *CI = dyn_cast<IntrinsicInst>(I))
+          if (CI->getIntrinsicID() == Intrinsic::amdgcn_implicitarg_ptr) {
+            ImplicitArgPtr = CI;
+            break;
+          }
+      }
+    }
+
+    if (!ImplicitArgPtr)
+      return;
+
+    const DataLayout &DL = F.getParent()->getDataLayout();
+    // Pair is the load and the load offset.
+    SmallVector<std::pair<LoadInst *, unsigned>, 4> ImplicitArgLoads;
+    for (auto *U : ImplicitArgPtr->users()) {
+      if (!U->hasOneUse())
+        continue;
----------------
kerbowa wrote:

`U` is either a GEP with one user which should be the load of the implicit argument, or `U` is a load that uses the implicitarg ptr directly. This is the same logic that is in AMDGPULowerKernelAttributes, but come to think of it maybe we should avoid the check for a single user in both places because in the case where we directly use the implicitarg ptr that load may have multiple uses. I guess it's only relevant for block_count_x though (offset 0).

https://github.com/llvm/llvm-project/pull/83817


More information about the llvm-commits mailing list