[llvm] [AMDGPU] Increase inline threshold when the callee only has one live use (PR #111311)

via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 7 02:17:26 PDT 2024


llvmbot wrote:


@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

<details>
<summary>Changes</summary>

Currently we do not inline a large function even when it has only one live use.
This can significantly hurt performance because CSR (callee-saved register)
spills are very expensive. This PR tries to force inlining in that case by
raising the inlining threshold by a configurable amount when the callee has
local linkage and only one live use. The default is 15000, borrowed from
`InlineConstants::LastCallToStaticBonus`. I'm not sure whether this is a good
number, or whether this is the right way to do it. With this change, the callee
in my local test case can finally be inlined, but the cost is still very close
to the threshold: `cost=14010, threshold=170775`.
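
As a rough illustration of the adjustment described above, here is a minimal standalone sketch that does not depend on LLVM. `CalleeInfo` and `adjustForOneLiveUse` are hypothetical names introduced only for this example; the null check models an indirect call with no known callee and is an assumption of the sketch, not something the patch itself does.

```cpp
// Hypothetical stand-in for llvm::Function; only the two properties the
// patch queries are modeled here.
struct CalleeInfo {
  bool HasLocalLinkage;  // models Function::hasLocalLinkage()
  unsigned LiveUseCount; // models the number of live uses of the function
};

// Mirrors the default of -amdgpu-inline-threshold-one-live-use in the patch.
constexpr unsigned InlineThresholdOneLiveUse = 15000;

// Returns the threshold after applying the one-live-use bonus.
// A null Callee models an indirect call (assumption of this sketch); in that
// case the threshold is returned unchanged.
unsigned adjustForOneLiveUse(unsigned Threshold, const CalleeInfo *Callee) {
  if (Callee && Callee->HasLocalLinkage && Callee->LiveUseCount == 1)
    Threshold += InlineThresholdOneLiveUse;
  return Threshold;
}
```

In this model the bonus applies only when both conditions hold, so an externally visible function, or a local function with more than one live use, keeps its original threshold.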

As for testing: how should we test this? Do we want to check in a giant IR
file?

Fixes SWDEV-471398.

---
Full diff: https://github.com/llvm/llvm-project/pull/111311.diff


1 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp (+10) 


``````````diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
index d348166c2d9a04..debc3db78974ad 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -75,6 +75,10 @@ static cl::opt<size_t> InlineMaxBB(
     cl::desc("Maximum number of BBs allowed in a function after inlining"
              " (compile time constraint)"));
 
+static cl::opt<unsigned> InlineThresholdOneLiveUse(
+    "amdgpu-inline-threshold-one-live-use", cl::Hidden, cl::init(15000),
+    cl::desc("Threshold added when the callee only has one live use"));
+
 static bool dependsOnLocalPhi(const Loop *L, const Value *Cond,
                               unsigned Depth = 0) {
   const Instruction *I = dyn_cast<Instruction>(Cond);
@@ -1307,6 +1311,12 @@ unsigned GCNTTIImpl::adjustInliningThreshold(const CallBase *CB) const {
   unsigned AllocaSize = getCallArgsTotalAllocaSize(CB, DL);
   if (AllocaSize > 0)
     Threshold += ArgAllocaCost;
+
+  // Increase the threshold if it is the only call to a local function.
+  Function *Callee = CB->getCalledFunction();
+  if (Callee->hasLocalLinkage() && Callee->hasOneLiveUse())
+    Threshold += InlineThresholdOneLiveUse;
+
   return Threshold;
 }
 

``````````

</details>


https://github.com/llvm/llvm-project/pull/111311

