[PATCH] D140242: [AMDGPU] Modify adjustInliningThreshold to also consider the cost of passing function arguments through the stack
Janek van Oirschot via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 12 08:28:16 PST 2023
JanekvO added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:86-88
+static cl::opt<unsigned> ArgStackInlinePenalty(
+ "amdgpu-inline-arg-stack-cost", cl::Hidden, cl::init(15),
+ cl::desc("Cost per argument for function arguments passed through stack"));
----------------
arsenm wrote:
> Should be able to compute this directly from the existing costs for stack stores
Sorry, I couldn't find any constant or cl option for stack store costs. Did you have anything in mind to replace this cl option?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1189-1191
+ // Outer kernel functions can't be inlined.
+ if (llvm::AMDGPU::isKernelCC(Callee))
+ return 0;
----------------
arsenm wrote:
> No reason to specially consider them?
Removed, since inlining never has a kernel function as the callee.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1204
+ adjustThreshold += std::max(0, SGPRsInUse - 26) * ArgStackInlinePenalty;
+ adjustThreshold += std::max(0, VGPRsInUse - 32) * ArgStackInlinePenalty;
+ return adjustThreshold;
----------------
scchan wrote:
> arsenm wrote:
> > scchan wrote:
> > > I guess it's subtracting the number of clobbered registers - instead of a hardcoded value, could that be replaced by something more meaningful like a const variable or a getter?
> > >
> > > Also shouldn't VGPRs have a higher penalty relative to SGPRs since they'd occupy more stack space?
> > We only sort of handle SGPR arguments today, and not for compute. We also do not currently implement the optimization of packing SGPRs into a VGPR for the argument spill
> I wasn't paying attention to the comments for ArgStackInlinePenalty. The cost model is only based on the number of instructions and it doesn't take storage into account.
I couldn't infer what unit the inliner cost/threshold is measured in, so I based the penalty on the cost of a single instruction. Do let me know if storage cost should also be considered (and if so, by how much).
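For reference, the arithmetic in the quoted snippet can be sketched in isolation as below. This is only an illustration, not the code in the patch: the function name `argStackThresholdBonus` and the `k`-prefixed constants are hypothetical names. The values 26 and 32 are the hardcoded ones from the snippet and 15 is the default of `amdgpu-inline-arg-stack-cost`. The sketch assumes every argument register beyond those limits costs the same flat per-argument penalty, with no extra weight for VGPRs.

```cpp
#include <algorithm>

// Hypothetical names; the values mirror the hardcoded numbers in the snippet.
constexpr int kSGPRLimit = 26;                  // SGPR count subtracted in the patch
constexpr int kVGPRLimit = 32;                  // VGPR count subtracted in the patch
constexpr unsigned kArgStackInlinePenalty = 15; // default of amdgpu-inline-arg-stack-cost

// Returns the amount added to the inlining threshold for arguments that
// would not fit in registers and would be passed through the stack.
unsigned argStackThresholdBonus(int SGPRsInUse, int VGPRsInUse) {
  // Clamp at zero so a callee that fits in registers gets no bonus.
  const int SGPROverflow = std::max(0, SGPRsInUse - kSGPRLimit);
  const int VGPROverflow = std::max(0, VGPRsInUse - kVGPRLimit);
  // Assumption: SGPR and VGPR overflow are penalised equally per argument.
  return static_cast<unsigned>(SGPROverflow + VGPROverflow) *
         kArgStackInlinePenalty;
}
```

With these defaults, a callee using 30 SGPRs and 40 VGPRs for arguments overflows by 4 and 8 registers respectively, so the threshold would be raised by (4 + 8) * 15 = 180.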
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140242/new/
https://reviews.llvm.org/D140242