[PATCH] D140242: [AMDGPU] Modify adjustInliningThreshold to also consider the cost of passing function arguments through the stack
Siu Chi Chan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 12 11:54:13 PST 2023
scchan added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1204
+ adjustThreshold += std::max(0, SGPRsInUse - 26) * ArgStackInlinePenalty;
+ adjustThreshold += std::max(0, VGPRsInUse - 32) * ArgStackInlinePenalty;
+ return adjustThreshold;
----------------
JanekvO wrote:
> scchan wrote:
> > arsenm wrote:
> > > scchan wrote:
> > > > I guess it's subtracting the number of clobbered registers - instead of a hardcoded value, could that be replaced by something more meaningful like a const variable or a getter?
> > > >
> > > > Also shouldn't VGPRs have a higher penalty relative to SGPRs since they'd occupy more stack space?
> > > We only sort of handle SGPR arguments today, and not for compute. We also do not currently implement the optimization of packing SGPRs into a VGPR for the argument spill
> > I wasn't paying attention to the comments for ArgStackInlinePenalty. The cost model is only based on the number of instructions and it doesn't take storage into account.
> I couldn't infer what measurement unit the inliner cost/threshold uses, so I took a cost model relative to the cost of a single instruction. Do let me know if the storage cost should be considered (and, if so, by what amount).
I was thinking about how the stack size contributes to the overall scratch size, since a larger scratch allocation may add to the launch overhead. A VGPR store takes more stack space than an SGPR store, so shouldn't it carry a relatively higher cost? I don't know how to model that at the moment; I'm just suggesting it as something to consider.
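To make the two suggestions in this thread concrete (named limits instead of the hardcoded 26 and 32, and a heavier weight for VGPR stack arguments), here is a rough standalone sketch. It is not the patch's code: SGPRArgRegLimit, VGPRArgRegLimit, stackArgPenalty, and the VGPRWeight parameter are hypothetical names, and the weight is a placeholder since I don't know what the right value would be.

```cpp
#include <algorithm>

// Hypothetical named limits; the values are copied from the hunk above.
constexpr int SGPRArgRegLimit = 26;
constexpr int VGPRArgRegLimit = 32;

// Penalty for arguments that would not fit in registers.
// PenaltyPerArg stands in for ArgStackInlinePenalty; VGPRWeight is an
// assumed multiplier expressing that a VGPR store uses more stack space.
int stackArgPenalty(int SGPRsInUse, int VGPRsInUse, int PenaltyPerArg,
                    int VGPRWeight) {
  int Penalty = 0;
  Penalty += std::max(0, SGPRsInUse - SGPRArgRegLimit) * PenaltyPerArg;
  Penalty +=
      std::max(0, VGPRsInUse - VGPRArgRegLimit) * PenaltyPerArg * VGPRWeight;
  return Penalty;
}
```

With VGPRWeight set to 1 this reduces to the computation in the hunk, so the weight could be introduced without changing the current behavior.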
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140242/new/
https://reviews.llvm.org/D140242