[PATCH] D149741: [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 10 23:56:49 PDT 2023
Pierre-vh added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1209
-unsigned GCNTTIImpl::adjustInliningThreshold(const CallBase *CB) const {
+static size_t getTotalAllocaSize(const CallBase *CB, const DataLayout &DL) {
// If we have a pointer to private array passed into a function
----------------
The way I understand this function is that it does a sum of the size of all the allocas used to pass arguments to a function, but only takes allocas that are in flat/private into account
If that's correct, I would rename the function to something more like `getCallArgsTotalAllocaSize` to reflect it.
Also nit: we seem to generally use `unsigned` for size types in LLVM, I also prefer `size_t` but I would just stay consistent with the `unsigned` below and also use `unsigned` here
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1210-1212
// If we have a pointer to private array passed into a function
// it will not be optimized out, leaving scratch usage.
// Increase the inline threshold to allow inlining in this case.
----------------
Comment probably needs updating - this function now just calculate the size of all allocas, it doesn't adjust the inlining threshold
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1235-1237
+ if (AllocaSize > 0) {
+ Threshold += ArgAllocaCost;
+ }
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1250-1251
+
+ // give an inline threshold bonus depending on the size of the alloca that is
+ // being optimized by SROA. the bigger the array is, the better chances we
+ // have to avoid scratch memory
----------------
This is called during inlining cost calculations right? So I'd rephrase it as "that may be SROA'd" - it currently reads as this is being used by SROA
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1257
+ AllocaSize;
+ // awkwardly, this gets multiplied by the vector and single-bb bonuses
+ return ThresholdBonus;
----------------
Is it an issue? Can you elaborate a bit more?
================
Comment at: llvm/test/Transforms/Inline/AMDGPU/amdgpu-inline-alloca-argument-cost.ll:95
+declare void @external(ptr addrspace(5) %p)
\ No newline at end of file
----------------
newline at end of file :)
(I also had this a lot until I found a setting in my IDE to automatically insert it on save, if you use VSCode it's easy)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149741/new/
https://reviews.llvm.org/D149741
More information about the llvm-commits
mailing list