[PATCH] D149741: [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)

Wed May 10 23:56:49 PDT 2023

Pierre-vh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1209

-unsigned GCNTTIImpl::adjustInliningThreshold(const CallBase *CB) const {
+static size_t getTotalAllocaSize(const CallBase *CB, const DataLayout &DL) {
   // If we have a pointer to private array passed into a function
----------------
The way I understand this function is that it does a sum of the size of all the allocas used to pass arguments to a function, but only takes allocas that are in flat/private into account
If that's correct, I would rename the function to something more like `getCallArgsTotalAllocaSize` to reflect it.

Also nit: we seem to generally use `unsigned` for size types in LLVM, I also prefer `size_t` but I would just stay consistent with the `unsigned` below and also use `unsigned` here

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1210-1212
   // If we have a pointer to private array passed into a function
   // it will not be optimized out, leaving scratch usage.
   // Increase the inline threshold to allow inlining in this case.
----------------
Comment probably needs updating - this function now just calculate the size of all allocas, it doesn't adjust the inlining threshold

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1235-1237
+  if (AllocaSize > 0) {
+    Threshold += ArgAllocaCost;
+  }
----------------

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1250-1251
+
+  // give an inline threshold bonus depending on the size of the alloca that is
+  // being optimized by SROA. the bigger the array is, the better chances we
+  // have to avoid scratch memory
----------------
This is called during inlining cost calculations right? So I'd rephrase it as "that may be SROA'd"  - it currently reads as this is being used by SROA

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1257
+      AllocaSize;
+  // awkwardly, this gets multiplied by the vector and single-bb bonuses
+  return ThresholdBonus;
----------------
Is it an issue? Can you elaborate a bit more?

================
Comment at: llvm/test/Transforms/Inline/AMDGPU/amdgpu-inline-alloca-argument-cost.ll:95
+declare void @external(ptr addrspace(5) %p)
\ No newline at end of file

----------------
newline at end of file :) 
(I also had this a lot until I found a setting in my IDE to automatically insert it on save, if you use VSCode it's easy)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149741/new/

https://reviews.llvm.org/D149741