[PATCH] D149741: [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)

Fri May 19 02:46:59 PDT 2023

jmmartinez added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1251
+  // Below the cutoff, assume that the private memory arrays would be
+  // optimized
+  auto AllocaSize = getCallArgsTotalAllocaSize(CB, DL);
----------------
scchan wrote:
> It's not only private arrays but it also includes private objects which have its address taken and passed to the callee as argument?
Yes, it applies to private objects in general. I've updated the comments to match.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1267
+      (ArgAllocaCost * getInliningThresholdMultiplier() * ArgAllocaSize) /
+      AllocaSize;
+
----------------
scchan wrote:
> We should add your reasoning above to the comment.  I think changing this to `(ArgAllocaCost * getInliningThresholdMultiplier()) * (ArgAllocaSize/AllocaSize)` would make it more apparent that you are getting a fraction of that bonus.
In that case I would have to cast the division to floating point (in my opinion, not a problem) since `ArgAllocaSize / AllocaSize` is always 0 on integers.

Would that be ok?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149741/new/

https://reviews.llvm.org/D149741