[PATCH] D82990: [AMDGPU] Limit promote alloca to vector with VGPR budget

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 1 13:32:09 PDT 2020


rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:439
 
+  // Use up to 1/4 of available register budget for vectorization.
+  if (DL.getTypeSizeInBits(AllocaTy) * 4 > MaxVGPRs * 32) {
----------------
arsenm wrote:
> This seems like a huge default. I thought we previously limited this to 16 VGPRs. There should also probably be a cl::opt for this too
We did not limit it to 16 VGPRs, but to 16 elements. That limit is still here. This new limit solves a different problem: not running out of registers when only a limited number of them is available. For example, a workgroup size of 1024 leaves us with 64 VGPRs, and then the limit would be 16.
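
The arithmetic behind that number can be checked with a minimal standalone sketch. The helper names exceedsVectorBudget and budgetInVGPRs are hypothetical and not part of the patch; the sketch only mirrors the condition quoted above and assumes 32-bit VGPRs, as the `MaxVGPRs * 32` term implies.

```cpp
#include <cstdint>

// Hypothetical helper, not from the patch. Mirrors the quoted condition
//   DL.getTypeSizeInBits(AllocaTy) * 4 > MaxVGPRs * 32
// and returns true when the alloca needs more than 1/4 of the VGPR budget.
constexpr bool exceedsVectorBudget(uint64_t AllocaSizeInBits,
                                   unsigned MaxVGPRs) {
  return AllocaSizeInBits * 4 > static_cast<uint64_t>(MaxVGPRs) * 32;
}

// Hypothetical helper: the largest alloca, counted in whole 32-bit
// registers, that still passes the check above.
constexpr unsigned budgetInVGPRs(unsigned MaxVGPRs) { return MaxVGPRs / 4; }

// With 64 VGPRs available (the workgroup size 1024 case from the comment),
// the budget is 16 registers: a 16-dword alloca passes, a 17-dword one
// does not.
static_assert(budgetInVGPRs(64) == 16, "1/4 of 64 VGPRs is 16");
static_assert(!exceedsVectorBudget(16 * 32, 64), "16 dwords fit the budget");
static_assert(exceedsVectorBudget(17 * 32, 64), "17 dwords exceed the budget");
```

This check is separate from the existing 16-element limit, which counts vector elements rather than registers.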


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82990/new/

https://reviews.llvm.org/D82990




