[PATCH] D145586: [AMDGPU] Tweak PromoteAlloca limits

Mon Mar 27 00:28:12 PDT 2023

Pierre-vh marked an inline comment as done.
Pierre-vh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:402

-  // Use up to 1/4 of available register budget for vectorization.
+  // Use up to 1/2 of available register budget for vectorization if we have
+  // >=64 MaxVGPRs, otherwise use 1/4.
----------------
arsenm wrote:
> Half feels pretty aggressive
What's a better limit? 1/3 (but then we get odd numbers)?
Or do we leave it at 1/2 and just remove the CC limit?
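
For reference, a rough sketch of the arithmetic behind the "odd numbers" remark. This is not code from the patch: `budgetBits` and `isWholeRegisters` are hypothetical helpers, and 256 is only an assumed example value for MaxVGPRs. The one thing taken from the diff is the `MaxVGPRs * 32` bit budget.

```cpp
// Hypothetical helpers for illustration only; not code from the patch.

// Budget in bits when 1/Divisor of the register budget is used for
// vectorization. The "* 32" mirrors the MaxVGPRs * 32 term in the diff.
constexpr unsigned budgetBits(unsigned MaxVGPRs, unsigned Divisor) {
  return (MaxVGPRs * 32) / Divisor;
}

// True if the budget is a whole number of 32-bit registers.
constexpr bool isWholeRegisters(unsigned Bits) { return Bits % 32 == 0; }

// Assumed example: MaxVGPRs = 256, i.e. 8192 bits in total.
static_assert(budgetBits(256, 2) == 4096, "1/2 gives 128 registers");
static_assert(budgetBits(256, 4) == 2048, "1/4 gives 64 registers");
static_assert(budgetBits(256, 3) == 2730, "1/3 truncates to 2730 bits");
static_assert(!isWholeRegisters(budgetBits(256, 3)),
              "1/3 is not a whole number of 32-bit registers");
```

With power-of-two divisors the budget stays a multiple of 32 bits for these inputs, whereas 1/3 does not.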

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:405-406
+  // If PromoteAllocaToVectorLimit is used, also use 1/4.
   unsigned Limit = PromoteAllocaToVectorLimit ? PromoteAllocaToVectorLimit * 8
                                               : (MaxVGPRs * 32);
+  const unsigned SizeFactor =
----------------
arsenm wrote:
> The largest register class we support is <32 x i32>, do we definitely never introduce larger vectors?
I think anything not supported by the DAG will get split up into pieces that are supported, no?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145586/new/

https://reviews.llvm.org/D145586