[PATCH] D145586: [AMDGPU] Tweak PromoteAlloca limits
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 27 00:28:12 PDT 2023
Pierre-vh marked an inline comment as done.
Pierre-vh added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:402
- // Use up to 1/4 of available register budget for vectorization.
+ // Use up to 1/2 of available register budget for vectorization if we have
+ // >=64 MaxVGPRs, otherwise use 1/4.
----------------
arsenm wrote:
> Half feels pretty aggressive
What would be a better limit? 1/3 (but then we get odd numbers)?
Or do we leave it at 1/2 and just remove the CC limit?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:405-406
+ // If PromoteAllocaToVectorLimit is used, also use 1/4.
unsigned Limit = PromoteAllocaToVectorLimit ? PromoteAllocaToVectorLimit * 8
: (MaxVGPRs * 32);
+ const unsigned SizeFactor =
----------------
arsenm wrote:
> The largest register class we support is <32 x i32>, do we definitely never introduce larger vectors?
I think anything not supported by the DAG will get split up into pieces that are supported, no?
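
For reference, the budget rule described in the two code comments above can be sketched as below. This is only a rough standalone illustration, not the code in the patch: the helper name fitsVectorBudget and the UserLimitBytes parameter (standing in for PromoteAllocaToVectorLimit, with 0 meaning "not set") are hypothetical, and the comparison assumes the alloca is rejected when its size in bits times SizeFactor exceeds Limit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not part of AMDGPUPromoteAlloca.cpp.
// UserLimitBytes models PromoteAllocaToVectorLimit (0 means the option is unset).
constexpr bool fitsVectorBudget(uint64_t AllocaSizeInBits, unsigned MaxVGPRs,
                                unsigned UserLimitBytes) {
  // Budget in bits: the user limit is given in bytes, a VGPR holds 32 bits.
  const uint64_t Limit = UserLimitBytes ? uint64_t(UserLimitBytes) * 8
                                        : uint64_t(MaxVGPRs) * 32;
  // Use 1/2 of the budget with >= 64 VGPRs and no user limit, otherwise 1/4.
  const unsigned SizeFactor = (!UserLimitBytes && MaxVGPRs >= 64) ? 2 : 4;
  // Assumed check: the alloca fits if its scaled size stays within the budget.
  return AllocaSizeInBits * SizeFactor <= Limit;
}

// 256 VGPRs, no user limit: budget is 8192 bits, so up to 4096 bits fit.
static_assert(fitsVectorBudget(4096, 256, 0), "half of 8192 bits fits");
static_assert(!fitsVectorBudget(4128, 256, 0), "just over half is rejected");
// 32 VGPRs: budget is 1024 bits and only a quarter (256 bits) may be used.
static_assert(fitsVectorBudget(256, 32, 0), "quarter of 1024 bits fits");
static_assert(!fitsVectorBudget(512, 32, 0), "half is too much below 64 VGPRs");
// A user limit of 128 bytes (1024 bits) always falls back to the 1/4 rule.
static_assert(!fitsVectorBudget(512, 256, 128), "user limit forces 1/4");
```

Under these assumptions, with 256 VGPRs the largest alloca that qualifies goes from 2048 bits (1/4) to 4096 bits (1/2), which is the jump being questioned above.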
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145586/new/
https://reviews.llvm.org/D145586