[PATCH] D122850: [AMDGPU] Fix regression with vectorization limiting
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 31 14:51:36 PDT 2022
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:313
+ // interleaving loops, so we lie to avoid trying to use all registers.
+ return std::min(NumRegs, 4u);
}
----------------
arsenm wrote:
> 4 seems really small
It is enough to allow vectorization, which is all we really need. Giving more immediately explodes register pressure (RP) because of the interleaving. It may be possible to increase this, but then interleaving would have to be limited much more.
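To illustrate the tradeoff, here is a rough sketch. `cappedNumRegs` mirrors the clamp from the diff above; `toyInterleaveCount` and its `RegsPerIteration` parameter are hypothetical names for a simplified model of how a larger reported register count can translate into a larger interleave factor. It is not the actual loop vectorizer heuristic.

```cpp
#include <algorithm>
#include <cassert>

// Mirrors the clamp in the quoted diff: report at most 4 registers,
// regardless of how many the target really has.
unsigned cappedNumRegs(unsigned NumRegs) { return std::min(NumRegs, 4u); }

// Hypothetical toy model (an assumption, not LLVM's real heuristic):
// pick the largest power of two that does not exceed
// ReportedRegs / RegsPerIteration, and never go below 1.
unsigned toyInterleaveCount(unsigned ReportedRegs, unsigned RegsPerIteration) {
  assert(RegsPerIteration > 0 && "each iteration must use some registers");
  unsigned Ratio = std::max(1u, ReportedRegs / RegsPerIteration);
  unsigned IC = 1;
  while (IC * 2 <= Ratio)
    IC *= 2;
  return IC;
}
```

Under this toy model, reporting 256 registers for a loop body that needs 8 registers per iteration gives an interleave count of 32, while the capped value of 4 keeps the count at 1. The real numbers depend on the cost model; the sketch only shows why the reported count matters.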
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122850/new/
https://reviews.llvm.org/D122850