[llvm] [AMDGPU] Vectorize i8 Shuffles (PR #95840)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 1 14:04:03 PDT 2024
================
@@ -337,9 +349,11 @@ unsigned GCNTTIImpl::getMinVectorRegisterBitWidth() const {
unsigned GCNTTIImpl::getMaximumVF(unsigned ElemWidth, unsigned Opcode) const {
   if (Opcode == Instruction::Load || Opcode == Instruction::Store)
     return 32 * 4 / ElemWidth;
-  return (ElemWidth == 16 && ST->has16BitInsts()) ? 2
-       : (ElemWidth == 32 && ST->hasPackedFP32Ops()) ? 2
-       : 1;
+
+  return (ElemWidth == 8) ? 4
+       : (ElemWidth == 16 && ST->has16BitInsts()) ? 2
----------------
arsenm wrote:
By the same reasoning as the 8-bit case, the 16-bit case shouldn't really depend on `has16BitInsts()`.
https://github.com/llvm/llvm-project/pull/95840
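
To make the suggestion concrete, here is a minimal standalone sketch of one possible reading of it. It is not code from the PR: `MockSubtarget`, `OpKind`, `maxVFPatched`, and `maxVFSuggested` are hypothetical names, the mock does not model the real `GCNSubtarget`, and the tail of the patched expression (the 32-bit case and the final `1`) is assumed to be unchanged, since the quoted hunk is cut off before it.

```cpp
#include <cassert>

// Hypothetical stand-in for the two subtarget queries that appear in the
// hunk. This is an assumption for illustration, not the real GCNSubtarget.
struct MockSubtarget {
  bool Has16BitInsts;
  bool HasPackedFP32Ops;
};

// Hypothetical opcode tags replacing Instruction::Load / Instruction::Store.
enum OpKind { OpLoad, OpStore, OpOther };

// Mirrors the patched hunk: 8-bit elements always get VF 4, while the 16-bit
// case is still gated on the subtarget having 16-bit instructions.
// The last two arms are assumed to match the removed lines.
inline unsigned maxVFPatched(const MockSubtarget &ST, unsigned ElemWidth,
                             OpKind Opcode) {
  if (Opcode == OpLoad || Opcode == OpStore)
    return 32 * 4 / ElemWidth;

  return (ElemWidth == 8) ? 4
       : (ElemWidth == 16 && ST.Has16BitInsts) ? 2
       : (ElemWidth == 32 && ST.HasPackedFP32Ops) ? 2
       : 1;
}

// One reading of the review comment: treat 16-bit elements like 8-bit ones
// and drop the Has16BitInsts condition.
inline unsigned maxVFSuggested(const MockSubtarget &ST, unsigned ElemWidth,
                               OpKind Opcode) {
  if (Opcode == OpLoad || Opcode == OpStore)
    return 32 * 4 / ElemWidth;

  return (ElemWidth == 8) ? 4
       : (ElemWidth == 16) ? 2
       : (ElemWidth == 32 && ST.HasPackedFP32Ops) ? 2
       : 1;
}

// Shows where the two variants differ: a subtarget without 16-bit
// instructions asked about 16-bit elements.
inline void compareVariants() {
  MockSubtarget NoI16{false, false};
  assert(maxVFPatched(NoI16, 16, OpOther) == 1);
  assert(maxVFSuggested(NoI16, 16, OpOther) == 2);
}
```

Under these assumptions the two variants differ only for 16-bit elements on subtargets without 16-bit instructions. Loads, stores, and the 8-bit and 32-bit cases behave the same in both.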