[llvm] [AMDGPU] Enable vectorization of i8 values. (PR #134934)
Jeffrey Byrnes via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 23 16:47:01 PDT 2025
================
@@ -344,9 +344,10 @@ unsigned GCNTTIImpl::getMinVectorRegisterBitWidth() const {
 unsigned GCNTTIImpl::getMaximumVF(unsigned ElemWidth, unsigned Opcode) const {
   if (Opcode == Instruction::Load || Opcode == Instruction::Store)
     return 32 * 4 / ElemWidth;
-  return (ElemWidth == 16 && ST->has16BitInsts()) ? 2
-         : (ElemWidth == 32 && ST->hasPackedFP32Ops()) ? 2
-         : 1;
+  return ElemWidth == 8 ? 4
----------------
jrbyrnes wrote:
The TTI::get*Cost queries are called in many places. Can you please double-check that the ones we have implemented in AMDGPUTargetTransformInfo either fall back to the base handling or are explicitly customized for i8?
https://github.com/llvm/llvm-project/pull/134934
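
For readers following the hunk, here is a rough standalone sketch of the arithmetic getMaximumVF performs. It is not the in-tree code: SubtargetModel, Op, and maximumVFModel are hypothetical names invented for illustration, and the two feature flags only stand in for ST->has16BitInsts() and ST->hasPackedFP32Ops(). Only the i8 arm is visible in the quoted hunk; the sketch assumes the remaining arms match the removed lines, which the truncated diff does not confirm.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget feature queries used in the hunk
// (ST->has16BitInsts() and ST->hasPackedFP32Ops()). Not an LLVM type.
struct SubtargetModel {
  bool Has16BitInsts;
  bool HasPackedFP32Ops;
};

// Hypothetical stand-in for the opcode check in the hunk.
enum class Op { Load, Store, Other };

// Models the arithmetic of GCNTTIImpl::getMaximumVF as shown in the hunk.
unsigned maximumVFModel(const SubtargetModel &ST, unsigned ElemWidth,
                        Op Opcode) {
  assert(ElemWidth != 0 && "element width must be non-zero");

  // Memory operations: as many elements as fit in 4 x 32 = 128 bits.
  if (Opcode == Op::Load || Opcode == Op::Store)
    return 32 * 4 / ElemWidth;

  // Visible in the hunk: i8 elements now get a maximum VF of 4.
  if (ElemWidth == 8)
    return 4;

  // ASSUMPTION: the arms below mirror the removed lines; the quoted hunk is
  // cut off after the i8 arm, so the patched code may differ.
  if (ElemWidth == 16 && ST.Has16BitInsts)
    return 2;
  if (ElemWidth == 32 && ST.HasPackedFP32Ops)
    return 2;
  return 1;
}
```

Under these assumptions, an i8 load or store gets a maximum VF of 128 / 8 = 16, while any other i8 operation gets 4. That new non-memory case is why the i8 handling of the cost queries matters: a vectorizer that is now offered <4 x i8> candidates will ask the cost hooks to price them.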