[llvm] [AMDGPU] Enable vectorization of i8 values. (PR #134934)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 23 16:47:01 PDT 2025


================
@@ -344,9 +344,10 @@ unsigned GCNTTIImpl::getMinVectorRegisterBitWidth() const {
 unsigned GCNTTIImpl::getMaximumVF(unsigned ElemWidth, unsigned Opcode) const {
   if (Opcode == Instruction::Load || Opcode == Instruction::Store)
     return 32 * 4 / ElemWidth;
-  return (ElemWidth == 16 && ST->has16BitInsts()) ? 2
-       : (ElemWidth == 32 && ST->hasPackedFP32Ops()) ? 2
-       : 1;
+  return ElemWidth == 8                                ? 4
----------------
jrbyrnes wrote:

The TTI::get*Cost queries are called in many places. Can you please double-check that the ones we have implemented in AMDGPUTargetTransformInfo either revert to the base handling or are explicitly customized for i8?
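
To make the concern concrete, here is a minimal standalone sketch of the selection logic in the hunk above. It is not the PR's code: `SubtargetFeatures`, `Op`, `maxVFBefore`, and `maxVFAfter` are hypothetical stand-ins, and because the hunk is cut off after the `ElemWidth == 8` arm, the sketch assumes the remaining arms are unchanged from the removed lines.

```cpp
// Hypothetical stand-in for the subtarget queries used in the hunk
// (ST->has16BitInsts(), ST->hasPackedFP32Ops()).
struct SubtargetFeatures {
  bool Has16BitInsts;
  bool HasPackedFP32Ops;
};

// Hypothetical stand-in for the opcodes the hunk distinguishes.
enum class Op { Load, Store, Other };

// Logic before the patch, transcribed from the removed lines.
constexpr unsigned maxVFBefore(unsigned ElemWidth, Op Opcode,
                               SubtargetFeatures ST) {
  if (Opcode == Op::Load || Opcode == Op::Store)
    return 32 * 4 / ElemWidth;
  return (ElemWidth == 16 && ST.Has16BitInsts)      ? 2
         : (ElemWidth == 32 && ST.HasPackedFP32Ops) ? 2
                                                    : 1;
}

// Logic after the patch. Only the i8 arm is visible in the hunk;
// ASSUMPTION: every other case behaves as it did before.
constexpr unsigned maxVFAfter(unsigned ElemWidth, Op Opcode,
                              SubtargetFeatures ST) {
  if (Opcode == Op::Load || Opcode == Op::Store)
    return 32 * 4 / ElemWidth;
  if (ElemWidth == 8)
    return 4;
  return maxVFBefore(ElemWidth, Opcode, ST);
}
```

Under these assumptions, a non-memory i8 operation goes from a maximum VF of 1 to 4, so any cost query that previously only saw scalar i8 may now be asked about `<4 x i8>`.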

https://github.com/llvm/llvm-project/pull/134934

