[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 6 08:36:49 PDT 2023
================
@@ -10700,6 +10700,23 @@ calculateSrcByte(const SDValue Op, uint64_t DestByte, uint64_t SrcIndex = 0,
     return calculateSrcByte(Op->getOperand(0), DestByte, SrcIndex, Depth + 1);
   }
+  case ISD::EXTRACT_VECTOR_ELT: {
+    auto IdxOp = dyn_cast<ConstantSDNode>(Op->getOperand(1));
+    if (!IdxOp)
+      return std::nullopt;
+    auto VecIdx = IdxOp->getZExtValue();
+    auto ScalarSize = Op.getScalarValueSizeInBits();
+    if (ScalarSize != 32) {
+      if ((VecIdx + 1) * ScalarSize > 32)
+        return std::nullopt;
+      SrcIndex = ScalarSize == 8 ? VecIdx : VecIdx * 2 + SrcIndex;
----------------
arsenm wrote:
This seems to assume 8-bit or 16-bit elements. Why not just compute the index from the scalar size?
https://github.com/llvm/llvm-project/pull/66533
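
One way to read the suggestion above is to derive the byte offset from the element size instead of special-casing 8 and 16 bits. The following is only a standalone sketch of that arithmetic, not code from the patch: the helper name `byteIndexInDword` and its parameter names are hypothetical, and it covers only the sub-dword branch visible in the hunk (the 32-bit case lies outside the quoted lines).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of the PR): map a byte inside a vector
// element to its byte position within the low 32 bits of the vector.
//   VecIdx         - constant element index of the EXTRACT_VECTOR_ELT
//   ScalarSizeBits - element size in bits
//   SrcIndex       - byte index within the extracted element
std::optional<uint64_t> byteIndexInDword(uint64_t VecIdx,
                                         uint64_t ScalarSizeBits,
                                         uint64_t SrcIndex) {
  // Only byte-sized, sub-dword elements are handled in this sketch;
  // everything else is treated as out of scope.
  if (ScalarSizeBits == 0 || ScalarSizeBits % 8 != 0 || ScalarSizeBits >= 32)
    return std::nullopt;

  const uint64_t BytesPerElt = ScalarSizeBits / 8;

  // The requested byte must lie inside the element.
  if (SrcIndex >= BytesPerElt)
    return std::nullopt;

  // Same intent as the patch's `(VecIdx + 1) * ScalarSize > 32` guard,
  // written with a division so a large VecIdx cannot overflow.
  if (VecIdx >= 32 / ScalarSizeBits)
    return std::nullopt;

  // Computed from the element size rather than hard-coding 8 or 16 bits.
  return VecIdx * BytesPerElt + SrcIndex;
}
```

For 8-bit elements this reduces to `VecIdx + SrcIndex` (with `SrcIndex` necessarily 0), and for 16-bit elements to `VecIdx * 2 + SrcIndex`, so it agrees with the two cases the hunk handles explicitly.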