[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 6 08:36:49 PDT 2023


================
@@ -10700,6 +10700,23 @@ calculateSrcByte(const SDValue Op, uint64_t DestByte, uint64_t SrcIndex = 0,
     return calculateSrcByte(Op->getOperand(0), DestByte, SrcIndex, Depth + 1);
   }
 
+  case ISD::EXTRACT_VECTOR_ELT: {
+    auto IdxOp = dyn_cast<ConstantSDNode>(Op->getOperand(1));
+    if (!IdxOp)
+      return std::nullopt;
+    auto VecIdx = IdxOp->getZExtValue();
+    auto ScalarSize = Op.getScalarValueSizeInBits();
+    if (ScalarSize != 32) {
+      if ((VecIdx + 1) * ScalarSize > 32)
+        return std::nullopt;
+      SrcIndex = ScalarSize == 8 ? VecIdx : VecIdx * 2 + SrcIndex;
----------------
arsenm wrote:

This seems to assume 8-bit or 16-bit elements; why not just compute it from the value?
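
One possible reading of this suggestion, as a minimal standalone sketch rather than the change that actually landed: derive the bytes-per-element factor from `ScalarSize` instead of special-casing 8 and 16. The helper name `byteIndexInDword` and its signature are hypothetical and exist only for illustration; the variable names mirror the hunk above. Note that for 8-bit elements the patch ignores `SrcIndex`, so the two forms agree only when `SrcIndex` is 0 in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of the patch): maps a byte offset inside a
// vector element to a byte offset inside the low 32 bits of the vector.
std::optional<uint64_t> byteIndexInDword(uint64_t VecIdx, uint64_t ScalarSize,
                                         uint64_t SrcIndex) {
  // Assumption: element sizes are a whole number of bytes.
  assert(ScalarSize != 0 && ScalarSize % 8 == 0 &&
         "expected byte-sized elements");

  // The 32-bit case takes a different path in the patch; this sketch only
  // covers the sub-dword branch and bails out otherwise.
  if (ScalarSize >= 32)
    return std::nullopt;

  // Same bound as the patch: the element must lie within the low 32 bits.
  if ((VecIdx + 1) * ScalarSize > 32)
    return std::nullopt;

  // Computed from the element size rather than hard-coding 1 or 2.
  const uint64_t BytesPerElt = ScalarSize / 8;
  return VecIdx * BytesPerElt + SrcIndex;
}
```

With this form, element 1 of a 16-bit vector with `SrcIndex` 1 maps to byte 3, and element 3 of an 8-bit vector with `SrcIndex` 0 also maps to byte 3, matching what the ternary in the patch produces for those inputs.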

https://github.com/llvm/llvm-project/pull/66533

