[llvm] [AArch64] Add @llvm.experimental.vector.match (PR #101974)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 8 06:43:06 PST 2024
================
@@ -5761,6 +5774,84 @@ SDValue LowerSMELdrStr(SDValue N, SelectionDAG &DAG, bool IsLoad) {
DAG.getTargetConstant(ImmAddend, DL, MVT::i32)});
}
+SDValue LowerVectorMatch(SDValue Op, SelectionDAG &DAG) {
+ SDLoc dl(Op);
+ SDValue ID =
+ DAG.getTargetConstant(Intrinsic::aarch64_sve_match, dl, MVT::i64);
+
+ auto Op1 = Op.getOperand(1);
+ auto Op2 = Op.getOperand(2);
+ auto Mask = Op.getOperand(3);
+
+ EVT Op1VT = Op1.getValueType();
+ EVT Op2VT = Op2.getValueType();
+ EVT ResVT = Op.getValueType();
+
+ assert((Op1VT.getVectorElementType() == MVT::i8 ||
+ Op1VT.getVectorElementType() == MVT::i16) &&
+ "Expected 8-bit or 16-bit characters.");
+
+ // Scalable vector type used to wrap operands.
+ // A single container is enough for both operands because ultimately the
+ // operands will have to be wrapped to the same type (nxv16i8 or nxv8i16).
+ EVT OpContainerVT = Op1VT.isScalableVector()
+ ? Op1VT
+ : getContainerForFixedLengthVector(DAG, Op1VT);
+
+ // Wrap Op2 in a scalable register, and splat it if necessary.
+ if (Op1VT.getVectorMinNumElements() == Op2VT.getVectorNumElements()) {
+ // If Op1 and Op2 have the same number of elements we can trivially wrap
+ // Op2 in an SVE register.
+ Op2 = convertToScalableVector(DAG, OpContainerVT, Op2);
+ // If the result is scalable, we need to broadcast Op2 to a full SVE
+ // register.
+ if (ResVT.isScalableVector())
+ Op2 = DAG.getNode(AArch64ISD::DUPLANE128, dl, OpContainerVT, Op2,
+ DAG.getTargetConstant(0, dl, MVT::i64));
+ } else {
+ // If Op1 and Op2 have a different number of elements, we need to broadcast
+ // Op2. Ideally we would use an AArch64ISD::DUPLANE* node for this,
+ // similarly to the above, but unfortunately we seem to be missing some
+ // patterns for this. So, as an alternative, we splat Op2 through a splat of
+ // a scalable vector extract. This idiom, though a bit more verbose, is
+ // supported and gets us the MOV instruction we want.
+ unsigned Op2BitWidth = Op2VT.getFixedSizeInBits();
+ MVT Op2IntVT = MVT::getIntegerVT(Op2BitWidth);
+ MVT Op2PromotedVT = MVT::getVectorVT(Op2IntVT, 128 / Op2BitWidth,
----------------
paulwalker-arm wrote:
I know this is due to change, but for completeness: I think `shouldExpandVectorMatch` enables the case where the search vector is `v8i8` and the needle vector is `v16i8`, whereas this else block essentially assumes the needle vector is 64-bit. My guess is it'll be easier to lock down `shouldExpandVectorMatch`, given you're less bothered about searching fixed-length vectors.
Alternatively, given you know that either `Op2VT.is64BitVector()` or `Op2VT.is128BitVector()` must be true, there might be a way to think in those terms rather than basing it on the number of elements.
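To make the arithmetic behind this concern concrete, here is a minimal standalone sketch. It does not use LLVM: `FixedVT`, `takesElseBlock`, `promotedLaneCount`, `NeedleKind` and `classifyNeedle` are hypothetical names that only model the element-count check and the `128 / Op2BitWidth` computation visible in the hunk, assuming fixed-length search and needle vectors.

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-length vector type (not an LLVM class).
struct FixedVT {
  unsigned NumElts; // number of elements, e.g. 16 for v16i8
  unsigned EltBits; // bits per element, e.g. 8 for v16i8
  unsigned sizeInBits() const { return NumElts * EltBits; }
  bool is64BitVector() const { return sizeInBits() == 64; }
  bool is128BitVector() const { return sizeInBits() == 128; }
};

// Models the branch condition in the hunk: the else block is reached when
// the search and needle vectors have different element counts.
bool takesElseBlock(FixedVT Search, FixedVT Needle) {
  return Search.NumElts != Needle.NumElts;
}

// Models `128 / Op2BitWidth`: the lane count of the promoted vector whose
// element is one integer as wide as the whole needle.
unsigned promotedLaneCount(FixedVT Needle) {
  return 128 / Needle.sizeInBits();
}

// Hypothetical size-based classification, in the spirit of the
// is64BitVector()/is128BitVector() suggestion.
enum class NeedleKind { Needs64BitSplat, Full128Bit, Unexpected };

NeedleKind classifyNeedle(FixedVT Needle) {
  if (Needle.is64BitVector())
    return NeedleKind::Needs64BitSplat; // duplicate into both 64-bit halves
  if (Needle.is128BitVector())
    return NeedleKind::Full128Bit; // already fills a 128-bit segment
  return NeedleKind::Unexpected;
}
```

With a `v8i8` search vector and a `v16i8` needle, `takesElseBlock` returns true and `promotedLaneCount` yields 1, that is, a single 128-bit integer lane rather than the two 64-bit lanes the splat idiom appears to be written for. Classifying by total size separates the two needle widths regardless of how the element counts compare.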
https://github.com/llvm/llvm-project/pull/101974