[PATCH] [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984)
Sanjay Patel
spatel at rotateright.com
Tue Mar 24 10:54:14 PDT 2015
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:9081
@@ -9080,3 +9067,20 @@
- return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, LoV, HiV);
+ bool IsV1Zero = ISD::isBuildVectorAllZeros(V1.getNode());
+ bool IsV2Zero = ISD::isBuildVectorAllZeros(V2.getNode());
+
+ // If either input operand is a zero vector, use VPERM2X128 because its mask
+ // allows us to replace the zero input with an implicit zero.
+ if (!IsV1Zero && !IsV2Zero) {
+ // Check for patterns which can be matched with a single insert of a 128-bit
+ // subvector.
+ bool OnlyUsesV1 = isShuffleEquivalent(V1, V2, Mask, {0, 1, 0, 1});
+ if (OnlyUsesV1 || isShuffleEquivalent(V1, V2, Mask, {0, 1, 4, 5})) {
+ MVT SubVT = MVT::getVectorVT(VT.getVectorElementType(),
+ VT.getVectorNumElements() / 2);
+ SDValue LoV = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, SubVT, V1,
+ DAG.getIntPtrConstant(0));
+ SDValue HiV = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, SubVT,
+ OnlyUsesV1 ? V1 : V2, DAG.getIntPtrConstant(0));
+ return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, LoV, HiV);
+ }
}
----------------
Note: This mask {0, 1, 6, 7} is a v4x64 blend, but we've already tried "lowerVectorShuffleAsBlend()" above. Therefore, this check is redundant, and I've removed it. There was no change in the regression tests after removing this check.
http://reviews.llvm.org/D8563
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the llvm-commits
mailing list