[llvm] r321967 - [X86] Make v2i1 and v4i1 legal types without VLX
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Sun Jan 7 10:20:37 PST 2018
Author: ctopper
Date: Sun Jan 7 10:20:37 2018
New Revision: 321967
URL: http://llvm.org/viewvc/llvm-project?rev=321967&view=rev
Log:
[X86] Make v2i1 and v4i1 legal types without VLX
Summary:
There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type.
It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway.
This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.
We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.
I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all.
There's definitely room for improvement with some follow up patches.
Reviewers: RKSimon, zvi, guyblank
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41560
Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
llvm/trunk/lib/Target/X86/X86InstrAVX512.td
llvm/trunk/lib/Target/X86/X86InstrVecCompiler.td
llvm/trunk/test/Analysis/CostModel/X86/cast.ll
llvm/trunk/test/CodeGen/X86/avx512-cvt.ll
llvm/trunk/test/CodeGen/X86/avx512-ext.ll
llvm/trunk/test/CodeGen/X86/avx512-insert-extract.ll
llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll
llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll
llvm/trunk/test/CodeGen/X86/avx512dq-intrinsics-upgrade.ll
llvm/trunk/test/CodeGen/X86/avx512vl-vec-cmp.ll
llvm/trunk/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll
llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
llvm/trunk/test/CodeGen/X86/compress_expand.ll
llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll
llvm/trunk/test/CodeGen/X86/masked_memop.ll
llvm/trunk/test/CodeGen/X86/pr33349.ll
llvm/trunk/test/CodeGen/X86/sse-fsignum.ll
llvm/trunk/test/CodeGen/X86/vector-shuffle-v1.ll
llvm/trunk/test/CodeGen/X86/vselect-pcmp.ll
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Sun Jan 7 10:20:37 2018
@@ -3616,6 +3616,7 @@ SDValue DAGTypeLegalizer::WidenVecOp_SET
SDValue InOp0 = GetWidenedVector(N->getOperand(0));
SDValue InOp1 = GetWidenedVector(N->getOperand(1));
SDLoc dl(N);
+ EVT VT = N->getValueType(0);
// WARNING: In this code we widen the compare instruction with garbage.
// This garbage may contain denormal floats which may be slow. Is this a real
@@ -3625,18 +3626,23 @@ SDValue DAGTypeLegalizer::WidenVecOp_SET
// Only some of the compared elements are legal.
EVT SVT = TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(),
InOp0.getValueType());
+ // The result type is legal, if its vXi1, keep vXi1 for the new SETCC.
+ if (VT.getScalarType() == MVT::i1)
+ SVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
+ SVT.getVectorNumElements());
+
SDValue WideSETCC = DAG.getNode(ISD::SETCC, SDLoc(N),
- SVT, InOp0, InOp1, N->getOperand(2));
+ SVT, InOp0, InOp1, N->getOperand(2));
// Extract the needed results from the result vector.
EVT ResVT = EVT::getVectorVT(*DAG.getContext(),
SVT.getVectorElementType(),
- N->getValueType(0).getVectorNumElements());
+ VT.getVectorNumElements());
SDValue CC = DAG.getNode(
ISD::EXTRACT_SUBVECTOR, dl, ResVT, WideSETCC,
DAG.getConstant(0, dl, TLI.getVectorIdxTy(DAG.getDataLayout())));
- return PromoteTargetBoolean(CC, N->getValueType(0));
+ return PromoteTargetBoolean(CC, VT);
}
Modified: llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelDAGToDAG.cpp Sun Jan 7 10:20:37 2018
@@ -460,7 +460,7 @@ static bool isLegalMaskCompare(SDNode *N
// this happens we will use 512-bit operations and the mask will not be
// zero extended.
EVT OpVT = N->getOperand(0).getValueType();
- if (OpVT == MVT::v8i32 || OpVT == MVT::v8f32)
+ if (OpVT.is256BitVector() || OpVT.is128BitVector())
return Subtarget->hasVLX();
return true;
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Jan 7 10:20:37 2018
@@ -1144,6 +1144,8 @@ X86TargetLowering::X86TargetLowering(con
addRegisterClass(MVT::v8f64, &X86::VR512RegClass);
addRegisterClass(MVT::v1i1, &X86::VK1RegClass);
+ addRegisterClass(MVT::v2i1, &X86::VK2RegClass);
+ addRegisterClass(MVT::v4i1, &X86::VK4RegClass);
addRegisterClass(MVT::v8i1, &X86::VK8RegClass);
addRegisterClass(MVT::v16i1, &X86::VK16RegClass);
@@ -1171,15 +1173,14 @@ X86TargetLowering::X86TargetLowering(con
setOperationAction(ISD::FP_TO_UINT, MVT::v2i1, Custom);
}
- // Extends of v16i1/v8i1 to 128-bit vectors.
- setOperationAction(ISD::SIGN_EXTEND, MVT::v16i8, Custom);
- setOperationAction(ISD::ZERO_EXTEND, MVT::v16i8, Custom);
- setOperationAction(ISD::ANY_EXTEND, MVT::v16i8, Custom);
- setOperationAction(ISD::SIGN_EXTEND, MVT::v8i16, Custom);
- setOperationAction(ISD::ZERO_EXTEND, MVT::v8i16, Custom);
- setOperationAction(ISD::ANY_EXTEND, MVT::v8i16, Custom);
+ // Extends of v16i1/v8i1/v4i1/v2i1 to 128-bit vectors.
+ for (auto VT : { MVT::v16i8, MVT::v8i16, MVT::v4i32, MVT::v2i64 }) {
+ setOperationAction(ISD::SIGN_EXTEND, VT, Custom);
+ setOperationAction(ISD::ZERO_EXTEND, VT, Custom);
+ setOperationAction(ISD::ANY_EXTEND, VT, Custom);
+ }
- for (auto VT : { MVT::v8i1, MVT::v16i1 }) {
+ for (auto VT : { MVT::v2i1, MVT::v4i1, MVT::v8i1, MVT::v16i1 }) {
setOperationAction(ISD::ADD, VT, Custom);
setOperationAction(ISD::SUB, VT, Custom);
setOperationAction(ISD::MUL, VT, Custom);
@@ -1195,9 +1196,12 @@ X86TargetLowering::X86TargetLowering(con
}
setOperationAction(ISD::CONCAT_VECTORS, MVT::v16i1, Custom);
+ setOperationAction(ISD::CONCAT_VECTORS, MVT::v8i1, Custom);
+ setOperationAction(ISD::CONCAT_VECTORS, MVT::v4i1, Custom);
+ setOperationAction(ISD::INSERT_SUBVECTOR, MVT::v4i1, Custom);
setOperationAction(ISD::INSERT_SUBVECTOR, MVT::v8i1, Custom);
setOperationAction(ISD::INSERT_SUBVECTOR, MVT::v16i1, Custom);
- for (auto VT : { MVT::v1i1, MVT::v8i1 })
+ for (auto VT : { MVT::v1i1, MVT::v2i1, MVT::v4i1, MVT::v8i1 })
setOperationAction(ISD::EXTRACT_SUBVECTOR, VT, Custom);
for (MVT VT : MVT::fp_vector_valuetypes())
@@ -1528,41 +1532,6 @@ X86TargetLowering::X86TargetLowering(con
}
if (!Subtarget.useSoftFloat() && Subtarget.hasVLX()) {
- addRegisterClass(MVT::v4i1, &X86::VK4RegClass);
- addRegisterClass(MVT::v2i1, &X86::VK2RegClass);
-
- for (auto VT : { MVT::v2i1, MVT::v4i1 }) {
- setOperationAction(ISD::ADD, VT, Custom);
- setOperationAction(ISD::SUB, VT, Custom);
- setOperationAction(ISD::MUL, VT, Custom);
- setOperationAction(ISD::VSELECT, VT, Expand);
-
- setOperationAction(ISD::TRUNCATE, VT, Custom);
- setOperationAction(ISD::SETCC, VT, Custom);
- setOperationAction(ISD::EXTRACT_VECTOR_ELT, VT, Custom);
- setOperationAction(ISD::INSERT_VECTOR_ELT, VT, Custom);
- setOperationAction(ISD::SELECT, VT, Custom);
- setOperationAction(ISD::BUILD_VECTOR, VT, Custom);
- setOperationAction(ISD::VECTOR_SHUFFLE, VT, Custom);
- }
-
- // TODO: v8i1 concat should be legal without VLX to support concats of
- // v1i1, but we won't legalize it correctly currently without introducing
- // a v4i1 concat in the middle.
- setOperationAction(ISD::CONCAT_VECTORS, MVT::v8i1, Custom);
- setOperationAction(ISD::CONCAT_VECTORS, MVT::v4i1, Custom);
- setOperationAction(ISD::INSERT_SUBVECTOR, MVT::v4i1, Custom);
- for (auto VT : { MVT::v2i1, MVT::v4i1 })
- setOperationAction(ISD::EXTRACT_SUBVECTOR, VT, Custom);
-
- // Extends from v2i1/v4i1 masks to 128-bit vectors.
- setOperationAction(ISD::ZERO_EXTEND, MVT::v4i32, Custom);
- setOperationAction(ISD::ZERO_EXTEND, MVT::v2i64, Custom);
- setOperationAction(ISD::SIGN_EXTEND, MVT::v4i32, Custom);
- setOperationAction(ISD::SIGN_EXTEND, MVT::v2i64, Custom);
- setOperationAction(ISD::ANY_EXTEND, MVT::v4i32, Custom);
- setOperationAction(ISD::ANY_EXTEND, MVT::v2i64, Custom);
-
setTruncStoreAction(MVT::v4i64, MVT::v4i8, Legal);
setTruncStoreAction(MVT::v4i64, MVT::v4i16, Legal);
setTruncStoreAction(MVT::v4i64, MVT::v4i32, Legal);
@@ -4945,8 +4914,6 @@ static SDValue getZeroVector(MVT VT, con
} else if (VT.getVectorElementType() == MVT::i1) {
assert((Subtarget.hasBWI() || VT.getVectorNumElements() <= 16) &&
"Unexpected vector type");
- assert((Subtarget.hasVLX() || VT.getVectorNumElements() >= 8) &&
- "Unexpected vector type");
Vec = DAG.getConstant(0, dl, VT);
} else {
unsigned Num32BitElts = VT.getSizeInBits() / 32;
@@ -17779,6 +17746,19 @@ static SDValue LowerVSETCC(SDValue Op, c
assert(EltVT == MVT::f32 || EltVT == MVT::f64);
#endif
+ // Custom widen MVT::v2f32 to prevent the default widening
+ // from getting a result type of v4i32, extracting it to v2i32 and then
+ // trying to sign extend that to v2i1.
+ if (VT == MVT::v2i1 && Op1.getValueType() == MVT::v2f32) {
+ Op0 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v4f32, Op0,
+ DAG.getUNDEF(MVT::v2f32));
+ Op1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v4f32, Op1,
+ DAG.getUNDEF(MVT::v2f32));
+ SDValue NewOp = DAG.getNode(ISD::SETCC, dl, MVT::v4i1, Op0, Op1, CC);
+ return DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v2i1, NewOp,
+ DAG.getIntPtrConstant(0, dl));
+ }
+
unsigned Opc;
if (Subtarget.hasAVX512() && VT.getVectorElementType() == MVT::i1) {
assert(VT.getVectorNumElements() <= 16);
@@ -24417,8 +24397,8 @@ static SDValue LowerMSCATTER(SDValue Op,
// Mask
// At this point we have promoted mask operand
- assert(MaskVT.getScalarSizeInBits() >= 32 && "unexpected mask type");
- MVT ExtMaskVT = MVT::getVectorVT(MaskVT.getScalarType(), NumElts);
+ assert(MaskVT.getScalarType() == MVT::i1 && "unexpected mask type");
+ MVT ExtMaskVT = MVT::getVectorVT(MVT::i1, NumElts);
// Use the original mask here, do not modify the mask twice
Mask = ExtendToType(N->getMask(), ExtMaskVT, DAG, true);
@@ -24427,12 +24407,9 @@ static SDValue LowerMSCATTER(SDValue Op,
Src = ExtendToType(Src, NewVT, DAG);
}
}
- // If the mask is "wide" at this point - truncate it to i1 vector
- MVT BitMaskVT = MVT::getVectorVT(MVT::i1, NumElts);
- Mask = DAG.getNode(ISD::TRUNCATE, dl, BitMaskVT, Mask);
// The mask is killed by scatter, add it to the values
- SDVTList VTs = DAG.getVTList(BitMaskVT, MVT::Other);
+ SDVTList VTs = DAG.getVTList(Mask.getValueType(), MVT::Other);
SDValue Ops[] = {Chain, Src, Mask, BasePtr, Index};
SDValue NewScatter = DAG.getTargetMemSDNode<X86MaskedScatterSDNode>(
VTs, Ops, dl, N->getMemoryVT(), N->getMemOperand());
@@ -24455,11 +24432,6 @@ static SDValue LowerMLOAD(SDValue Op, co
assert((!N->isExpandingLoad() || ScalarVT.getSizeInBits() >= 32) &&
"Expanding masked load is supported for 32 and 64-bit types only!");
- // 4x32, 4x64 and 2x64 vectors of non-expanding loads are legal regardless of
- // VLX. These types for exp-loads are handled here.
- if (!N->isExpandingLoad() && VT.getVectorNumElements() <= 4)
- return Op;
-
assert(Subtarget.hasAVX512() && !Subtarget.hasVLX() && !VT.is512BitVector() &&
"Cannot lower masked load op.");
@@ -24476,16 +24448,12 @@ static SDValue LowerMLOAD(SDValue Op, co
Src0 = ExtendToType(Src0, WideDataVT, DAG);
// Mask element has to be i1.
- MVT MaskEltTy = Mask.getSimpleValueType().getScalarType();
- assert((MaskEltTy == MVT::i1 || VT.getVectorNumElements() <= 4) &&
- "We handle 4x32, 4x64 and 2x64 vectors only in this case");
+ assert(Mask.getSimpleValueType().getScalarType() == MVT::i1 &&
+ "Unexpected mask type");
- MVT WideMaskVT = MVT::getVectorVT(MaskEltTy, NumEltsInWideVec);
+ MVT WideMaskVT = MVT::getVectorVT(MVT::i1, NumEltsInWideVec);
Mask = ExtendToType(Mask, WideMaskVT, DAG, true);
- if (MaskEltTy != MVT::i1)
- Mask = DAG.getNode(ISD::TRUNCATE, dl,
- MVT::getVectorVT(MVT::i1, NumEltsInWideVec), Mask);
SDValue NewLoad = DAG.getMaskedLoad(WideDataVT, dl, N->getChain(),
N->getBasePtr(), Mask, Src0,
N->getMemoryVT(), N->getMemOperand(),
@@ -24514,10 +24482,6 @@ static SDValue LowerMSTORE(SDValue Op, c
assert((!N->isCompressingStore() || ScalarVT.getSizeInBits() >= 32) &&
"Expanding masked load is supported for 32 and 64-bit types only!");
- // 4x32 and 2x64 vectors of non-compressing stores are legal regardless to VLX.
- if (!N->isCompressingStore() && VT.getVectorNumElements() <= 4)
- return Op;
-
assert(Subtarget.hasAVX512() && !Subtarget.hasVLX() && !VT.is512BitVector() &&
"Cannot lower masked store op.");
@@ -24532,17 +24496,13 @@ static SDValue LowerMSTORE(SDValue Op, c
MVT WideDataVT = MVT::getVectorVT(ScalarVT, NumEltsInWideVec);
// Mask element has to be i1.
- MVT MaskEltTy = Mask.getSimpleValueType().getScalarType();
- assert((MaskEltTy == MVT::i1 || VT.getVectorNumElements() <= 4) &&
- "We handle 4x32, 4x64 and 2x64 vectors only in this case");
+ assert(Mask.getSimpleValueType().getScalarType() == MVT::i1 &&
+ "Unexpected mask type");
- MVT WideMaskVT = MVT::getVectorVT(MaskEltTy, NumEltsInWideVec);
+ MVT WideMaskVT = MVT::getVectorVT(MVT::i1, NumEltsInWideVec);
DataToStore = ExtendToType(DataToStore, WideDataVT, DAG);
Mask = ExtendToType(Mask, WideMaskVT, DAG, true);
- if (MaskEltTy != MVT::i1)
- Mask = DAG.getNode(ISD::TRUNCATE, dl,
- MVT::getVectorVT(MVT::i1, NumEltsInWideVec), Mask);
return DAG.getMaskedStore(N->getChain(), dl, DataToStore, N->getBasePtr(),
Mask, N->getMemoryVT(), N->getMemOperand(),
N->isTruncatingStore(), N->isCompressingStore());
@@ -24592,12 +24552,9 @@ static SDValue LowerMGATHER(SDValue Op,
Index = DAG.getNode(ISD::SIGN_EXTEND, dl, MVT::v8i64, Index);
// Mask
- MVT MaskBitVT = MVT::getVectorVT(MVT::i1, NumElts);
- // At this point we have promoted mask operand
- assert(MaskVT.getScalarSizeInBits() >= 32 && "unexpected mask type");
- MVT ExtMaskVT = MVT::getVectorVT(MaskVT.getScalarType(), NumElts);
- Mask = ExtendToType(Mask, ExtMaskVT, DAG, true);
- Mask = DAG.getNode(ISD::TRUNCATE, dl, MaskBitVT, Mask);
+ assert(MaskVT.getScalarType() == MVT::i1 && "unexpected mask type");
+ MaskVT = MVT::getVectorVT(MVT::i1, NumElts);
+ Mask = ExtendToType(Mask, MaskVT, DAG, true);
// The pass-through value
MVT NewVT = MVT::getVectorVT(VT.getScalarType(), NumElts);
@@ -24605,7 +24562,7 @@ static SDValue LowerMGATHER(SDValue Op,
SDValue Ops[] = { N->getChain(), Src0, Mask, N->getBasePtr(), Index };
SDValue NewGather = DAG.getTargetMemSDNode<X86MaskedGatherSDNode>(
- DAG.getVTList(NewVT, MaskBitVT, MVT::Other), Ops, dl, N->getMemoryVT(),
+ DAG.getVTList(NewVT, MaskVT, MVT::Other), Ops, dl, N->getMemoryVT(),
N->getMemOperand());
SDValue Extract = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT,
NewGather.getValue(0),
@@ -30447,7 +30404,7 @@ static SDValue combineBitcast(SDNode *N,
// If this is a bitcast between a MVT::v4i1/v2i1 and an illegal integer
// type, widen both sides to avoid a trip through memory.
if ((VT == MVT::v4i1 || VT == MVT::v2i1) && SrcVT.isScalarInteger() &&
- Subtarget.hasVLX()) {
+ Subtarget.hasAVX512()) {
SDLoc dl(N);
N0 = DAG.getNode(ISD::ANY_EXTEND, dl, MVT::i8, N0);
N0 = DAG.getBitcast(MVT::v8i1, N0);
@@ -30458,7 +30415,7 @@ static SDValue combineBitcast(SDNode *N,
// If this is a bitcast between a MVT::v4i1/v2i1 and an illegal integer
// type, widen both sides to avoid a trip through memory.
if ((SrcVT == MVT::v4i1 || SrcVT == MVT::v2i1) && VT.isScalarInteger() &&
- Subtarget.hasVLX()) {
+ Subtarget.hasAVX512()) {
SDLoc dl(N);
unsigned NumConcats = 8 / SrcVT.getVectorNumElements();
SmallVector<SDValue, 4> Ops(NumConcats, DAG.getUNDEF(SrcVT));
Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Sun Jan 7 10:20:37 2018
@@ -2962,46 +2962,77 @@ multiclass avx512_mask_shiftop_w<bits<8>
defm KSHIFTL : avx512_mask_shiftop_w<0x32, 0x33, "kshiftl", X86kshiftl, SSE_PSHUF>;
defm KSHIFTR : avx512_mask_shiftop_w<0x30, 0x31, "kshiftr", X86kshiftr, SSE_PSHUF>;
-multiclass axv512_icmp_packed_no_vlx_lowering<SDNode OpNode, string InstStr> {
-def : Pat<(v8i1 (OpNode (v8i32 VR256X:$src1), (v8i32 VR256X:$src2))),
- (COPY_TO_REGCLASS (!cast<Instruction>(InstStr##Zrr)
- (v16i32 (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src1, sub_ymm)),
- (v16i32 (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src2, sub_ymm))), VK8)>;
-
-def : Pat<(v8i1 (and VK8:$mask,
- (OpNode (v8i32 VR256X:$src1), (v8i32 VR256X:$src2)))),
+multiclass axv512_icmp_packed_no_vlx_lowering<SDNode OpNode, string InstStr,
+ X86VectorVTInfo Narrow,
+ X86VectorVTInfo Wide> {
+def : Pat<(Narrow.KVT (OpNode (Narrow.VT Narrow.RC:$src1),
+ (Narrow.VT Narrow.RC:$src2))),
+ (COPY_TO_REGCLASS
+ (!cast<Instruction>(InstStr##Zrr)
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src1, Narrow.SubRegIdx)),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src2, Narrow.SubRegIdx))),
+ Narrow.KRC)>;
+
+def : Pat<(Narrow.KVT (and Narrow.KRC:$mask,
+ (OpNode (Narrow.VT Narrow.RC:$src1),
+ (Narrow.VT Narrow.RC:$src2)))),
(COPY_TO_REGCLASS
(!cast<Instruction>(InstStr##Zrrk)
- (COPY_TO_REGCLASS VK8:$mask, VK16),
- (v16i32 (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src1, sub_ymm)),
- (v16i32 (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src2, sub_ymm))),
- VK8)>;
+ (COPY_TO_REGCLASS Narrow.KRC:$mask, Wide.KRC),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src1, Narrow.SubRegIdx)),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src2, Narrow.SubRegIdx))),
+ Narrow.KRC)>;
}
multiclass axv512_icmp_packed_cc_no_vlx_lowering<SDNode OpNode, string InstStr,
- AVX512VLVectorVTInfo _> {
-def : Pat<(v8i1 (OpNode (_.info256.VT VR256X:$src1), (_.info256.VT VR256X:$src2), imm:$cc)),
- (COPY_TO_REGCLASS (!cast<Instruction>(InstStr##Zrri)
- (_.info512.VT (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src1, sub_ymm)),
- (_.info512.VT (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src2, sub_ymm)),
- imm:$cc), VK8)>;
-
-def : Pat<(v8i1 (and VK8:$mask, (OpNode (_.info256.VT VR256X:$src1),
- (_.info256.VT VR256X:$src2), imm:$cc))),
- (COPY_TO_REGCLASS (!cast<Instruction>(InstStr##Zrrik)
- (COPY_TO_REGCLASS VK8:$mask, VK16),
- (_.info512.VT (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src1, sub_ymm)),
- (_.info512.VT (INSERT_SUBREG (IMPLICIT_DEF), VR256X:$src2, sub_ymm)),
- imm:$cc), VK8)>;
+ X86VectorVTInfo Narrow,
+ X86VectorVTInfo Wide> {
+def : Pat<(Narrow.KVT (OpNode (Narrow.VT Narrow.RC:$src1),
+ (Narrow.VT Narrow.RC:$src2), imm:$cc)),
+ (COPY_TO_REGCLASS
+ (!cast<Instruction>(InstStr##Zrri)
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src1, Narrow.SubRegIdx)),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src2, Narrow.SubRegIdx)),
+ imm:$cc), Narrow.KRC)>;
+
+def : Pat<(Narrow.KVT (and Narrow.KRC:$mask,
+ (OpNode (Narrow.VT Narrow.RC:$src1),
+ (Narrow.VT Narrow.RC:$src2), imm:$cc))),
+ (COPY_TO_REGCLASS (!cast<Instruction>(InstStr##Zrrik)
+ (COPY_TO_REGCLASS Narrow.KRC:$mask, Wide.KRC),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src1, Narrow.SubRegIdx)),
+ (Wide.VT (INSERT_SUBREG (IMPLICIT_DEF), Narrow.RC:$src2, Narrow.SubRegIdx)),
+ imm:$cc), Narrow.KRC)>;
}
let Predicates = [HasAVX512, NoVLX] in {
- defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTD">;
- defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpeqm, "VPCMPEQD">;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTD", v8i32x_info, v16i32_info>;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpeqm, "VPCMPEQD", v8i32x_info, v16i32_info>;
+
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTD", v4i32x_info, v16i32_info>;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpeqm, "VPCMPEQD", v4i32x_info, v16i32_info>;
+
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTQ", v4i64x_info, v8i64_info>;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpeqm, "VPCMPEQQ", v4i64x_info, v8i64_info>;
- defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VCMPPS", avx512vl_f32_info>;
- defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VPCMPD", avx512vl_i32_info>;
- defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpmu, "VPCMPUD", avx512vl_i32_info>;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpgtm, "VPCMPGTQ", v2i64x_info, v8i64_info>;
+ defm : axv512_icmp_packed_no_vlx_lowering<X86pcmpeqm, "VPCMPEQQ", v2i64x_info, v8i64_info>;
+
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VCMPPS", v8f32x_info, v16f32_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VPCMPD", v8i32x_info, v16i32_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpmu, "VPCMPUD", v8i32x_info, v16i32_info>;
+
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VCMPPS", v4f32x_info, v16f32_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VPCMPD", v4i32x_info, v16i32_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpmu, "VPCMPUD", v4i32x_info, v16i32_info>;
+
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VCMPPD", v4f64x_info, v8f64_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VPCMPQ", v4i64x_info, v8i64_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpmu, "VPCMPUQ", v4i64x_info, v8i64_info>;
+
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VCMPPD", v2f64x_info, v8f64_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpm, "VPCMPQ", v2i64x_info, v8i64_info>;
+ defm : axv512_icmp_packed_cc_no_vlx_lowering<X86cmpmu, "VPCMPUQ", v2i64x_info, v8i64_info>;
}
// Mask setting all 0s or 1s
@@ -3376,8 +3407,15 @@ multiclass mask_move_lowering<string Ins
// Patterns for handling v8i1 selects of 256-bit vectors when VLX isn't
// available. Use a 512-bit operation and extract.
let Predicates = [HasAVX512, NoVLX] in {
+ defm : mask_move_lowering<"VMOVAPSZ", v4f32x_info, v16f32_info>;
+ defm : mask_move_lowering<"VMOVDQA32Z", v4i32x_info, v16i32_info>;
defm : mask_move_lowering<"VMOVAPSZ", v8f32x_info, v16f32_info>;
defm : mask_move_lowering<"VMOVDQA32Z", v8i32x_info, v16i32_info>;
+
+ defm : mask_move_lowering<"VMOVAPDZ", v2f64x_info, v8f64_info>;
+ defm : mask_move_lowering<"VMOVDQA64Z", v2i64x_info, v8i64_info>;
+ defm : mask_move_lowering<"VMOVAPDZ", v4f64x_info, v8f64_info>;
+ defm : mask_move_lowering<"VMOVDQA64Z", v4i64x_info, v8i64_info>;
}
let Predicates = [HasAVX512] in {
Modified: llvm/trunk/lib/Target/X86/X86InstrVecCompiler.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrVecCompiler.td?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrVecCompiler.td (original)
+++ llvm/trunk/lib/Target/X86/X86InstrVecCompiler.td Sun Jan 7 10:20:37 2018
@@ -495,6 +495,18 @@ let Predicates = [HasBWI, HasVLX] in {
// If the bits are not zero we have to fall back to explicitly zeroing by
// using shifts.
+let Predicates = [HasAVX512] in {
+ def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
+ (v2i1 VK2:$mask), (iPTR 0))),
+ (KSHIFTRWri (KSHIFTLWri (COPY_TO_REGCLASS VK2:$mask, VK16),
+ (i8 14)), (i8 14))>;
+
+ def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
+ (v4i1 VK4:$mask), (iPTR 0))),
+ (KSHIFTRWri (KSHIFTLWri (COPY_TO_REGCLASS VK4:$mask, VK16),
+ (i8 12)), (i8 12))>;
+}
+
let Predicates = [HasAVX512, NoDQI] in {
def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
(v8i1 VK8:$mask), (iPTR 0))),
@@ -506,9 +518,7 @@ let Predicates = [HasDQI] in {
def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
(v8i1 VK8:$mask), (iPTR 0))),
(COPY_TO_REGCLASS (KMOVBkk VK8:$mask), VK16)>;
-}
-let Predicates = [HasVLX, HasDQI] in {
def : Pat<(v8i1 (insert_subvector (v8i1 immAllZerosV),
(v2i1 VK2:$mask), (iPTR 0))),
(KSHIFTRBri (KSHIFTLBri (COPY_TO_REGCLASS VK2:$mask, VK8),
@@ -519,17 +529,6 @@ let Predicates = [HasVLX, HasDQI] in {
(i8 4)), (i8 4))>;
}
-let Predicates = [HasVLX] in {
- def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
- (v2i1 VK2:$mask), (iPTR 0))),
- (KSHIFTRWri (KSHIFTLWri (COPY_TO_REGCLASS VK2:$mask, VK16),
- (i8 14)), (i8 14))>;
- def : Pat<(v16i1 (insert_subvector (v16i1 immAllZerosV),
- (v4i1 VK4:$mask), (iPTR 0))),
- (KSHIFTRWri (KSHIFTLWri (COPY_TO_REGCLASS VK4:$mask, VK16),
- (i8 12)), (i8 12))>;
-}
-
let Predicates = [HasBWI] in {
def : Pat<(v32i1 (insert_subvector (v32i1 immAllZerosV),
(v16i1 VK16:$mask), (iPTR 0))),
Modified: llvm/trunk/test/Analysis/CostModel/X86/cast.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/cast.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/Analysis/CostModel/X86/cast.ll (original)
+++ llvm/trunk/test/Analysis/CostModel/X86/cast.ll Sun Jan 7 10:20:37 2018
@@ -8,11 +8,17 @@ target triple = "x86_64-apple-macosx10.8
define i32 @add(i32 %arg) {
; CHECK-LABEL: for function 'add'
; -- Same size registeres --
- ;CHECK: cost of 1 {{.*}} zext
+ ;CHECK-AVX512: cost of 12 {{.*}} zext
+ ;CHECK-AVX2: cost of 1 {{.*}} zext
+ ;CHECK-AVX: cost of 1 {{.*}} zext
%A = zext <4 x i1> undef to <4 x i32>
- ;CHECK: cost of 2 {{.*}} sext
+ ;CHECK-AVX512: cost of 12 {{.*}} sext
+ ;CHECK-AVX2: cost of 2 {{.*}} sext
+ ;CHECK-AVX: cost of 2 {{.*}} sext
%B = sext <4 x i1> undef to <4 x i32>
- ;CHECK: cost of 0 {{.*}} trunc
+ ;CHECK-AVX512: cost of 0 {{.*}} trunc
+ ;CHECK-AVX2: cost of 0 {{.*}} trunc
+ ;CHECK-AVX: cost of 0 {{.*}} trunc
%C = trunc <4 x i32> undef to <4 x i1>
; -- Different size registers --
Modified: llvm/trunk/test/CodeGen/X86/avx512-cvt.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-cvt.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-cvt.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-cvt.ll Sun Jan 7 10:20:37 2018
@@ -702,9 +702,10 @@ define <4 x float> @f64to4f32_mask(<4 x
; NOVL-LABEL: f64to4f32_mask:
; NOVL: # %bb.0:
; NOVL-NEXT: vpslld $31, %xmm1, %xmm1
-; NOVL-NEXT: vpsrad $31, %xmm1, %xmm1
+; NOVL-NEXT: vptestmd %zmm1, %zmm1, %k1
; NOVL-NEXT: vcvtpd2ps %ymm0, %xmm0
-; NOVL-NEXT: vpand %xmm0, %xmm1, %xmm0
+; NOVL-NEXT: vmovaps %zmm0, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NOVL-NEXT: vzeroupper
; NOVL-NEXT: retq
;
@@ -743,9 +744,12 @@ define <8 x double> @f32to8f64(<8 x floa
define <4 x double> @f32to4f64_mask(<4 x float> %b, <4 x double> %b1, <4 x double> %a1) {
; NOVL-LABEL: f32to4f64_mask:
; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NOVL-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
; NOVL-NEXT: vcvtps2pd %xmm0, %ymm0
-; NOVL-NEXT: vcmpltpd %ymm2, %ymm1, %ymm1
-; NOVL-NEXT: vandpd %ymm0, %ymm1, %ymm0
+; NOVL-NEXT: vcmpltpd %zmm2, %zmm1, %k1
+; NOVL-NEXT: vmovapd %zmm0, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NOVL-NEXT: retq
;
; VL-LABEL: f32to4f64_mask:
@@ -1591,12 +1595,15 @@ define <8 x float> @sbto8f32(<8 x float>
}
define <4 x float> @sbto4f32(<4 x float> %a) {
-; NOVL-LABEL: sbto4f32:
-; NOVL: # %bb.0:
-; NOVL-NEXT: vxorps %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vcmpltps %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vcvtdq2ps %xmm0, %xmm0
-; NOVL-NEXT: retq
+; NOVLDQ-LABEL: sbto4f32:
+; NOVLDQ: # %bb.0:
+; NOVLDQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NOVLDQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; NOVLDQ-NEXT: vcmpltps %zmm0, %zmm1, %k1
+; NOVLDQ-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; NOVLDQ-NEXT: vcvtdq2ps %xmm0, %xmm0
+; NOVLDQ-NEXT: vzeroupper
+; NOVLDQ-NEXT: retq
;
; VLDQ-LABEL: sbto4f32:
; VLDQ: # %bb.0:
@@ -1614,19 +1621,30 @@ define <4 x float> @sbto4f32(<4 x float>
; VLNODQ-NEXT: vmovdqa32 %xmm0, %xmm0 {%k1} {z}
; VLNODQ-NEXT: vcvtdq2ps %xmm0, %xmm0
; VLNODQ-NEXT: retq
+;
+; AVX512DQ-LABEL: sbto4f32:
+; AVX512DQ: # %bb.0:
+; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX512DQ-NEXT: vcmpltps %zmm0, %zmm1, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: vcvtdq2ps %xmm0, %xmm0
+; AVX512DQ-NEXT: vzeroupper
+; AVX512DQ-NEXT: retq
%cmpres = fcmp ogt <4 x float> %a, zeroinitializer
%1 = sitofp <4 x i1> %cmpres to <4 x float>
ret <4 x float> %1
}
define <4 x double> @sbto4f64(<4 x double> %a) {
-; NOVL-LABEL: sbto4f64:
-; NOVL: # %bb.0:
-; NOVL-NEXT: vxorpd %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vcmpltpd %ymm0, %ymm1, %ymm0
-; NOVL-NEXT: vpmovqd %zmm0, %ymm0
-; NOVL-NEXT: vcvtdq2pd %xmm0, %ymm0
-; NOVL-NEXT: retq
+; NOVLDQ-LABEL: sbto4f64:
+; NOVLDQ: # %bb.0:
+; NOVLDQ-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NOVLDQ-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; NOVLDQ-NEXT: vcmpltpd %zmm0, %zmm1, %k1
+; NOVLDQ-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; NOVLDQ-NEXT: vcvtdq2pd %xmm0, %ymm0
+; NOVLDQ-NEXT: retq
;
; VLDQ-LABEL: sbto4f64:
; VLDQ: # %bb.0:
@@ -1644,18 +1662,30 @@ define <4 x double> @sbto4f64(<4 x doubl
; VLNODQ-NEXT: vmovdqa32 %xmm0, %xmm0 {%k1} {z}
; VLNODQ-NEXT: vcvtdq2pd %xmm0, %ymm0
; VLNODQ-NEXT: retq
+;
+; AVX512DQ-LABEL: sbto4f64:
+; AVX512DQ: # %bb.0:
+; AVX512DQ-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512DQ-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; AVX512DQ-NEXT: vcmpltpd %zmm0, %zmm1, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: vcvtdq2pd %xmm0, %ymm0
+; AVX512DQ-NEXT: retq
%cmpres = fcmp ogt <4 x double> %a, zeroinitializer
%1 = sitofp <4 x i1> %cmpres to <4 x double>
ret <4 x double> %1
}
define <2 x float> @sbto2f32(<2 x float> %a) {
-; NOVL-LABEL: sbto2f32:
-; NOVL: # %bb.0:
-; NOVL-NEXT: vxorps %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vcmpltps %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vcvtdq2ps %xmm0, %xmm0
-; NOVL-NEXT: retq
+; NOVLDQ-LABEL: sbto2f32:
+; NOVLDQ: # %bb.0:
+; NOVLDQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NOVLDQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; NOVLDQ-NEXT: vcmpltps %zmm0, %zmm1, %k1
+; NOVLDQ-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; NOVLDQ-NEXT: vcvtdq2ps %xmm0, %xmm0
+; NOVLDQ-NEXT: vzeroupper
+; NOVLDQ-NEXT: retq
;
; VLDQ-LABEL: sbto2f32:
; VLDQ: # %bb.0:
@@ -1673,19 +1703,31 @@ define <2 x float> @sbto2f32(<2 x float>
; VLNODQ-NEXT: vmovdqa32 %xmm0, %xmm0 {%k1} {z}
; VLNODQ-NEXT: vcvtdq2ps %xmm0, %xmm0
; VLNODQ-NEXT: retq
+;
+; AVX512DQ-LABEL: sbto2f32:
+; AVX512DQ: # %bb.0:
+; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX512DQ-NEXT: vcmpltps %zmm0, %zmm1, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: vcvtdq2ps %xmm0, %xmm0
+; AVX512DQ-NEXT: vzeroupper
+; AVX512DQ-NEXT: retq
%cmpres = fcmp ogt <2 x float> %a, zeroinitializer
%1 = sitofp <2 x i1> %cmpres to <2 x float>
ret <2 x float> %1
}
define <2 x double> @sbto2f64(<2 x double> %a) {
-; NOVL-LABEL: sbto2f64:
-; NOVL: # %bb.0:
-; NOVL-NEXT: vxorpd %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vcmpltpd %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; NOVL-NEXT: vcvtdq2pd %xmm0, %xmm0
-; NOVL-NEXT: retq
+; NOVLDQ-LABEL: sbto2f64:
+; NOVLDQ: # %bb.0:
+; NOVLDQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NOVLDQ-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; NOVLDQ-NEXT: vcmpltpd %zmm0, %zmm1, %k1
+; NOVLDQ-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; NOVLDQ-NEXT: vcvtdq2pd %xmm0, %xmm0
+; NOVLDQ-NEXT: vzeroupper
+; NOVLDQ-NEXT: retq
;
; VLDQ-LABEL: sbto2f64:
; VLDQ: # %bb.0:
@@ -1703,6 +1745,16 @@ define <2 x double> @sbto2f64(<2 x doubl
; VLNODQ-NEXT: vmovdqa32 %xmm0, %xmm0 {%k1} {z}
; VLNODQ-NEXT: vcvtdq2pd %xmm0, %xmm0
; VLNODQ-NEXT: retq
+;
+; AVX512DQ-LABEL: sbto2f64:
+; AVX512DQ: # %bb.0:
+; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512DQ-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; AVX512DQ-NEXT: vcmpltpd %zmm0, %zmm1, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: vcvtdq2pd %xmm0, %xmm0
+; AVX512DQ-NEXT: vzeroupper
+; AVX512DQ-NEXT: retq
%cmpres = fcmp ogt <2 x double> %a, zeroinitializer
%1 = sitofp <2 x i1> %cmpres to <2 x double>
ret <2 x double> %1
@@ -1925,10 +1977,12 @@ define <8 x double> @ubto8f64(<8 x i32>
define <4 x float> @ubto4f32(<4 x i32> %a) {
; NOVL-LABEL: ubto4f32:
; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NOVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vpbroadcastd {{.*#+}} xmm1 = [1,1,1,1]
-; NOVL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; NOVL-NEXT: vpcmpgtd %zmm0, %zmm1, %k1
+; NOVL-NEXT: vpbroadcastd {{.*}}(%rip), %zmm0 {%k1} {z}
+; NOVL-NEXT: vcvtdq2ps %xmm0, %xmm0
+; NOVL-NEXT: vzeroupper
; NOVL-NEXT: retq
;
; VL-LABEL: ubto4f32:
@@ -1946,9 +2000,10 @@ define <4 x float> @ubto4f32(<4 x i32> %
define <4 x double> @ubto4f64(<4 x i32> %a) {
; NOVL-LABEL: ubto4f64:
; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NOVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
-; NOVL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vpsrld $31, %xmm0, %xmm0
+; NOVL-NEXT: vpcmpgtd %zmm0, %zmm1, %k1
+; NOVL-NEXT: vpbroadcastd {{.*}}(%rip), %zmm0 {%k1} {z}
; NOVL-NEXT: vcvtdq2pd %xmm0, %ymm0
; NOVL-NEXT: retq
;
@@ -1969,14 +2024,10 @@ define <2 x float> @ubto2f32(<2 x i32> %
; NOVL: # %bb.0:
; NOVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
; NOVL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
-; NOVL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vpextrb $8, %xmm0, %eax
-; NOVL-NEXT: andl $1, %eax
-; NOVL-NEXT: vcvtsi2ssl %eax, %xmm2, %xmm1
-; NOVL-NEXT: vpextrb $0, %xmm0, %eax
-; NOVL-NEXT: andl $1, %eax
-; NOVL-NEXT: vcvtsi2ssl %eax, %xmm2, %xmm0
-; NOVL-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[2,3]
+; NOVL-NEXT: vpcmpltuq %zmm1, %zmm0, %k1
+; NOVL-NEXT: vpbroadcastd {{.*}}(%rip), %zmm0 {%k1} {z}
+; NOVL-NEXT: vcvtdq2ps %xmm0, %xmm0
+; NOVL-NEXT: vzeroupper
; NOVL-NEXT: retq
;
; VL-LABEL: ubto2f32:
@@ -1997,10 +2048,8 @@ define <2 x double> @ubto2f64(<2 x i32>
; NOVL: # %bb.0:
; NOVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
; NOVL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
-; NOVL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NOVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; NOVL-NEXT: vpbroadcastd {{.*#+}} xmm1 = [1,1,1,1]
-; NOVL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; NOVL-NEXT: vpcmpltuq %zmm1, %zmm0, %k1
+; NOVL-NEXT: vpbroadcastd {{.*}}(%rip), %zmm0 {%k1} {z}
; NOVL-NEXT: vcvtudq2pd %ymm0, %zmm0
; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NOVL-NEXT: vzeroupper
@@ -2020,19 +2069,27 @@ define <2 x double> @ubto2f64(<2 x i32>
}
define <2 x i64> @test_2f64toub(<2 x double> %a, <2 x i64> %passthru) {
-; NOVLDQ-LABEL: test_2f64toub:
-; NOVLDQ: # %bb.0:
-; NOVLDQ-NEXT: vcvttsd2usi %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm2
-; NOVLDQ-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
-; NOVLDQ-NEXT: vcvttsd2usi %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm0
-; NOVLDQ-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
-; NOVLDQ-NEXT: vpsllq $63, %xmm0, %xmm0
-; NOVLDQ-NEXT: vpsraq $63, %zmm0, %zmm0
-; NOVLDQ-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NOVLDQ-NEXT: vzeroupper
-; NOVLDQ-NEXT: retq
+; KNL-LABEL: test_2f64toub:
+; KNL: # %bb.0:
+; KNL-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; KNL-NEXT: vcvttsd2si %xmm2, %eax
+; KNL-NEXT: kmovw %eax, %k0
+; KNL-NEXT: vcvttsd2si %xmm0, %eax
+; KNL-NEXT: andl $1, %eax
+; KNL-NEXT: kmovw %eax, %k1
+; KNL-NEXT: kshiftrw $1, %k0, %k2
+; KNL-NEXT: kshiftlw $1, %k2, %k2
+; KNL-NEXT: korw %k1, %k2, %k1
+; KNL-NEXT: kshiftrw $1, %k1, %k2
+; KNL-NEXT: kxorw %k0, %k2, %k0
+; KNL-NEXT: kshiftlw $15, %k0, %k0
+; KNL-NEXT: kshiftrw $14, %k0, %k0
+; KNL-NEXT: kxorw %k1, %k0, %k1
+; KNL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
+; KNL-NEXT: retq
;
; VL-LABEL: test_2f64toub:
; VL: # %bb.0:
@@ -2044,13 +2101,47 @@ define <2 x i64> @test_2f64toub(<2 x dou
;
; AVX512DQ-LABEL: test_2f64toub:
; AVX512DQ: # %bb.0:
-; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; AVX512DQ-NEXT: vcvttpd2uqq %zmm0, %zmm0
-; AVX512DQ-NEXT: vpsllq $63, %xmm0, %xmm0
-; AVX512DQ-NEXT: vpsraq $63, %zmm0, %zmm0
-; AVX512DQ-NEXT: vpand %xmm1, %xmm0, %xmm0
+; AVX512DQ-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512DQ-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; AVX512DQ-NEXT: vcvttsd2si %xmm2, %eax
+; AVX512DQ-NEXT: kmovw %eax, %k0
+; AVX512DQ-NEXT: vcvttsd2si %xmm0, %eax
+; AVX512DQ-NEXT: andl $1, %eax
+; AVX512DQ-NEXT: kmovw %eax, %k1
+; AVX512DQ-NEXT: kshiftrw $1, %k0, %k2
+; AVX512DQ-NEXT: kshiftlw $1, %k2, %k2
+; AVX512DQ-NEXT: korw %k1, %k2, %k1
+; AVX512DQ-NEXT: kshiftrw $1, %k1, %k2
+; AVX512DQ-NEXT: kxorw %k0, %k2, %k0
+; AVX512DQ-NEXT: kshiftlw $15, %k0, %k0
+; AVX512DQ-NEXT: kshiftrw $14, %k0, %k0
+; AVX512DQ-NEXT: kxorw %k1, %k0, %k1
+; AVX512DQ-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
+;
+; AVX512BW-LABEL: test_2f64toub:
+; AVX512BW: # %bb.0:
+; AVX512BW-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512BW-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; AVX512BW-NEXT: vcvttsd2si %xmm2, %eax
+; AVX512BW-NEXT: kmovd %eax, %k0
+; AVX512BW-NEXT: vcvttsd2si %xmm0, %eax
+; AVX512BW-NEXT: andl $1, %eax
+; AVX512BW-NEXT: kmovw %eax, %k1
+; AVX512BW-NEXT: kshiftrw $1, %k0, %k2
+; AVX512BW-NEXT: kshiftlw $1, %k2, %k2
+; AVX512BW-NEXT: korw %k1, %k2, %k1
+; AVX512BW-NEXT: kshiftrw $1, %k1, %k2
+; AVX512BW-NEXT: kxorw %k0, %k2, %k0
+; AVX512BW-NEXT: kshiftlw $15, %k0, %k0
+; AVX512BW-NEXT: kshiftrw $14, %k0, %k0
+; AVX512BW-NEXT: kxorw %k1, %k0, %k1
+; AVX512BW-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; AVX512BW-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512BW-NEXT: vzeroupper
+; AVX512BW-NEXT: retq
%mask = fptoui <2 x double> %a to <2 x i1>
%select = select <2 x i1> %mask, <2 x i64> %passthru, <2 x i64> zeroinitializer
ret <2 x i64> %select
@@ -2059,12 +2150,12 @@ define <2 x i64> @test_2f64toub(<2 x dou
define <4 x i64> @test_4f64toub(<4 x double> %a, <4 x i64> %passthru) {
; NOVL-LABEL: test_4f64toub:
; NOVL: # %bb.0:
-; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
-; NOVL-NEXT: vcvttpd2udq %zmm0, %ymm0
+; NOVL-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NOVL-NEXT: vcvttpd2dq %ymm0, %xmm0
; NOVL-NEXT: vpslld $31, %xmm0, %xmm0
-; NOVL-NEXT: vpsrad $31, %xmm0, %xmm0
-; NOVL-NEXT: vpmovsxdq %xmm0, %ymm0
-; NOVL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NOVL-NEXT: retq
;
; VL-LABEL: test_4f64toub:
@@ -2101,19 +2192,16 @@ define <8 x i64> @test_8f64toub(<8 x dou
}
define <2 x i64> @test_2f32toub(<2 x float> %a, <2 x i64> %passthru) {
-; NOVLDQ-LABEL: test_2f32toub:
-; NOVLDQ: # %bb.0:
-; NOVLDQ-NEXT: vcvttss2usi %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm2
-; NOVLDQ-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
-; NOVLDQ-NEXT: vcvttss2usi %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm0
-; NOVLDQ-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
-; NOVLDQ-NEXT: vpsllq $63, %xmm0, %xmm0
-; NOVLDQ-NEXT: vpsraq $63, %zmm0, %zmm0
-; NOVLDQ-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NOVLDQ-NEXT: vzeroupper
-; NOVLDQ-NEXT: retq
+; NOVL-LABEL: test_2f32toub:
+; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NOVL-NEXT: vcvttps2dq %xmm0, %xmm0
+; NOVL-NEXT: vpslld $31, %xmm0, %xmm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; NOVL-NEXT: vzeroupper
+; NOVL-NEXT: retq
;
; VL-LABEL: test_2f32toub:
; VL: # %bb.0:
@@ -2122,16 +2210,6 @@ define <2 x i64> @test_2f32toub(<2 x flo
; VL-NEXT: vptestmd %xmm0, %xmm0, %k1
; VL-NEXT: vmovdqa64 %xmm1, %xmm0 {%k1} {z}
; VL-NEXT: retq
-;
-; AVX512DQ-LABEL: test_2f32toub:
-; AVX512DQ: # %bb.0:
-; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; AVX512DQ-NEXT: vcvttps2uqq %ymm0, %zmm0
-; AVX512DQ-NEXT: vpsllq $63, %xmm0, %xmm0
-; AVX512DQ-NEXT: vpsraq $63, %zmm0, %zmm0
-; AVX512DQ-NEXT: vpand %xmm1, %xmm0, %xmm0
-; AVX512DQ-NEXT: vzeroupper
-; AVX512DQ-NEXT: retq
%mask = fptoui <2 x float> %a to <2 x i1>
%select = select <2 x i1> %mask, <2 x i64> %passthru, <2 x i64> zeroinitializer
ret <2 x i64> %select
@@ -2140,12 +2218,12 @@ define <2 x i64> @test_2f32toub(<2 x flo
define <4 x i64> @test_4f32toub(<4 x float> %a, <4 x i64> %passthru) {
; NOVL-LABEL: test_4f32toub:
; NOVL: # %bb.0:
-; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; NOVL-NEXT: vcvttps2udq %zmm0, %zmm0
+; NOVL-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NOVL-NEXT: vcvttps2dq %xmm0, %xmm0
; NOVL-NEXT: vpslld $31, %xmm0, %xmm0
-; NOVL-NEXT: vpsrad $31, %xmm0, %xmm0
-; NOVL-NEXT: vpmovsxdq %xmm0, %ymm0
-; NOVL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NOVL-NEXT: retq
;
; VL-LABEL: test_4f32toub:
@@ -2195,16 +2273,27 @@ define <16 x i32> @test_16f32toub(<16 x
}
define <2 x i64> @test_2f64tosb(<2 x double> %a, <2 x i64> %passthru) {
-; NOVLDQ-LABEL: test_2f64tosb:
-; NOVLDQ: # %bb.0:
-; NOVLDQ-NEXT: vcvttsd2si %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm2
-; NOVLDQ-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
-; NOVLDQ-NEXT: vcvttsd2si %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm0
-; NOVLDQ-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
-; NOVLDQ-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NOVLDQ-NEXT: retq
+; KNL-LABEL: test_2f64tosb:
+; KNL: # %bb.0:
+; KNL-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; KNL-NEXT: vcvttsd2si %xmm2, %eax
+; KNL-NEXT: kmovw %eax, %k0
+; KNL-NEXT: vcvttsd2si %xmm0, %eax
+; KNL-NEXT: andl $1, %eax
+; KNL-NEXT: kmovw %eax, %k1
+; KNL-NEXT: kshiftrw $1, %k0, %k2
+; KNL-NEXT: kshiftlw $1, %k2, %k2
+; KNL-NEXT: korw %k1, %k2, %k1
+; KNL-NEXT: kshiftrw $1, %k1, %k2
+; KNL-NEXT: kxorw %k0, %k2, %k0
+; KNL-NEXT: kshiftlw $15, %k0, %k0
+; KNL-NEXT: kshiftrw $14, %k0, %k0
+; KNL-NEXT: kxorw %k1, %k0, %k1
+; KNL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
+; KNL-NEXT: retq
;
; VL-LABEL: test_2f64tosb:
; VL: # %bb.0:
@@ -2216,11 +2305,47 @@ define <2 x i64> @test_2f64tosb(<2 x dou
;
; AVX512DQ-LABEL: test_2f64tosb:
; AVX512DQ: # %bb.0:
-; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; AVX512DQ-NEXT: vcvttpd2qq %zmm0, %zmm0
-; AVX512DQ-NEXT: vandps %xmm1, %xmm0, %xmm0
+; AVX512DQ-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512DQ-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; AVX512DQ-NEXT: vcvttsd2si %xmm2, %eax
+; AVX512DQ-NEXT: kmovw %eax, %k0
+; AVX512DQ-NEXT: vcvttsd2si %xmm0, %eax
+; AVX512DQ-NEXT: andl $1, %eax
+; AVX512DQ-NEXT: kmovw %eax, %k1
+; AVX512DQ-NEXT: kshiftrw $1, %k0, %k2
+; AVX512DQ-NEXT: kshiftlw $1, %k2, %k2
+; AVX512DQ-NEXT: korw %k1, %k2, %k1
+; AVX512DQ-NEXT: kshiftrw $1, %k1, %k2
+; AVX512DQ-NEXT: kxorw %k0, %k2, %k0
+; AVX512DQ-NEXT: kshiftlw $15, %k0, %k0
+; AVX512DQ-NEXT: kshiftrw $14, %k0, %k0
+; AVX512DQ-NEXT: kxorw %k1, %k0, %k1
+; AVX512DQ-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
+;
+; AVX512BW-LABEL: test_2f64tosb:
+; AVX512BW: # %bb.0:
+; AVX512BW-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512BW-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
+; AVX512BW-NEXT: vcvttsd2si %xmm2, %eax
+; AVX512BW-NEXT: kmovd %eax, %k0
+; AVX512BW-NEXT: vcvttsd2si %xmm0, %eax
+; AVX512BW-NEXT: andl $1, %eax
+; AVX512BW-NEXT: kmovw %eax, %k1
+; AVX512BW-NEXT: kshiftrw $1, %k0, %k2
+; AVX512BW-NEXT: kshiftlw $1, %k2, %k2
+; AVX512BW-NEXT: korw %k1, %k2, %k1
+; AVX512BW-NEXT: kshiftrw $1, %k1, %k2
+; AVX512BW-NEXT: kxorw %k0, %k2, %k0
+; AVX512BW-NEXT: kshiftlw $15, %k0, %k0
+; AVX512BW-NEXT: kshiftrw $14, %k0, %k0
+; AVX512BW-NEXT: kxorw %k1, %k0, %k1
+; AVX512BW-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; AVX512BW-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512BW-NEXT: vzeroupper
+; AVX512BW-NEXT: retq
%mask = fptosi <2 x double> %a to <2 x i1>
%select = select <2 x i1> %mask, <2 x i64> %passthru, <2 x i64> zeroinitializer
ret <2 x i64> %select
@@ -2229,9 +2354,11 @@ define <2 x i64> @test_2f64tosb(<2 x dou
define <4 x i64> @test_4f64tosb(<4 x double> %a, <4 x i64> %passthru) {
; NOVL-LABEL: test_4f64tosb:
; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
; NOVL-NEXT: vcvttpd2dq %ymm0, %xmm0
-; NOVL-NEXT: vpmovsxdq %xmm0, %ymm0
-; NOVL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NOVL-NEXT: retq
;
; VL-LABEL: test_4f64tosb:
@@ -2265,16 +2392,15 @@ define <8 x i64> @test_8f64tosb(<8 x dou
}
define <2 x i64> @test_2f32tosb(<2 x float> %a, <2 x i64> %passthru) {
-; NOVLDQ-LABEL: test_2f32tosb:
-; NOVLDQ: # %bb.0:
-; NOVLDQ-NEXT: vcvttss2si %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm2
-; NOVLDQ-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
-; NOVLDQ-NEXT: vcvttss2si %xmm0, %rax
-; NOVLDQ-NEXT: vmovq %rax, %xmm0
-; NOVLDQ-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
-; NOVLDQ-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NOVLDQ-NEXT: retq
+; NOVL-LABEL: test_2f32tosb:
+; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NOVL-NEXT: vcvttps2dq %xmm0, %xmm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; NOVL-NEXT: vzeroupper
+; NOVL-NEXT: retq
;
; VL-LABEL: test_2f32tosb:
; VL: # %bb.0:
@@ -2282,14 +2408,6 @@ define <2 x i64> @test_2f32tosb(<2 x flo
; VL-NEXT: vptestmd %xmm0, %xmm0, %k1
; VL-NEXT: vmovdqa64 %xmm1, %xmm0 {%k1} {z}
; VL-NEXT: retq
-;
-; AVX512DQ-LABEL: test_2f32tosb:
-; AVX512DQ: # %bb.0:
-; AVX512DQ-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; AVX512DQ-NEXT: vcvttps2qq %ymm0, %zmm0
-; AVX512DQ-NEXT: vandps %xmm1, %xmm0, %xmm0
-; AVX512DQ-NEXT: vzeroupper
-; AVX512DQ-NEXT: retq
%mask = fptosi <2 x float> %a to <2 x i1>
%select = select <2 x i1> %mask, <2 x i64> %passthru, <2 x i64> zeroinitializer
ret <2 x i64> %select
@@ -2298,9 +2416,11 @@ define <2 x i64> @test_2f32tosb(<2 x flo
define <4 x i64> @test_4f32tosb(<4 x float> %a, <4 x i64> %passthru) {
; NOVL-LABEL: test_4f32tosb:
; NOVL: # %bb.0:
+; NOVL-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
; NOVL-NEXT: vcvttps2dq %xmm0, %xmm0
-; NOVL-NEXT: vpmovsxdq %xmm0, %ymm0
-; NOVL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; NOVL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; NOVL-NEXT: vmovdqa64 %zmm1, %zmm0 {%k1} {z}
+; NOVL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NOVL-NEXT: retq
;
; VL-LABEL: test_4f32tosb:
Modified: llvm/trunk/test/CodeGen/X86/avx512-ext.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-ext.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-ext.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-ext.ll Sun Jan 7 10:20:37 2018
@@ -301,9 +301,10 @@ define <4 x i32> @zext_4x8mem_to_4x32(<4
; KNL-LABEL: zext_4x8mem_to_4x32:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovzxbd {{.*#+}} xmm1 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxbd {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
+; KNL-NEXT: vmovdqa32 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x8mem_to_4x32:
@@ -322,9 +323,10 @@ define <4 x i32> @sext_4x8mem_to_4x32(<4
; KNL-LABEL: sext_4x8mem_to_4x32:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovsxbd (%rdi), %xmm1
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxbd (%rdi), %xmm0
+; KNL-NEXT: vmovdqa32 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_4x8mem_to_4x32:
@@ -489,9 +491,10 @@ define <2 x i64> @zext_2x8mem_to_2x64(<2
; KNL-LABEL: zext_2x8mem_to_2x64:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovzxbq {{.*#+}} xmm1 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxbq {{.*#+}} xmm0 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_2x8mem_to_2x64:
@@ -509,9 +512,10 @@ define <2 x i64> @sext_2x8mem_to_2x64mas
; KNL-LABEL: sext_2x8mem_to_2x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovsxbq (%rdi), %xmm1
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxbq (%rdi), %xmm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_2x8mem_to_2x64mask:
@@ -539,10 +543,10 @@ define <4 x i64> @zext_4x8mem_to_4x64(<4
; KNL-LABEL: zext_4x8mem_to_4x64:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
-; KNL-NEXT: vpmovzxbq {{.*#+}} ymm1 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero,mem[2],zero,zero,zero,zero,zero,zero,zero,mem[3],zero,zero,zero,zero,zero,zero,zero
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxbq {{.*#+}} ymm0 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero,mem[2],zero,zero,zero,zero,zero,zero,zero,mem[3],zero,zero,zero,zero,zero,zero,zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x8mem_to_4x64:
@@ -561,10 +565,10 @@ define <4 x i64> @sext_4x8mem_to_4x64mas
; KNL-LABEL: sext_4x8mem_to_4x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovsxdq %xmm0, %ymm0
-; KNL-NEXT: vpmovsxbq (%rdi), %ymm1
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxbq (%rdi), %ymm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_4x8mem_to_4x64mask:
@@ -645,9 +649,10 @@ define <4 x i32> @zext_4x16mem_to_4x32(<
; KNL-LABEL: zext_4x16mem_to_4x32:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovzxwd {{.*#+}} xmm1 = mem[0],zero,mem[1],zero,mem[2],zero,mem[3],zero
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxwd {{.*#+}} xmm0 = mem[0],zero,mem[1],zero,mem[2],zero,mem[3],zero
+; KNL-NEXT: vmovdqa32 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x16mem_to_4x32:
@@ -666,9 +671,10 @@ define <4 x i32> @sext_4x16mem_to_4x32ma
; KNL-LABEL: sext_4x16mem_to_4x32mask:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovsxwd (%rdi), %xmm1
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxwd (%rdi), %xmm0
+; KNL-NEXT: vmovdqa32 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_4x16mem_to_4x32mask:
@@ -865,9 +871,10 @@ define <2 x i64> @zext_2x16mem_to_2x64(<
; KNL-LABEL: zext_2x16mem_to_2x64:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovzxwq {{.*#+}} xmm1 = mem[0],zero,zero,zero,mem[1],zero,zero,zero
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxwq {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_2x16mem_to_2x64:
@@ -886,9 +893,10 @@ define <2 x i64> @sext_2x16mem_to_2x64ma
; KNL-LABEL: sext_2x16mem_to_2x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovsxwq (%rdi), %xmm1
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxwq (%rdi), %xmm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_2x16mem_to_2x64mask:
@@ -917,10 +925,10 @@ define <4 x i64> @zext_4x16mem_to_4x64(<
; KNL-LABEL: zext_4x16mem_to_4x64:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
-; KNL-NEXT: vpmovzxwq {{.*#+}} ymm1 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxwq {{.*#+}} ymm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x16mem_to_4x64:
@@ -939,10 +947,10 @@ define <4 x i64> @sext_4x16mem_to_4x64ma
; KNL-LABEL: sext_4x16mem_to_4x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovsxdq %xmm0, %ymm0
-; KNL-NEXT: vpmovsxwq (%rdi), %ymm1
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxwq (%rdi), %ymm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_4x16mem_to_4x64mask:
@@ -1052,9 +1060,10 @@ define <2 x i64> @zext_2x32mem_to_2x64(<
; KNL-LABEL: zext_2x32mem_to_2x64:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovzxdq {{.*#+}} xmm1 = mem[0],zero,mem[1],zero
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxdq {{.*#+}} xmm0 = mem[0],zero,mem[1],zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_2x32mem_to_2x64:
@@ -1073,9 +1082,10 @@ define <2 x i64> @sext_2x32mem_to_2x64ma
; KNL-LABEL: sext_2x32mem_to_2x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
-; KNL-NEXT: vpsraq $63, %zmm0, %zmm0
-; KNL-NEXT: vpmovsxdq (%rdi), %xmm1
-; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
+; KNL-NEXT: vptestmq %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxdq (%rdi), %xmm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_2x32mem_to_2x64mask:
@@ -1104,10 +1114,10 @@ define <4 x i64> @zext_4x32mem_to_4x64(<
; KNL-LABEL: zext_4x32mem_to_4x64:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
-; KNL-NEXT: vpmovzxdq {{.*#+}} ymm1 = mem[0],zero,mem[1],zero,mem[2],zero,mem[3],zero
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovzxdq {{.*#+}} ymm0 = mem[0],zero,mem[1],zero,mem[2],zero,mem[3],zero
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x32mem_to_4x64:
@@ -1126,10 +1136,10 @@ define <4 x i64> @sext_4x32mem_to_4x64ma
; KNL-LABEL: sext_4x32mem_to_4x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm0, %xmm0
-; KNL-NEXT: vpsrad $31, %xmm0, %xmm0
-; KNL-NEXT: vpmovsxdq %xmm0, %ymm0
-; KNL-NEXT: vpmovsxdq (%rdi), %ymm1
-; KNL-NEXT: vpand %ymm1, %ymm0, %ymm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpmovsxdq (%rdi), %ymm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: sext_4x32mem_to_4x64mask:
@@ -1167,10 +1177,10 @@ define <4 x i64> @zext_4x32_to_4x64mask(
; KNL-LABEL: zext_4x32_to_4x64mask:
; KNL: # %bb.0:
; KNL-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL-NEXT: vpmovzxdq {{.*#+}} ymm1 = xmm1[0],zero,xmm1[1],zero,xmm1[2],zero,xmm1[3],zero
+; KNL-NEXT: vptestmd %zmm1, %zmm1, %k1
; KNL-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
-; KNL-NEXT: vpand %ymm0, %ymm1, %ymm0
+; KNL-NEXT: vmovdqa64 %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: zext_4x32_to_4x64mask:
Modified: llvm/trunk/test/CodeGen/X86/avx512-insert-extract.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-insert-extract.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-insert-extract.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-insert-extract.ll Sun Jan 7 10:20:37 2018
@@ -844,40 +844,20 @@ define i32 @test_insertelement_v32i1(i32
define i8 @test_iinsertelement_v4i1(i32 %a, i32 %b, <4 x i32> %x , <4 x i32> %y) {
; KNL-LABEL: test_iinsertelement_v4i1:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: cmpl %esi, %edi
; KNL-NEXT: setb %al
-; KNL-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; KNL-NEXT: vpextrb $4, %xmm0, %ecx
-; KNL-NEXT: kmovw %ecx, %k0
-; KNL-NEXT: vpextrb $0, %xmm0, %ecx
-; KNL-NEXT: andl $1, %ecx
-; KNL-NEXT: kmovw %ecx, %k1
-; KNL-NEXT: kshiftrw $1, %k0, %k2
-; KNL-NEXT: kshiftlw $1, %k2, %k2
-; KNL-NEXT: korw %k1, %k2, %k1
-; KNL-NEXT: kshiftrw $1, %k1, %k2
-; KNL-NEXT: kxorw %k0, %k2, %k0
-; KNL-NEXT: kshiftlw $15, %k0, %k0
-; KNL-NEXT: kshiftrw $14, %k0, %k0
-; KNL-NEXT: kxorw %k1, %k0, %k0
+; KNL-NEXT: vpcmpltud %zmm1, %zmm0, %k0
; KNL-NEXT: kshiftrw $2, %k0, %k1
; KNL-NEXT: kmovw %eax, %k2
; KNL-NEXT: kxorw %k2, %k1, %k1
; KNL-NEXT: kshiftlw $15, %k1, %k1
; KNL-NEXT: kshiftrw $13, %k1, %k1
; KNL-NEXT: kxorw %k0, %k1, %k0
-; KNL-NEXT: kshiftrw $3, %k0, %k1
-; KNL-NEXT: vpextrb $12, %xmm0, %eax
-; KNL-NEXT: kmovw %eax, %k2
-; KNL-NEXT: kxorw %k2, %k1, %k1
-; KNL-NEXT: kshiftlw $15, %k1, %k1
-; KNL-NEXT: kshiftrw $12, %k1, %k1
-; KNL-NEXT: kxorw %k0, %k1, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: ## kill: def %al killed %al killed %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_iinsertelement_v4i1:
@@ -905,18 +885,11 @@ define i8 @test_iinsertelement_v4i1(i32
define i8 @test_iinsertelement_v2i1(i32 %a, i32 %b, <2 x i64> %x , <2 x i64> %y) {
; KNL-LABEL: test_iinsertelement_v2i1:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: cmpl %esi, %edi
; KNL-NEXT: setb %al
-; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; KNL-NEXT: vpextrb $0, %xmm0, %ecx
-; KNL-NEXT: andl $1, %ecx
-; KNL-NEXT: kmovw %ecx, %k0
-; KNL-NEXT: kshiftrw $1, %k0, %k1
-; KNL-NEXT: kshiftlw $1, %k1, %k1
-; KNL-NEXT: korw %k0, %k1, %k0
+; KNL-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
; KNL-NEXT: kshiftrw $1, %k0, %k1
; KNL-NEXT: kmovw %eax, %k2
; KNL-NEXT: kxorw %k2, %k1, %k1
@@ -925,6 +898,7 @@ define i8 @test_iinsertelement_v2i1(i32
; KNL-NEXT: kxorw %k0, %k1, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: ## kill: def %al killed %al killed %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_iinsertelement_v2i1:
@@ -952,15 +926,15 @@ define i8 @test_iinsertelement_v2i1(i32
define zeroext i8 @test_extractelement_v2i1(<2 x i64> %a, <2 x i64> %b) {
; KNL-LABEL: test_extractelement_v2i1:
; KNL: ## %bb.0:
-; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vpextrb $0, %xmm0, %eax
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpnleuq %zmm1, %zmm0, %k0
+; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: andb $1, %al
; KNL-NEXT: movb $4, %cl
; KNL-NEXT: subb %al, %cl
; KNL-NEXT: movzbl %cl, %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_extractelement_v2i1:
@@ -981,15 +955,15 @@ define zeroext i8 @test_extractelement_v
define zeroext i8 @extractelement_v2i1_alt(<2 x i64> %a, <2 x i64> %b) {
; KNL-LABEL: extractelement_v2i1_alt:
; KNL: ## %bb.0:
-; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vpextrb $0, %xmm0, %eax
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpnleuq %zmm1, %zmm0, %k0
+; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: andb $1, %al
; KNL-NEXT: movb $4, %cl
; KNL-NEXT: subb %al, %cl
; KNL-NEXT: movzbl %cl, %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: extractelement_v2i1_alt:
@@ -1011,12 +985,13 @@ define zeroext i8 @extractelement_v2i1_a
define zeroext i8 @test_extractelement_v4i1(<4 x i32> %a, <4 x i32> %b) {
; KNL-LABEL: test_extractelement_v4i1:
; KNL: ## %bb.0:
-; KNL-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vpextrd $3, %xmm0, %eax
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpnleud %zmm1, %zmm0, %k0
+; KNL-NEXT: kshiftrw $3, %k0, %k0
+; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: andl $1, %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_extractelement_v4i1:
@@ -1550,14 +1525,15 @@ define zeroext i8 @test_extractelement_v
; KNL-LABEL: test_extractelement_varible_v2i1:
; KNL: ## %bb.0:
; KNL-NEXT: ## kill: def %edi killed %edi def %rdi
-; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vmovdqa %xmm0, -{{[0-9]+}}(%rsp)
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpnleuq %zmm1, %zmm0, %k1
+; KNL-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: vextracti32x4 $0, %zmm0, -{{[0-9]+}}(%rsp)
; KNL-NEXT: andl $1, %edi
-; KNL-NEXT: movl -24(%rsp,%rdi,8), %eax
+; KNL-NEXT: movzbl -24(%rsp,%rdi,8), %eax
; KNL-NEXT: andl $1, %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_extractelement_varible_v2i1:
@@ -1580,14 +1556,15 @@ define zeroext i8 @test_extractelement_v
; KNL-LABEL: test_extractelement_varible_v4i1:
; KNL: ## %bb.0:
; KNL-NEXT: ## kill: def %edi killed %edi def %rdi
-; KNL-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; KNL-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; KNL-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vmovdqa %xmm0, -{{[0-9]+}}(%rsp)
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpnleud %zmm1, %zmm0, %k1
+; KNL-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: vextracti32x4 $0, %zmm0, -{{[0-9]+}}(%rsp)
; KNL-NEXT: andl $3, %edi
-; KNL-NEXT: movl -24(%rsp,%rdi,4), %eax
+; KNL-NEXT: movzbl -24(%rsp,%rdi,4), %eax
; KNL-NEXT: andl $1, %eax
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test_extractelement_varible_v4i1:
Modified: llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll Sun Jan 7 10:20:37 2018
@@ -3004,20 +3004,8 @@ declare <8 x i64> @llvm.x86.avx512.mask.
define <4 x float> @test_mask_vextractf32x4(<4 x float> %b, <16 x float> %a, i8 %mask) {
; CHECK-LABEL: test_mask_vextractf32x4:
; CHECK: ## %bb.0:
-; CHECK-NEXT: vmovd %edi, %xmm2
-; CHECK-NEXT: kmovw %edi, %k0
-; CHECK-NEXT: kshiftrw $3, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %eax
-; CHECK-NEXT: kshiftrw $2, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %ecx
-; CHECK-NEXT: kshiftrw $1, %k0, %k0
-; CHECK-NEXT: kmovw %k0, %edx
-; CHECK-NEXT: vpinsrb $4, %edx, %xmm2, %xmm2
-; CHECK-NEXT: vpinsrb $8, %ecx, %xmm2, %xmm2
-; CHECK-NEXT: vpinsrb $12, %eax, %xmm2, %xmm2
-; CHECK-NEXT: vextractf32x4 $2, %zmm1, %xmm1
-; CHECK-NEXT: vpslld $31, %xmm2, %xmm2
-; CHECK-NEXT: vblendvps %xmm2, %xmm1, %xmm0, %xmm0
+; CHECK-NEXT: kmovw %edi, %k1
+; CHECK-NEXT: vextractf32x4 $2, %zmm1, %xmm0 {%k1}
; CHECK-NEXT: retq
%res = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.512(<16 x float> %a, i32 2, <4 x float> %b, i8 %mask)
ret <4 x float> %res
@@ -3028,21 +3016,8 @@ declare <4 x float> @llvm.x86.avx512.mas
define <4 x i64> @test_mask_vextracti64x4(<4 x i64> %b, <8 x i64> %a, i8 %mask) {
; CHECK-LABEL: test_mask_vextracti64x4:
; CHECK: ## %bb.0:
-; CHECK-NEXT: vextractf64x4 $1, %zmm1, %ymm1
-; CHECK-NEXT: vmovd %edi, %xmm2
-; CHECK-NEXT: kmovw %edi, %k0
-; CHECK-NEXT: kshiftrw $3, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %eax
-; CHECK-NEXT: kshiftrw $2, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %ecx
-; CHECK-NEXT: kshiftrw $1, %k0, %k0
-; CHECK-NEXT: kmovw %k0, %edx
-; CHECK-NEXT: vpinsrb $4, %edx, %xmm2, %xmm2
-; CHECK-NEXT: vpinsrb $8, %ecx, %xmm2, %xmm2
-; CHECK-NEXT: vpinsrb $12, %eax, %xmm2, %xmm2
-; CHECK-NEXT: vpslld $31, %xmm2, %xmm2
-; CHECK-NEXT: vpmovsxdq %xmm2, %ymm2
-; CHECK-NEXT: vblendvpd %ymm2, %ymm1, %ymm0, %ymm0
+; CHECK-NEXT: kmovw %edi, %k1
+; CHECK-NEXT: vextracti64x4 $1, %zmm1, %ymm0 {%k1}
; CHECK-NEXT: retq
%res = call <4 x i64> @llvm.x86.avx512.mask.vextracti64x4.512(<8 x i64> %a, i32 1, <4 x i64> %b, i8 %mask)
ret <4 x i64> %res
@@ -3053,21 +3028,8 @@ declare <4 x i64> @llvm.x86.avx512.mask.
define <4 x i32> @test_maskz_vextracti32x4(<16 x i32> %a, i8 %mask) {
; CHECK-LABEL: test_maskz_vextracti32x4:
; CHECK: ## %bb.0:
-; CHECK-NEXT: vmovd %edi, %xmm1
-; CHECK-NEXT: kmovw %edi, %k0
-; CHECK-NEXT: kshiftrw $3, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %eax
-; CHECK-NEXT: kshiftrw $2, %k0, %k1
-; CHECK-NEXT: kmovw %k1, %ecx
-; CHECK-NEXT: kshiftrw $1, %k0, %k0
-; CHECK-NEXT: kmovw %k0, %edx
-; CHECK-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; CHECK-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; CHECK-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; CHECK-NEXT: vextracti32x4 $2, %zmm0, %xmm0
-; CHECK-NEXT: vpslld $31, %xmm1, %xmm1
-; CHECK-NEXT: vpsrad $31, %xmm1, %xmm1
-; CHECK-NEXT: vpand %xmm0, %xmm1, %xmm0
+; CHECK-NEXT: kmovw %edi, %k1
+; CHECK-NEXT: vextracti32x4 $2, %zmm0, %xmm0 {%k1} {z}
; CHECK-NEXT: retq
%res = call <4 x i32> @llvm.x86.avx512.mask.vextracti32x4.512(<16 x i32> %a, i32 2, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res
Modified: llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll Sun Jan 7 10:20:37 2018
@@ -498,11 +498,15 @@ entry:
define <4 x i32> @test4(<4 x i64> %x, <4 x i64> %y, <4 x i64> %x1, <4 x i64> %y1) {
; KNL-LABEL: test4:
; KNL: ## %bb.0:
-; KNL-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; KNL-NEXT: vpmovqd %zmm0, %ymm0
-; KNL-NEXT: vpcmpgtq %ymm3, %ymm2, %ymm1
-; KNL-NEXT: vpmovqd %zmm1, %ymm1
-; KNL-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
+; KNL-NEXT: ## kill: def %ymm3 killed %ymm3 def %zmm3
+; KNL-NEXT: ## kill: def %ymm2 killed %ymm2 def %zmm2
+; KNL-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; KNL-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; KNL-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; KNL-NEXT: kandnw %k0, %k1, %k1
+; KNL-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
@@ -517,21 +521,29 @@ define <4 x i32> @test4(<4 x i64> %x, <4
;
; AVX512BW-LABEL: test4:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0
-; AVX512BW-NEXT: vpcmpgtq %ymm3, %ymm2, %ymm1
-; AVX512BW-NEXT: vpmovqd %zmm1, %ymm1
-; AVX512BW-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
+; AVX512BW-NEXT: ## kill: def %ymm3 killed %ymm3 def %zmm3
+; AVX512BW-NEXT: ## kill: def %ymm2 killed %ymm2 def %zmm2
+; AVX512BW-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512BW-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512BW-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; AVX512BW-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; AVX512BW-NEXT: kandnw %k0, %k1, %k1
+; AVX512BW-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
; AVX512BW-NEXT: vzeroupper
; AVX512BW-NEXT: retq
;
; AVX512DQ-LABEL: test4:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; AVX512DQ-NEXT: vpmovqd %zmm0, %ymm0
-; AVX512DQ-NEXT: vpcmpgtq %ymm3, %ymm2, %ymm1
-; AVX512DQ-NEXT: vpmovqd %zmm1, %ymm1
-; AVX512DQ-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
+; AVX512DQ-NEXT: ## kill: def %ymm3 killed %ymm3 def %zmm3
+; AVX512DQ-NEXT: ## kill: def %ymm2 killed %ymm2 def %zmm2
+; AVX512DQ-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512DQ-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512DQ-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; AVX512DQ-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; AVX512DQ-NEXT: kandnw %k0, %k1, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
%x_gt_y = icmp sgt <4 x i64> %x, %y
@@ -544,9 +556,16 @@ define <4 x i32> @test4(<4 x i64> %x, <4
define <2 x i64> @test5(<2 x i64> %x, <2 x i64> %y, <2 x i64> %x1, <2 x i64> %y1) {
; KNL-LABEL: test5:
; KNL: ## %bb.0:
-; KNL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; KNL-NEXT: vpcmpgtq %xmm3, %xmm2, %xmm1
-; KNL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; KNL-NEXT: ## kill: def %xmm3 killed %xmm3 def %zmm3
+; KNL-NEXT: ## kill: def %xmm2 killed %xmm2 def %zmm2
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vpcmpgtq %zmm0, %zmm1, %k0
+; KNL-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; KNL-NEXT: kandnw %k1, %k0, %k1
+; KNL-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test5:
@@ -559,16 +578,30 @@ define <2 x i64> @test5(<2 x i64> %x, <2
;
; AVX512BW-LABEL: test5:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; AVX512BW-NEXT: vpcmpgtq %xmm3, %xmm2, %xmm1
-; AVX512BW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; AVX512BW-NEXT: ## kill: def %xmm3 killed %xmm3 def %zmm3
+; AVX512BW-NEXT: ## kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512BW-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512BW-NEXT: vpcmpgtq %zmm0, %zmm1, %k0
+; AVX512BW-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; AVX512BW-NEXT: kandnw %k1, %k0, %k1
+; AVX512BW-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512BW-NEXT: vzeroupper
; AVX512BW-NEXT: retq
;
; AVX512DQ-LABEL: test5:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; AVX512DQ-NEXT: vpcmpgtq %xmm3, %xmm2, %xmm1
-; AVX512DQ-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; AVX512DQ-NEXT: ## kill: def %xmm3 killed %xmm3 def %zmm3
+; AVX512DQ-NEXT: ## kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512DQ-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512DQ-NEXT: vpcmpgtq %zmm0, %zmm1, %k0
+; AVX512DQ-NEXT: vpcmpgtq %zmm3, %zmm2, %k1
+; AVX512DQ-NEXT: kandnw %k1, %k0, %k0
+; AVX512DQ-NEXT: vpmovm2q %k0, %zmm0
+; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
%x_gt_y = icmp slt <2 x i64> %x, %y
%x1_gt_y1 = icmp sgt <2 x i64> %x1, %y1
@@ -795,10 +828,17 @@ define <4 x i1> @test11(<4 x i1>%a, <4 x
; KNL-LABEL: test11:
; KNL: ## %bb.0:
; KNL-NEXT: cmpl %esi, %edi
-; KNL-NEXT: jg LBB20_2
-; KNL-NEXT: ## %bb.1:
-; KNL-NEXT: vmovaps %xmm1, %xmm0
-; KNL-NEXT: LBB20_2:
+; KNL-NEXT: jg LBB20_1
+; KNL-NEXT: ## %bb.2:
+; KNL-NEXT: vpslld $31, %xmm1, %xmm0
+; KNL-NEXT: jmp LBB20_3
+; KNL-NEXT: LBB20_1:
+; KNL-NEXT: vpslld $31, %xmm0, %xmm0
+; KNL-NEXT: LBB20_3:
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k1
+; KNL-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test11:
@@ -818,19 +858,33 @@ define <4 x i1> @test11(<4 x i1>%a, <4 x
; AVX512BW-LABEL: test11:
; AVX512BW: ## %bb.0:
; AVX512BW-NEXT: cmpl %esi, %edi
-; AVX512BW-NEXT: jg LBB20_2
-; AVX512BW-NEXT: ## %bb.1:
-; AVX512BW-NEXT: vmovaps %xmm1, %xmm0
-; AVX512BW-NEXT: LBB20_2:
+; AVX512BW-NEXT: jg LBB20_1
+; AVX512BW-NEXT: ## %bb.2:
+; AVX512BW-NEXT: vpslld $31, %xmm1, %xmm0
+; AVX512BW-NEXT: jmp LBB20_3
+; AVX512BW-NEXT: LBB20_1:
+; AVX512BW-NEXT: vpslld $31, %xmm0, %xmm0
+; AVX512BW-NEXT: LBB20_3:
+; AVX512BW-NEXT: vptestmd %zmm0, %zmm0, %k1
+; AVX512BW-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512BW-NEXT: vzeroupper
; AVX512BW-NEXT: retq
;
; AVX512DQ-LABEL: test11:
; AVX512DQ: ## %bb.0:
; AVX512DQ-NEXT: cmpl %esi, %edi
-; AVX512DQ-NEXT: jg LBB20_2
-; AVX512DQ-NEXT: ## %bb.1:
-; AVX512DQ-NEXT: vmovaps %xmm1, %xmm0
-; AVX512DQ-NEXT: LBB20_2:
+; AVX512DQ-NEXT: jg LBB20_1
+; AVX512DQ-NEXT: ## %bb.2:
+; AVX512DQ-NEXT: vpslld $31, %xmm1, %xmm0
+; AVX512DQ-NEXT: jmp LBB20_3
+; AVX512DQ-NEXT: LBB20_1:
+; AVX512DQ-NEXT: vpslld $31, %xmm0, %xmm0
+; AVX512DQ-NEXT: LBB20_3:
+; AVX512DQ-NEXT: vptestmd %zmm0, %zmm0, %k0
+; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
+; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
%mask = icmp sgt i32 %a1, %b1
%c = select i1 %mask, <4 x i1>%a, <4 x i1>%b
@@ -1271,8 +1325,7 @@ define <32 x i16> @test21(<32 x i16> %x
define void @test22(<4 x i1> %a, <4 x i1>* %addr) {
; KNL-LABEL: test22:
; KNL: ## %bb.0:
-; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL-NEXT: vpslld $31, %ymm0, %ymm0
+; KNL-NEXT: vpslld $31, %xmm0, %xmm0
; KNL-NEXT: vptestmd %zmm0, %zmm0, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: movb %al, (%rdi)
@@ -1288,8 +1341,7 @@ define void @test22(<4 x i1> %a, <4 x i1
;
; AVX512BW-LABEL: test22:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 def %ymm0
-; AVX512BW-NEXT: vpslld $31, %ymm0, %ymm0
+; AVX512BW-NEXT: vpslld $31, %xmm0, %xmm0
; AVX512BW-NEXT: vptestmd %zmm0, %zmm0, %k0
; AVX512BW-NEXT: kmovd %k0, %eax
; AVX512BW-NEXT: movb %al, (%rdi)
@@ -1298,8 +1350,7 @@ define void @test22(<4 x i1> %a, <4 x i1
;
; AVX512DQ-LABEL: test22:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 def %ymm0
-; AVX512DQ-NEXT: vpslld $31, %ymm0, %ymm0
+; AVX512DQ-NEXT: vpslld $31, %xmm0, %xmm0
; AVX512DQ-NEXT: vptestmd %zmm0, %zmm0, %k0
; AVX512DQ-NEXT: kmovb %k0, (%rdi)
; AVX512DQ-NEXT: vzeroupper
@@ -1311,8 +1362,7 @@ define void @test22(<4 x i1> %a, <4 x i1
define void @test23(<2 x i1> %a, <2 x i1>* %addr) {
; KNL-LABEL: test23:
; KNL: ## %bb.0:
-; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
-; KNL-NEXT: vpsllq $63, %zmm0, %zmm0
+; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
; KNL-NEXT: vptestmq %zmm0, %zmm0, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: movb %al, (%rdi)
@@ -1328,8 +1378,7 @@ define void @test23(<2 x i1> %a, <2 x i1
;
; AVX512BW-LABEL: test23:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
-; AVX512BW-NEXT: vpsllq $63, %zmm0, %zmm0
+; AVX512BW-NEXT: vpsllq $63, %xmm0, %xmm0
; AVX512BW-NEXT: vptestmq %zmm0, %zmm0, %k0
; AVX512BW-NEXT: kmovd %k0, %eax
; AVX512BW-NEXT: movb %al, (%rdi)
@@ -1338,8 +1387,7 @@ define void @test23(<2 x i1> %a, <2 x i1
;
; AVX512DQ-LABEL: test23:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
-; AVX512DQ-NEXT: vpsllq $63, %zmm0, %zmm0
+; AVX512DQ-NEXT: vpsllq $63, %xmm0, %xmm0
; AVX512DQ-NEXT: vptestmq %zmm0, %zmm0, %k0
; AVX512DQ-NEXT: kmovb %k0, (%rdi)
; AVX512DQ-NEXT: vzeroupper
@@ -1390,10 +1438,9 @@ define void @store_v1i1(<1 x i1> %c , <1
define void @store_v2i1(<2 x i1> %c , <2 x i1>* %ptr) {
; KNL-LABEL: store_v2i1:
; KNL: ## %bb.0:
-; KNL-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vpsllq $63, %zmm0, %zmm0
+; KNL-NEXT: vpsllq $63, %xmm0, %xmm0
; KNL-NEXT: vptestmq %zmm0, %zmm0, %k0
+; KNL-NEXT: knotw %k0, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: movb %al, (%rdi)
; KNL-NEXT: vzeroupper
@@ -1409,10 +1456,9 @@ define void @store_v2i1(<2 x i1> %c , <2
;
; AVX512BW-LABEL: store_v2i1:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; AVX512BW-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; AVX512BW-NEXT: vpsllq $63, %zmm0, %zmm0
+; AVX512BW-NEXT: vpsllq $63, %xmm0, %xmm0
; AVX512BW-NEXT: vptestmq %zmm0, %zmm0, %k0
+; AVX512BW-NEXT: knotw %k0, %k0
; AVX512BW-NEXT: kmovd %k0, %eax
; AVX512BW-NEXT: movb %al, (%rdi)
; AVX512BW-NEXT: vzeroupper
@@ -1420,10 +1466,9 @@ define void @store_v2i1(<2 x i1> %c , <2
;
; AVX512DQ-LABEL: store_v2i1:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; AVX512DQ-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; AVX512DQ-NEXT: vpsllq $63, %zmm0, %zmm0
+; AVX512DQ-NEXT: vpsllq $63, %xmm0, %xmm0
; AVX512DQ-NEXT: vptestmq %zmm0, %zmm0, %k0
+; AVX512DQ-NEXT: knotw %k0, %k0
; AVX512DQ-NEXT: kmovb %k0, (%rdi)
; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
@@ -1435,10 +1480,9 @@ define void @store_v2i1(<2 x i1> %c , <2
define void @store_v4i1(<4 x i1> %c , <4 x i1>* %ptr) {
; KNL-LABEL: store_v4i1:
; KNL: ## %bb.0:
-; KNL-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; KNL-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; KNL-NEXT: vpslld $31, %ymm0, %ymm0
+; KNL-NEXT: vpslld $31, %xmm0, %xmm0
; KNL-NEXT: vptestmd %zmm0, %zmm0, %k0
+; KNL-NEXT: knotw %k0, %k0
; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: movb %al, (%rdi)
; KNL-NEXT: vzeroupper
@@ -1454,10 +1498,9 @@ define void @store_v4i1(<4 x i1> %c , <4
;
; AVX512BW-LABEL: store_v4i1:
; AVX512BW: ## %bb.0:
-; AVX512BW-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; AVX512BW-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; AVX512BW-NEXT: vpslld $31, %ymm0, %ymm0
+; AVX512BW-NEXT: vpslld $31, %xmm0, %xmm0
; AVX512BW-NEXT: vptestmd %zmm0, %zmm0, %k0
+; AVX512BW-NEXT: knotw %k0, %k0
; AVX512BW-NEXT: kmovd %k0, %eax
; AVX512BW-NEXT: movb %al, (%rdi)
; AVX512BW-NEXT: vzeroupper
@@ -1465,10 +1508,9 @@ define void @store_v4i1(<4 x i1> %c , <4
;
; AVX512DQ-LABEL: store_v4i1:
; AVX512DQ: ## %bb.0:
-; AVX512DQ-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; AVX512DQ-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; AVX512DQ-NEXT: vpslld $31, %ymm0, %ymm0
+; AVX512DQ-NEXT: vpslld $31, %xmm0, %xmm0
; AVX512DQ-NEXT: vptestmd %zmm0, %zmm0, %k0
+; AVX512DQ-NEXT: knotw %k0, %k0
; AVX512DQ-NEXT: kmovb %k0, (%rdi)
; AVX512DQ-NEXT: vzeroupper
; AVX512DQ-NEXT: retq
Modified: llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll Sun Jan 7 10:20:37 2018
@@ -72,9 +72,13 @@ define <8 x i64> @test6_unsigned(<8 x i6
define <4 x float> @test7(<4 x float> %a, <4 x float> %b) {
; KNL-LABEL: test7:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vxorps %xmm2, %xmm2, %xmm2
-; KNL-NEXT: vcmpltps %xmm2, %xmm0, %xmm2
-; KNL-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: vcmpltps %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmps %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test7:
@@ -92,9 +96,13 @@ define <4 x float> @test7(<4 x float> %a
define <2 x double> @test8(<2 x double> %a, <2 x double> %b) {
; KNL-LABEL: test8:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vxorpd %xmm2, %xmm2, %xmm2
-; KNL-NEXT: vcmpltpd %xmm2, %xmm0, %xmm2
-; KNL-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: vcmpltpd %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test8:
@@ -537,8 +545,11 @@ define <16 x i8>@test29(<16 x i32> %x, <
define <4 x double> @test30(<4 x double> %x, <4 x double> %y) nounwind {
; KNL-LABEL: test30:
; KNL: ## %bb.0:
-; KNL-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm2
-; KNL-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; KNL-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; KNL-NEXT: vcmpeqpd %zmm1, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: test30:
@@ -555,8 +566,13 @@ define <4 x double> @test30(<4 x double>
define <2 x double> @test31(<2 x double> %x, <2 x double> %x1, <2 x double>* %yp) nounwind {
; KNL-LABEL: test31:
; KNL: ## %bb.0:
-; KNL-NEXT: vcmpltpd (%rdi), %xmm0, %xmm2
-; KNL-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vmovupd (%rdi), %xmm2
+; KNL-NEXT: vcmpltpd %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test31:
@@ -574,8 +590,12 @@ define <2 x double> @test31(<2 x double>
define <4 x double> @test32(<4 x double> %x, <4 x double> %x1, <4 x double>* %yp) nounwind {
; KNL-LABEL: test32:
; KNL: ## %bb.0:
-; KNL-NEXT: vcmpltpd (%rdi), %ymm0, %ymm2
-; KNL-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; KNL-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; KNL-NEXT: vmovupd (%rdi), %ymm2
+; KNL-NEXT: vcmpltpd %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: test32:
@@ -605,8 +625,13 @@ define <8 x double> @test33(<8 x double>
define <4 x float> @test34(<4 x float> %x, <4 x float> %x1, <4 x float>* %yp) nounwind {
; KNL-LABEL: test34:
; KNL: ## %bb.0:
-; KNL-NEXT: vcmpltps (%rdi), %xmm0, %xmm2
-; KNL-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL-NEXT: vmovups (%rdi), %xmm2
+; KNL-NEXT: vcmpltps %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmps %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test34:
@@ -674,9 +699,12 @@ define <8 x double> @test37(<8 x double>
define <4 x double> @test38(<4 x double> %x, <4 x double> %x1, double* %ptr) nounwind {
; KNL-LABEL: test38:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
; KNL-NEXT: vbroadcastsd (%rdi), %ymm2
-; KNL-NEXT: vcmpltpd %ymm2, %ymm0, %ymm2
-; KNL-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; KNL-NEXT: vcmpltpd %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; KNL-NEXT: retq
;
; SKX-LABEL: test38:
@@ -697,9 +725,13 @@ define <4 x double> @test38(<4 x double>
define <2 x double> @test39(<2 x double> %x, <2 x double> %x1, double* %ptr) nounwind {
; KNL-LABEL: test39:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; KNL-NEXT: vcmpltpd %xmm2, %xmm0, %xmm2
-; KNL-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: vcmpltpd %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test39:
@@ -763,9 +795,13 @@ define <8 x float> @test41(<8 x float>
define <4 x float> @test42(<4 x float> %x, <4 x float> %x1, float* %ptr) nounwind {
; KNL-LABEL: test42:
; KNL: ## %bb.0:
+; KNL-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vbroadcastss (%rdi), %xmm2
-; KNL-NEXT: vcmpltps %xmm2, %xmm0, %xmm2
-; KNL-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; KNL-NEXT: vcmpltps %zmm2, %zmm0, %k1
+; KNL-NEXT: vblendmps %zmm0, %zmm1, %zmm0 {%k1}
+; KNL-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test42:
Modified: llvm/trunk/test/CodeGen/X86/avx512dq-intrinsics-upgrade.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512dq-intrinsics-upgrade.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512dq-intrinsics-upgrade.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512dq-intrinsics-upgrade.ll Sun Jan 7 10:20:37 2018
@@ -6,18 +6,12 @@ declare <2 x double> @llvm.x86.avx512.ma
define <2 x double>@test_int_x86_avx512_mask_vextractf64x2_512(<8 x double> %x0, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vextractf64x2_512:
; CHECK: ## %bb.0:
-; CHECK-NEXT: vextractf128 $1, %ymm0, %xmm0
-; CHECK-NEXT: vmovd %edi, %xmm2
-; CHECK-NEXT: kmovw %edi, %k0
-; CHECK-NEXT: kshiftrb $1, %k0, %k0
-; CHECK-NEXT: kmovw %k0, %eax
-; CHECK-NEXT: vpinsrb $8, %eax, %xmm2, %xmm2
-; CHECK-NEXT: vpsllq $63, %xmm2, %xmm2
-; CHECK-NEXT: vpsraq $63, %zmm2, %zmm2
-; CHECK-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm1
-; CHECK-NEXT: vandpd %xmm0, %xmm2, %xmm2
-; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0
-; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0
+; CHECK-NEXT: vextractf128 $1, %ymm0, %xmm2
+; CHECK-NEXT: kmovw %edi, %k1
+; CHECK-NEXT: vextractf64x2 $1, %zmm0, %xmm1 {%k1}
+; CHECK-NEXT: vextractf64x2 $1, %zmm0, %xmm0 {%k1} {z}
+; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm1
+; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq
%res = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.512(<8 x double> %x0,i32 1, <2 x double> %x2, i8 %x3)
%res2 = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.512(<8 x double> %x0,i32 1, <2 x double> zeroinitializer, i8 %x3)
Modified: llvm/trunk/test/CodeGen/X86/avx512vl-vec-cmp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512vl-vec-cmp.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512vl-vec-cmp.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512vl-vec-cmp.ll Sun Jan 7 10:20:37 2018
@@ -11,8 +11,11 @@ define <4 x i64> @test256_1(<4 x i64> %x
;
; NoVLX-LABEL: test256_1:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm2
-; NoVLX-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp eq <4 x i64> %x, %y
%max = select <4 x i1> %mask, <4 x i64> %x, <4 x i64> %y
@@ -28,8 +31,12 @@ define <4 x i64> @test256_2(<4 x i64> %x
;
; NoVLX-LABEL: test256_2:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vblendvpd %ymm0, %ymm2, %ymm1, %ymm0
+; NoVLX-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm2, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp sgt <4 x i64> %x, %y
%max = select <4 x i1> %mask, <4 x i64> %x1, <4 x i64> %y
@@ -66,11 +73,12 @@ define <4 x i64> @test256_4(<4 x i64> %x
;
; NoVLX-LABEL: test256_4:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm3, %ymm1, %ymm4
-; NoVLX-NEXT: vpxor %ymm3, %ymm0, %ymm0
-; NoVLX-NEXT: vpcmpgtq %ymm4, %ymm0, %ymm0
-; NoVLX-NEXT: vblendvpd %ymm0, %ymm2, %ymm1, %ymm0
+; NoVLX-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpnleuq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm2, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp ugt <4 x i64> %x, %y
%max = select <4 x i1> %mask, <4 x i64> %x1, <4 x i64> %y
@@ -289,12 +297,14 @@ define <4 x i64> @test256_10(<4 x i64> %
;
; NoVLX-LABEL: test256_10:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm3
-; NoVLX-NEXT: vpcmpeqd %ymm4, %ymm4, %ymm4
-; NoVLX-NEXT: vpxor %ymm4, %ymm3, %ymm3
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1
-; NoVLX-NEXT: vpandn %ymm3, %ymm1, %ymm1
-; NoVLX-NEXT: vblendvpd %ymm1, %ymm0, %ymm2, %ymm0
+; NoVLX-NEXT: # kill: def %ymm3 killed %ymm3 def %zmm3
+; NoVLX-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpcmpleq %zmm2, %zmm3, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm2, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <4 x i64> %x1, %y1
%mask0 = icmp sle <4 x i64> %x, %y
@@ -313,10 +323,14 @@ define <4 x i64> @test256_11(<4 x i64> %
;
; NoVLX-LABEL: test256_11:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq (%rdi), %ymm0, %ymm3
-; NoVLX-NEXT: vpcmpgtq %ymm2, %ymm1, %ymm2
-; NoVLX-NEXT: vpand %ymm2, %ymm3, %ymm2
-; NoVLX-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; NoVLX-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %ymm3
+; NoVLX-NEXT: vpcmpgtq %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm2, %zmm1, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sgt <4 x i64> %x1, %y1
%y = load <4 x i64>, <4 x i64>* %y.ptr, align 4
@@ -362,9 +376,12 @@ define <4 x i64> @test256_13(<4 x i64> %
;
; NoVLX-LABEL: test256_13:
; NoVLX: # %bb.0:
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm2
-; NoVLX-NEXT: vpcmpeqq %ymm2, %ymm0, %ymm2
-; NoVLX-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; NoVLX-NEXT: vpcmpeqq %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%yb = load i64, i64* %yb.ptr, align 4
%y.0 = insertelement <4 x i64> undef, i64 %yb, i32 0
@@ -437,11 +454,14 @@ define <4 x i64> @test256_16(<4 x i64> %
;
; NoVLX-LABEL: test256_16:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm2
+; NoVLX-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm3
-; NoVLX-NEXT: vpcmpgtq %ymm3, %ymm0, %ymm3
-; NoVLX-NEXT: vpandn %ymm3, %ymm2, %ymm2
-; NoVLX-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; NoVLX-NEXT: vpcmpgtq %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpleq %zmm1, %zmm2, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <4 x i64> %x1, %y1
%yb = load i64, i64* %yb.ptr, align 4
@@ -550,8 +570,11 @@ define <2 x i64> @test128_1(<2 x i64> %x
;
; NoVLX-LABEL: test128_1:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm2
-; NoVLX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp eq <2 x i64> %x, %y
%max = select <2 x i1> %mask, <2 x i64> %x, <2 x i64> %y
@@ -567,8 +590,12 @@ define <2 x i64> @test128_2(<2 x i64> %x
;
; NoVLX-LABEL: test128_2:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vblendvpd %xmm0, %xmm2, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm2, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp sgt <2 x i64> %x, %y
%max = select <2 x i1> %mask, <2 x i64> %x1, <2 x i64> %y
@@ -584,10 +611,12 @@ define <4 x i32> @test128_3(<4 x i32> %x
;
; NoVLX-LABEL: test128_3:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm0, %xmm0
-; NoVLX-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k1
+; NoVLX-NEXT: vpblendmd %zmm2, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp sge <4 x i32> %x, %y
%max = select <4 x i1> %mask, <4 x i32> %x1, <4 x i32> %y
@@ -603,11 +632,12 @@ define <2 x i64> @test128_4(<2 x i64> %x
;
; NoVLX-LABEL: test128_4:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm3, %xmm1, %xmm4
-; NoVLX-NEXT: vpxor %xmm3, %xmm0, %xmm0
-; NoVLX-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm0
-; NoVLX-NEXT: vblendvpd %xmm0, %xmm2, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpnleuq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm2, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask = icmp ugt <2 x i64> %x, %y
%max = select <2 x i1> %mask, <2 x i64> %x1, <2 x i64> %y
@@ -623,8 +653,12 @@ define <4 x i32> @test128_5(<4 x i32> %x
;
; NoVLX-LABEL: test128_5:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpeqd %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %yp, align 4
%mask = icmp eq <4 x i32> %x, %y
@@ -641,8 +675,12 @@ define <4 x i32> @test128_5b(<4 x i32> %
;
; NoVLX-LABEL: test128_5b:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpeqd %zmm0, %zmm2, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %yp, align 4
%mask = icmp eq <4 x i32> %y, %x
@@ -659,8 +697,12 @@ define <4 x i32> @test128_6(<4 x i32> %x
;
; NoVLX-LABEL: test128_6:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpgtd %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp sgt <4 x i32> %x, %y
@@ -677,8 +719,12 @@ define <4 x i32> @test128_6b(<4 x i32> %
;
; NoVLX-LABEL: test128_6b:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpgtd %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp slt <4 x i32> %y, %x
@@ -695,10 +741,12 @@ define <4 x i32> @test128_7(<4 x i32> %x
;
; NoVLX-LABEL: test128_7:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpled %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp sle <4 x i32> %x, %y
@@ -715,10 +763,12 @@ define <4 x i32> @test128_7b(<4 x i32> %
;
; NoVLX-LABEL: test128_7b:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpled %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp sge <4 x i32> %y, %x
@@ -735,9 +785,12 @@ define <4 x i32> @test128_8(<4 x i32> %x
;
; NoVLX-LABEL: test128_8:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpminud (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpleud %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp ule <4 x i32> %x, %y
@@ -754,10 +807,12 @@ define <4 x i32> @test128_8b(<4 x i32> %
;
; NoVLX-LABEL: test128_8b:
; NoVLX: # %bb.0:
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
-; NoVLX-NEXT: vpmaxud %xmm0, %xmm2, %xmm3
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpnltud %zmm0, %zmm2, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp uge <4 x i32> %y, %x
@@ -775,10 +830,14 @@ define <4 x i32> @test128_9(<4 x i32> %x
;
; NoVLX-LABEL: test128_9:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm3
-; NoVLX-NEXT: vpand %xmm2, %xmm3, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm3 killed %xmm3 def %zmm3
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm3, %zmm2, %k1 {%k1}
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp eq <4 x i32> %x1, %y1
%mask0 = icmp eq <4 x i32> %x, %y
@@ -797,12 +856,14 @@ define <2 x i64> @test128_10(<2 x i64> %
;
; NoVLX-LABEL: test128_10:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm3
-; NoVLX-NEXT: vpcmpeqd %xmm4, %xmm4, %xmm4
-; NoVLX-NEXT: vpxor %xmm4, %xmm3, %xmm3
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1
-; NoVLX-NEXT: vpandn %xmm3, %xmm1, %xmm1
-; NoVLX-NEXT: vblendvpd %xmm1, %xmm0, %xmm2, %xmm0
+; NoVLX-NEXT: # kill: def %xmm3 killed %xmm3 def %zmm3
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm1, %zmm0, %k1
+; NoVLX-NEXT: vpcmpleq %zmm2, %zmm3, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm2, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <2 x i64> %x1, %y1
%mask0 = icmp sle <2 x i64> %x, %y
@@ -821,10 +882,14 @@ define <2 x i64> @test128_11(<2 x i64> %
;
; NoVLX-LABEL: test128_11:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm3
-; NoVLX-NEXT: vpcmpgtq %xmm2, %xmm1, %xmm2
-; NoVLX-NEXT: vpand %xmm2, %xmm3, %xmm2
-; NoVLX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm3
+; NoVLX-NEXT: vpcmpgtq %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm2, %zmm1, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sgt <2 x i64> %x1, %y1
%y = load <2 x i64>, <2 x i64>* %y.ptr, align 4
@@ -844,11 +909,14 @@ define <4 x i32> @test128_12(<4 x i32> %
;
; NoVLX-LABEL: test128_12:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm2, %xmm2
-; NoVLX-NEXT: vpminud (%rdi), %xmm0, %xmm3
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm0, %xmm3
-; NoVLX-NEXT: vpandn %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm3
+; NoVLX-NEXT: vpcmpleud %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpled %zmm1, %zmm2, %k1 {%k1}
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <4 x i32> %x1, %y1
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
@@ -867,9 +935,12 @@ define <2 x i64> @test128_13(<2 x i64> %
;
; NoVLX-LABEL: test128_13:
; NoVLX: # %bb.0:
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm2
-; NoVLX-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm2
-; NoVLX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpeqq %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%yb = load i64, i64* %yb.ptr, align 4
%y.0 = insertelement <2 x i64> undef, i64 %yb, i32 0
@@ -888,11 +959,12 @@ define <4 x i32> @test128_14(<4 x i32> %
;
; NoVLX-LABEL: test128_14:
; NoVLX: # %bb.0:
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm2
-; NoVLX-NEXT: vpcmpgtd %xmm2, %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpled %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%yb = load i32, i32* %yb.ptr, align 4
%y.0 = insertelement <4 x i32> undef, i32 %yb, i32 0
@@ -912,11 +984,14 @@ define <4 x i32> @test128_15(<4 x i32> %
;
; NoVLX-LABEL: test128_15:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm2, %xmm2
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm3
-; NoVLX-NEXT: vpcmpgtd %xmm3, %xmm0, %xmm3
-; NoVLX-NEXT: vpandn %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpgtd %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpled %zmm1, %zmm2, %k1 {%k1}
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <4 x i32> %x1, %y1
%yb = load i32, i32* %yb.ptr, align 4
@@ -938,11 +1013,14 @@ define <2 x i64> @test128_16(<2 x i64> %
;
; NoVLX-LABEL: test128_16:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm2
+; NoVLX-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm3
-; NoVLX-NEXT: vpcmpgtq %xmm3, %xmm0, %xmm3
-; NoVLX-NEXT: vpandn %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpgtq %zmm3, %zmm0, %k1
+; NoVLX-NEXT: vpcmpleq %zmm1, %zmm2, %k1 {%k1}
+; NoVLX-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%mask1 = icmp sge <2 x i64> %x1, %y1
%yb = load i64, i64* %yb.ptr, align 4
@@ -963,10 +1041,12 @@ define <4 x i32> @test128_17(<4 x i32> %
;
; NoVLX-LABEL: test128_17:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpneqd %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp ne <4 x i32> %x, %y
@@ -983,10 +1063,12 @@ define <4 x i32> @test128_18(<4 x i32> %
;
; NoVLX-LABEL: test128_18:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm3, %xmm3
-; NoVLX-NEXT: vpxor %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpneqd %zmm0, %zmm2, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp ne <4 x i32> %y, %x
@@ -1003,9 +1085,12 @@ define <4 x i32> @test128_19(<4 x i32> %
;
; NoVLX-LABEL: test128_19:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpmaxud (%rdi), %xmm0, %xmm2
-; NoVLX-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
+; NoVLX-NEXT: vpcmpnltud %zmm2, %zmm0, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp uge <4 x i32> %x, %y
@@ -1022,10 +1107,12 @@ define <4 x i32> @test128_20(<4 x i32> %
;
; NoVLX-LABEL: test128_20:
; NoVLX: # %bb.0:
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqu (%rdi), %xmm2
-; NoVLX-NEXT: vpmaxud %xmm0, %xmm2, %xmm3
-; NoVLX-NEXT: vpcmpeqd %xmm3, %xmm2, %xmm2
-; NoVLX-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpnltud %zmm0, %zmm2, %k1
+; NoVLX-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; NoVLX-NEXT: retq
%y = load <4 x i32>, <4 x i32>* %y.ptr, align 4
%mask = icmp uge <4 x i32> %y, %x
Modified: llvm/trunk/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll Sun Jan 7 10:20:37 2018
@@ -2329,37 +2329,14 @@ define zeroext i8 @test_vpcmpeqd_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2380,37 +2357,14 @@ define zeroext i8 @test_vpcmpeqd_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2433,50 +2387,15 @@ define zeroext i8 @test_masked_vpcmpeqd_
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2501,50 +2420,15 @@ define zeroext i8 @test_masked_vpcmpeqd_
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2570,38 +2454,14 @@ define zeroext i8 @test_vpcmpeqd_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2625,51 +2485,15 @@ define zeroext i8 @test_masked_vpcmpeqd_
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2696,37 +2520,14 @@ define zeroext i16 @test_vpcmpeqd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2747,37 +2548,14 @@ define zeroext i16 @test_vpcmpeqd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2800,50 +2578,15 @@ define zeroext i16 @test_masked_vpcmpeqd
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2868,50 +2611,15 @@ define zeroext i16 @test_masked_vpcmpeqd
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2937,38 +2645,14 @@ define zeroext i16 @test_vpcmpeqd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -2992,51 +2676,15 @@ define zeroext i16 @test_masked_vpcmpeqd
;
; NoVLX-LABEL: test_masked_vpcmpeqd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -3069,13 +2717,25 @@ define zeroext i32 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -3108,13 +2768,25 @@ define zeroext i32 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -3149,8 +2821,10 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3158,16 +2832,13 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -3206,8 +2877,10 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3215,16 +2888,13 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -3264,14 +2934,25 @@ define zeroext i32 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -3307,9 +2988,10 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3317,16 +2999,13 @@ define zeroext i32 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -3367,16 +3046,27 @@ define zeroext i64 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -3412,16 +3102,27 @@ define zeroext i64 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqd (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -3459,8 +3160,10 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3468,19 +3171,15 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -3522,8 +3221,10 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3531,19 +3232,15 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -3586,17 +3283,27 @@ define zeroext i64 @test_vpcmpeqd_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -3635,9 +3342,10 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -3645,19 +3353,15 @@ define zeroext i64 @test_masked_vpcmpeqd
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -5264,13 +4968,14 @@ define zeroext i4 @test_vpcmpeqq_v2i1_v4
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5291,13 +4996,14 @@ define zeroext i4 @test_vpcmpeqq_v2i1_v4
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5320,20 +5026,15 @@ define zeroext i4 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5358,20 +5059,15 @@ define zeroext i4 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5397,14 +5093,14 @@ define zeroext i4 @test_vpcmpeqq_v2i1_v4
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5428,21 +5124,15 @@ define zeroext i4 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -5470,23 +5160,14 @@ define zeroext i8 @test_vpcmpeqq_v2i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5507,23 +5188,14 @@ define zeroext i8 @test_vpcmpeqq_v2i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5546,30 +5218,15 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5594,30 +5251,15 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5643,24 +5285,14 @@ define zeroext i8 @test_vpcmpeqq_v2i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5684,31 +5316,15 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5735,23 +5351,14 @@ define zeroext i16 @test_vpcmpeqq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5772,23 +5379,14 @@ define zeroext i16 @test_vpcmpeqq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5811,30 +5409,15 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5859,30 +5442,15 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5908,24 +5476,14 @@ define zeroext i16 @test_vpcmpeqq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -5949,31 +5507,15 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -6006,13 +5548,19 @@ define zeroext i32 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -6045,13 +5593,19 @@ define zeroext i32 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -6086,19 +5640,18 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6137,19 +5690,18 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6189,14 +5741,19 @@ define zeroext i32 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -6232,20 +5789,18 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6286,16 +5841,21 @@ define zeroext i64 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -6331,16 +5891,21 @@ define zeroext i64 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -6378,22 +5943,20 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6435,22 +5998,20 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6493,17 +6054,21 @@ define zeroext i64 @test_vpcmpeqq_v2i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -6542,23 +6107,20 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -6597,36 +6159,11 @@ define zeroext i8 @test_vpcmpeqq_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6651,36 +6188,11 @@ define zeroext i8 @test_vpcmpeqq_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6707,49 +6219,12 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6778,49 +6253,12 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6850,37 +6288,11 @@ define zeroext i8 @test_vpcmpeqq_v4i1_v8
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6908,50 +6320,12 @@ define zeroext i8 @test_masked_vpcmpeqq_
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -6982,36 +6356,11 @@ define zeroext i16 @test_vpcmpeqq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7036,36 +6385,11 @@ define zeroext i16 @test_vpcmpeqq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7092,49 +6416,12 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7163,49 +6450,12 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpeqq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7235,37 +6485,11 @@ define zeroext i16 @test_vpcmpeqq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpeqq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7293,50 +6517,12 @@ define zeroext i16 @test_masked_vpcmpeqq
;
; NoVLX-LABEL: test_masked_vpcmpeqq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -7373,13 +6559,23 @@ define zeroext i32 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7415,14 +6611,24 @@ define zeroext i32 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
-; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
+; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
+; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
@@ -7459,9 +6665,10 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7469,16 +6676,13 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7518,9 +6722,10 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7528,16 +6733,13 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7578,14 +6780,23 @@ define zeroext i32 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7624,10 +6835,10 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7635,16 +6846,13 @@ define zeroext i32 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7686,16 +6894,25 @@ define zeroext i64 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7734,16 +6951,25 @@ define zeroext i64 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7784,9 +7010,10 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7794,19 +7021,15 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7849,9 +7072,10 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpeqq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7859,19 +7083,15 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7915,17 +7135,25 @@ define zeroext i64 @test_vpcmpeqq_v4i1_v
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -7967,10 +7195,10 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpeqq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpeqq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -7978,19 +7206,15 @@ define zeroext i64 @test_masked_vpcmpeqq
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -11329,37 +10553,14 @@ define zeroext i8 @test_vpcmpsgtd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11380,37 +10581,14 @@ define zeroext i8 @test_vpcmpsgtd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11433,50 +10611,15 @@ define zeroext i8 @test_masked_vpcmpsgtd
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11501,50 +10644,15 @@ define zeroext i8 @test_masked_vpcmpsgtd
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11570,38 +10678,14 @@ define zeroext i8 @test_vpcmpsgtd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11625,51 +10709,15 @@ define zeroext i8 @test_masked_vpcmpsgtd
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11696,37 +10744,14 @@ define zeroext i16 @test_vpcmpsgtd_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11747,37 +10772,14 @@ define zeroext i16 @test_vpcmpsgtd_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11800,50 +10802,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11868,50 +10835,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11937,38 +10869,14 @@ define zeroext i16 @test_vpcmpsgtd_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -11992,51 +10900,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -12069,13 +10941,25 @@ define zeroext i32 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -12108,13 +10992,25 @@ define zeroext i32 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -12149,8 +11045,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12158,16 +11056,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -12206,8 +11101,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12215,16 +11112,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -12264,14 +11158,25 @@ define zeroext i32 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -12307,9 +11212,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12317,16 +11223,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -12367,16 +11270,27 @@ define zeroext i64 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -12412,16 +11326,27 @@ define zeroext i64 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -12459,8 +11384,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12468,19 +11395,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -12522,8 +11445,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12531,19 +11456,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -12586,17 +11507,27 @@ define zeroext i64 @test_vpcmpsgtd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -12635,9 +11566,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtd %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -12645,19 +11577,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -14264,13 +13192,14 @@ define zeroext i4 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14291,13 +13220,14 @@ define zeroext i4 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14320,20 +13250,15 @@ define zeroext i4 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14358,20 +13283,15 @@ define zeroext i4 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14397,14 +13317,14 @@ define zeroext i4 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14428,21 +13348,15 @@ define zeroext i4 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -14470,23 +13384,14 @@ define zeroext i8 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14507,23 +13412,14 @@ define zeroext i8 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14546,30 +13442,15 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14594,30 +13475,15 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14643,24 +13509,14 @@ define zeroext i8 @test_vpcmpsgtq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14684,31 +13540,15 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14735,23 +13575,14 @@ define zeroext i16 @test_vpcmpsgtq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14772,23 +13603,14 @@ define zeroext i16 @test_vpcmpsgtq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14811,30 +13633,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14859,30 +13666,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14908,24 +13700,14 @@ define zeroext i16 @test_vpcmpsgtq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -14949,31 +13731,15 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -15006,13 +13772,19 @@ define zeroext i32 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -15045,13 +13817,19 @@ define zeroext i32 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -15086,19 +13864,18 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15137,19 +13914,18 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15189,14 +13965,19 @@ define zeroext i32 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -15232,20 +14013,18 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15286,16 +14065,21 @@ define zeroext i64 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -15331,16 +14115,21 @@ define zeroext i64 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -15378,22 +14167,20 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15435,22 +14222,20 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15493,17 +14278,21 @@ define zeroext i64 @test_vpcmpsgtq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -15542,23 +14331,20 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -15597,36 +14383,11 @@ define zeroext i8 @test_vpcmpsgtq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15651,36 +14412,11 @@ define zeroext i8 @test_vpcmpsgtq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15707,49 +14443,12 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15778,49 +14477,12 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15850,37 +14512,11 @@ define zeroext i8 @test_vpcmpsgtq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15908,50 +14544,12 @@ define zeroext i8 @test_masked_vpcmpsgtq
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -15982,36 +14580,11 @@ define zeroext i16 @test_vpcmpsgtq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16036,36 +14609,11 @@ define zeroext i16 @test_vpcmpsgtq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16092,49 +14640,12 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16163,49 +14674,12 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16235,37 +14709,11 @@ define zeroext i16 @test_vpcmpsgtq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgtq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16293,50 +14741,12 @@ define zeroext i16 @test_masked_vpcmpsgt
;
; NoVLX-LABEL: test_masked_vpcmpsgtq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -16373,13 +14783,23 @@ define zeroext i32 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16415,13 +14835,23 @@ define zeroext i32 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16459,9 +14889,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16469,16 +14900,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16518,9 +14946,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16528,16 +14957,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16578,14 +15004,23 @@ define zeroext i32 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16624,10 +15059,10 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16635,16 +15070,13 @@ define zeroext i32 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16686,16 +15118,25 @@ define zeroext i64 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16734,16 +15175,25 @@ define zeroext i64 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16784,9 +15234,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16794,19 +15245,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16849,9 +15296,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16859,19 +15307,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16915,17 +15359,25 @@ define zeroext i64 @test_vpcmpsgtq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -16967,10 +15419,10 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpgtq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -16978,19 +15430,15 @@ define zeroext i64 @test_masked_vpcmpsgt
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -20461,39 +18909,14 @@ define zeroext i8 @test_vpcmpsged_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20514,40 +18937,14 @@ define zeroext i8 @test_vpcmpsged_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20570,50 +18967,15 @@ define zeroext i8 @test_masked_vpcmpsged
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20638,51 +19000,15 @@ define zeroext i8 @test_masked_vpcmpsged
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20708,40 +19034,14 @@ define zeroext i8 @test_vpcmpsged_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20765,51 +19065,15 @@ define zeroext i8 @test_masked_vpcmpsged
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20836,39 +19100,14 @@ define zeroext i16 @test_vpcmpsged_v4i1_
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20889,40 +19128,14 @@ define zeroext i16 @test_vpcmpsged_v4i1_
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -20945,50 +19158,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -21013,51 +19191,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -21083,40 +19225,14 @@ define zeroext i16 @test_vpcmpsged_v4i1_
;
; NoVLX-LABEL: test_vpcmpsged_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -21140,51 +19256,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsged_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -21217,14 +19297,23 @@ define zeroext i32 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21259,15 +19348,23 @@ define zeroext i32 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21304,8 +19401,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21313,16 +19412,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21361,9 +19457,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21371,16 +19468,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21420,15 +19514,23 @@ define zeroext i32 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21466,9 +19568,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21476,16 +19579,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21526,17 +19626,25 @@ define zeroext i64 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21574,18 +19682,25 @@ define zeroext i64 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21625,8 +19740,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21634,19 +19751,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21688,9 +19801,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21698,19 +19812,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21753,18 +19863,25 @@ define zeroext i64 @test_vpcmpsged_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -21805,9 +19922,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpled %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -21815,19 +19933,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -23434,15 +21548,14 @@ define zeroext i4 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23463,16 +21576,14 @@ define zeroext i4 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23495,20 +21606,15 @@ define zeroext i4 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23533,21 +21639,15 @@ define zeroext i4 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23573,16 +21673,14 @@ define zeroext i4 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23606,21 +21704,15 @@ define zeroext i4 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -23648,25 +21740,14 @@ define zeroext i8 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23687,26 +21768,14 @@ define zeroext i8 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23729,30 +21798,15 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23777,31 +21831,15 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23827,26 +21865,14 @@ define zeroext i8 @test_vpcmpsgeq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23870,31 +21896,15 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23921,25 +21931,14 @@ define zeroext i16 @test_vpcmpsgeq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -23960,26 +21959,14 @@ define zeroext i16 @test_vpcmpsgeq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -24002,30 +21989,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -24050,31 +22022,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -24100,26 +22056,14 @@ define zeroext i16 @test_vpcmpsgeq_v2i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -24143,31 +22087,15 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -24200,14 +22128,17 @@ define zeroext i32 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24242,15 +22173,17 @@ define zeroext i32 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24287,19 +22220,18 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24338,20 +22270,18 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24391,15 +22321,17 @@ define zeroext i32 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24437,20 +22369,18 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24491,17 +22421,19 @@ define zeroext i64 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24539,18 +22471,19 @@ define zeroext i64 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24590,22 +22523,20 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24647,23 +22578,20 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24706,18 +22634,19 @@ define zeroext i64 @test_vpcmpsgeq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpcmpeqd %xmm1, %xmm1, %xmm1
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24758,23 +22687,20 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpandn %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -24813,38 +22739,11 @@ define zeroext i8 @test_vpcmpsgeq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -24869,39 +22768,11 @@ define zeroext i8 @test_vpcmpsgeq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -24928,51 +22799,12 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25001,52 +22833,12 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25076,39 +22868,11 @@ define zeroext i8 @test_vpcmpsgeq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25136,52 +22900,12 @@ define zeroext i8 @test_masked_vpcmpsgeq
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25212,38 +22936,11 @@ define zeroext i16 @test_vpcmpsgeq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25268,39 +22965,11 @@ define zeroext i16 @test_vpcmpsgeq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25327,51 +22996,12 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25400,52 +23030,12 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25475,39 +23065,11 @@ define zeroext i16 @test_vpcmpsgeq_v4i1_
;
; NoVLX-LABEL: test_vpcmpsgeq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25535,52 +23097,12 @@ define zeroext i16 @test_masked_vpcmpsge
;
; NoVLX-LABEL: test_masked_vpcmpsgeq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -25617,15 +23139,23 @@ define zeroext i32 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25661,16 +23191,23 @@ define zeroext i32 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25708,11 +23245,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -25720,16 +23256,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25769,12 +23302,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -25782,16 +23313,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25832,16 +23360,23 @@ define zeroext i32 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25880,12 +23415,10 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -25893,16 +23426,13 @@ define zeroext i32 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25944,18 +23474,25 @@ define zeroext i64 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -25994,19 +23531,25 @@ define zeroext i64 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -26047,11 +23590,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -26059,19 +23601,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -26114,12 +23652,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -26127,19 +23663,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -26183,19 +23715,25 @@ define zeroext i64 @test_vpcmpsgeq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -26237,12 +23775,10 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpcmpeqd %ymm1, %ymm1, %ymm1
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpleq %zmm0, %zmm1, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -26250,19 +23786,15 @@ define zeroext i64 @test_masked_vpcmpsge
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -29717,40 +27249,14 @@ define zeroext i8 @test_vpcmpultd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -29771,40 +27277,14 @@ define zeroext i8 @test_vpcmpultd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -29827,53 +27307,15 @@ define zeroext i8 @test_masked_vpcmpultd
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -29898,53 +27340,15 @@ define zeroext i8 @test_masked_vpcmpultd
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -29970,41 +27374,14 @@ define zeroext i8 @test_vpcmpultd_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30028,54 +27405,15 @@ define zeroext i8 @test_masked_vpcmpultd
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30102,40 +27440,14 @@ define zeroext i16 @test_vpcmpultd_v4i1_
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30156,40 +27468,14 @@ define zeroext i16 @test_vpcmpultd_v4i1_
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30212,53 +27498,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30283,53 +27531,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30355,41 +27565,14 @@ define zeroext i16 @test_vpcmpultd_v4i1_
;
; NoVLX-LABEL: test_vpcmpultd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30413,54 +27596,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x i32>
@@ -30493,16 +27637,25 @@ define zeroext i32 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -30535,16 +27688,25 @@ define zeroext i32 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -30579,11 +27741,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -30591,16 +27752,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -30639,11 +27797,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -30651,16 +27808,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -30700,17 +27854,25 @@ define zeroext i32 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -30746,12 +27908,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -30759,16 +27919,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -30809,19 +27966,27 @@ define zeroext i64 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -30857,19 +28022,27 @@ define zeroext i64 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -30907,11 +28080,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -30919,19 +28091,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -30973,11 +28141,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -30985,19 +28152,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -31040,20 +28203,27 @@ define zeroext i64 @test_vpcmpultd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rdi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -31092,12 +28262,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastd (%rsi), %xmm1
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -31105,19 +28273,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -32724,16 +29888,14 @@ define zeroext i4 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32754,16 +29916,14 @@ define zeroext i4 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32786,23 +29946,15 @@ define zeroext i4 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32827,23 +29979,15 @@ define zeroext i4 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32869,17 +30013,14 @@ define zeroext i4 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32903,24 +30044,15 @@ define zeroext i4 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -32948,26 +30080,14 @@ define zeroext i8 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -32988,26 +30108,14 @@ define zeroext i8 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33030,33 +30138,15 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33081,33 +30171,15 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33133,27 +30205,14 @@ define zeroext i8 @test_vpcmpultq_v2i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33177,34 +30236,15 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33231,26 +30271,14 @@ define zeroext i16 @test_vpcmpultq_v2i1_
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33271,26 +30299,14 @@ define zeroext i16 @test_vpcmpultq_v2i1_
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33313,33 +30329,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33364,33 +30362,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33416,27 +30396,14 @@ define zeroext i16 @test_vpcmpultq_v2i1_
;
; NoVLX-LABEL: test_vpcmpultq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33460,34 +30427,15 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x i64>
@@ -33520,16 +30468,19 @@ define zeroext i32 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -33562,16 +30513,19 @@ define zeroext i32 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -33606,22 +30560,18 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -33660,22 +30610,18 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -33715,17 +30661,19 @@ define zeroext i32 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -33761,23 +30709,18 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -33818,19 +30761,21 @@ define zeroext i64 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -33866,19 +30811,21 @@ define zeroext i64 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rdi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %xmm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -33916,25 +30863,20 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -33976,25 +30918,20 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor (%rsi), %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %xmm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -34037,20 +30974,21 @@ define zeroext i64 @test_vpcmpultq_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -34089,26 +31027,20 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %xmm1
-; NoVLX-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpxor %xmm2, %xmm1, %xmm1
-; NoVLX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kmovw %k0, %ecx
-; NoVLX-NEXT: vmovd %ecx, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -34147,39 +31079,11 @@ define zeroext i8 @test_vpcmpultq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34204,39 +31108,11 @@ define zeroext i8 @test_vpcmpultq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rdi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34263,52 +31139,12 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34337,52 +31173,12 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rsi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34412,40 +31208,11 @@ define zeroext i8 @test_vpcmpultq_v4i1_v
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34473,53 +31240,12 @@ define zeroext i8 @test_masked_vpcmpultq
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34550,39 +31276,11 @@ define zeroext i16 @test_vpcmpultq_v4i1_
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34607,39 +31305,11 @@ define zeroext i16 @test_vpcmpultq_v4i1_
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rdi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34666,52 +31336,12 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34740,52 +31370,12 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rsi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34815,40 +31405,11 @@ define zeroext i16 @test_vpcmpultq_v4i1_
;
; NoVLX-LABEL: test_vpcmpultq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34876,53 +31437,12 @@ define zeroext i16 @test_masked_vpcmpult
;
; NoVLX-LABEL: test_masked_vpcmpultq_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %eax
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %ecx
-; NoVLX-NEXT: kshiftrw $1, %k0, %k1
-; NoVLX-NEXT: kmovw %k1, %edx
-; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -34959,16 +31479,23 @@ define zeroext i32 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35004,16 +31531,23 @@ define zeroext i32 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rdi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35051,12 +31585,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35064,16 +31596,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35113,12 +31642,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rsi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35126,16 +31653,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35176,17 +31700,23 @@ define zeroext i32 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35225,13 +31755,10 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35239,16 +31766,13 @@ define zeroext i32 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35290,19 +31814,25 @@ define zeroext i64 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35341,19 +31871,25 @@ define zeroext i64 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rdi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rdi), %ymm1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35394,12 +31930,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35407,19 +31941,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35462,12 +31992,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm1 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor (%rsi), %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovdqa (%rsi), %ymm1
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35475,19 +32003,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35531,20 +32055,25 @@ define zeroext i64 @test_vpcmpultq_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rdi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -35586,13 +32115,10 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vpbroadcastq (%rsi), %ymm1
-; NoVLX-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
-; NoVLX-NEXT: vpxor %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpxor %ymm2, %ymm1, %ymm1
-; NoVLX-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: kmovw %edi, %k0
+; NoVLX-NEXT: kmovw %edi, %k1
+; NoVLX-NEXT: vpcmpltuq %zmm1, %zmm0, %k0 {%k1}
; NoVLX-NEXT: kshiftrw $3, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %eax
; NoVLX-NEXT: kshiftrw $2, %k0, %k1
@@ -35600,19 +32126,15 @@ define zeroext i64 @test_masked_vpcmpult
; NoVLX-NEXT: kshiftrw $1, %k0, %k1
; NoVLX-NEXT: kmovw %k1, %edx
; NoVLX-NEXT: kmovw %k0, %esi
-; NoVLX-NEXT: vmovd %esi, %xmm1
-; NoVLX-NEXT: vpinsrb $4, %edx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $8, %ecx, %xmm1, %xmm1
-; NoVLX-NEXT: vpinsrb $12, %eax, %xmm1, %xmm1
-; NoVLX-NEXT: vpand %xmm0, %xmm1, %xmm0
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -36635,37 +33157,14 @@ define zeroext i8 @test_vcmpoeqps_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -36686,37 +33185,14 @@ define zeroext i8 @test_vcmpoeqps_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqps (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovaps (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -36738,38 +33214,14 @@ define zeroext i8 @test_vcmpoeqps_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vbroadcastss (%rdi), %xmm1
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -36793,38 +33245,12 @@ define zeroext i8 @test_masked_vcmpoeqps
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -36851,38 +33277,12 @@ define zeroext i8 @test_masked_vcmpoeqps
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vmovaps (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -36910,39 +33310,12 @@ define zeroext i8 @test_masked_vcmpoeqps
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastss (%rsi), %xmm2
-; NoVLX-NEXT: vcmpeqps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vbroadcastss (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -36972,37 +33345,14 @@ define zeroext i16 @test_vcmpoeqps_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -37023,37 +33373,14 @@ define zeroext i16 @test_vcmpoeqps_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqps (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovaps (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -37075,38 +33402,14 @@ define zeroext i16 @test_vcmpoeqps_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqps_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vbroadcastss (%rdi), %xmm1
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <4 x float>
@@ -37130,38 +33433,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -37188,38 +33465,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vmovaps (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -37247,39 +33498,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqps_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastss (%rsi), %xmm2
-; NoVLX-NEXT: vcmpeqps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vbroadcastss (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -37315,13 +33539,25 @@ define zeroext i32 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -37354,13 +33590,25 @@ define zeroext i32 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqps (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovaps (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -37394,14 +33642,25 @@ define zeroext i32 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vbroadcastss (%rdi), %xmm1
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -37437,15 +33696,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -37483,15 +33751,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vmovaps (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -37530,16 +33807,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastss (%rsi), %xmm2
-; NoVLX-NEXT: vcmpeqps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vbroadcastss (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -37580,16 +33865,27 @@ define zeroext i64 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -37625,16 +33921,27 @@ define zeroext i64 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqps (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovaps (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -37671,17 +33978,27 @@ define zeroext i64 @test_vcmpoeqps_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vbroadcastss (%rdi), %xmm1
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -37720,18 +34037,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm2, %xmm0, %xmm0
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -37772,18 +34097,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqps (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vmovaps (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -37825,19 +34158,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastss (%rsi), %xmm2
-; NoVLX-NEXT: vcmpeqps %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandps %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vbroadcastss (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqps %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -39538,13 +35878,14 @@ define zeroext i4 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39565,13 +35906,14 @@ define zeroext i4 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39593,14 +35935,14 @@ define zeroext i4 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39624,16 +35966,15 @@ define zeroext i4 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v4i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39657,16 +35998,15 @@ define zeroext i4 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v4i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39691,17 +36031,15 @@ define zeroext i4 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v4i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; NoVLX-NEXT: vpslld $31, %ymm0, %ymm0
-; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
+; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
+; NoVLX-NEXT: kshiftrw $14, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
-; NoVLX-NEXT: movb %al, -{{[0-9]+}}(%rsp)
-; NoVLX-NEXT: movzbl -{{[0-9]+}}(%rsp), %eax
+; NoVLX-NEXT: andb $3, %al
+; NoVLX-NEXT: movzbl %al, %eax
; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
@@ -39729,23 +36067,14 @@ define zeroext i8 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -39766,23 +36095,14 @@ define zeroext i8 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -39804,24 +36124,14 @@ define zeroext i8 @test_vcmpoeqpd_v2i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -39845,24 +36155,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -39889,24 +36187,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -39934,25 +36220,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -39982,23 +36255,14 @@ define zeroext i16 @test_vcmpoeqpd_v2i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -40019,23 +36283,14 @@ define zeroext i16 @test_vcmpoeqpd_v2i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -40057,24 +36312,14 @@ define zeroext i16 @test_vcmpoeqpd_v2i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
entry:
%0 = bitcast <2 x i64> %__a to <2 x double>
@@ -40098,24 +36343,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40142,24 +36375,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40187,25 +36408,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v2i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
+; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $14, %k0, %k0
; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40241,13 +36449,19 @@ define zeroext i32 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -40280,13 +36494,19 @@ define zeroext i32 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqpd (%rdi), %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -40320,14 +36540,19 @@ define zeroext i32 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl (%rsp), %eax
@@ -40363,15 +36588,18 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40409,15 +36637,18 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40456,16 +36687,18 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40506,16 +36739,21 @@ define zeroext i64 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -40551,16 +36789,21 @@ define zeroext i64 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqpd (%rdi), %xmm0, %xmm0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -40597,17 +36840,21 @@ define zeroext i64 @test_vcmpoeqpd_v2i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
+; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
; NoVLX-NEXT: kmovw %k0, (%rsp)
; NoVLX-NEXT: movl {{[0-9]+}}(%rsp), %ecx
@@ -40646,18 +36893,20 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm2, %xmm0, %xmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40698,18 +36947,20 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vmovapd (%rsi), %xmm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40751,19 +37002,20 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogq $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vmovddup {{.*#+}} xmm2 = mem[0,0]
-; NoVLX-NEXT: vcmpeqpd %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vandpd %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vmovddup {{.*#+}} xmm1 = mem[0,0]
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kmovw %k0, %ecx
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,8],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -40802,36 +37054,11 @@ define zeroext i8 @test_vcmpoeqpd_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40856,36 +37083,11 @@ define zeroext i8 @test_vcmpoeqpd_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40911,37 +37113,11 @@ define zeroext i8 @test_vcmpoeqpd_v4i1_v
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vbroadcastsd (%rdi), %ymm1
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -40969,39 +37145,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v8i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41029,39 +37178,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v8i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41090,40 +37212,12 @@ define zeroext i8 @test_masked_vcmpoeqpd
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v8i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm2
-; NoVLX-NEXT: vcmpeqpd %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41154,36 +37248,11 @@ define zeroext i16 @test_vcmpoeqpd_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41208,36 +37277,11 @@ define zeroext i16 @test_vcmpoeqpd_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
-; NoVLX-NEXT: vcmpeqpd (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41263,37 +37307,11 @@ define zeroext i16 @test_vcmpoeqpd_v4i1_
;
; NoVLX-LABEL: test_vcmpoeqpd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vbroadcastsd (%rdi), %ymm1
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41321,39 +37339,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v16i1_mask:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41381,39 +37372,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v16i1_mask_mem:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41442,40 +37406,12 @@ define zeroext i16 @test_masked_vcmpoeqp
;
; NoVLX-LABEL: test_masked_vcmpoeqpd_v4i1_v16i1_mask_mem_b:
; NoVLX: # %bb.0: # %entry
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm2
-; NoVLX-NEXT: vcmpeqpd %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: andl $1, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $1, %k2, %k2
-; NoVLX-NEXT: kshiftlw $1, %k2, %k2
-; NoVLX-NEXT: korw %k1, %k2, %k1
-; NoVLX-NEXT: kshiftrw $1, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $14, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $2, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $13, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $3, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $12, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $12, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %ax killed %ax killed %eax
; NoVLX-NEXT: vzeroupper
@@ -41512,13 +37448,23 @@ define zeroext i32 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41554,13 +37500,23 @@ define zeroext i32 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
-; NoVLX-NEXT: vcmpeqpd (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41597,14 +37553,23 @@ define zeroext i32 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vbroadcastsd (%rdi), %ymm1
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41643,16 +37608,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm2, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41691,16 +37664,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vmovapd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41740,17 +37721,24 @@ define zeroext i32 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $32, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm2
-; NoVLX-NEXT: vcmpeqpd %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
+; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
+; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41792,16 +37780,25 @@ define zeroext i64 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41840,16 +37837,25 @@ define zeroext i64 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
-; NoVLX-NEXT: vcmpeqpd (%rdi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; NoVLX-NEXT: vmovapd (%rdi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41889,17 +37895,25 @@ define zeroext i64 @test_vcmpoeqpd_v4i1_
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: vbroadcastsd (%rdi), %ymm1
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41941,19 +37955,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd %ymm1, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm2, %xmm0, %xmm0
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -41995,19 +38016,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vcmpeqpd (%rsi), %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vmovapd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -42050,20 +38078,26 @@ define zeroext i64 @test_masked_vcmpoeqp
; NoVLX-NEXT: .cfi_def_cfa_register %rbp
; NoVLX-NEXT: andq $-32, %rsp
; NoVLX-NEXT: subq $64, %rsp
+; NoVLX-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; NoVLX-NEXT: kmovw %edi, %k1
-; NoVLX-NEXT: vpternlogd $255, %zmm1, %zmm1, %zmm1 {%k1} {z}
-; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm2
-; NoVLX-NEXT: vcmpeqpd %ymm2, %ymm0, %ymm0
-; NoVLX-NEXT: vpmovqd %zmm0, %ymm0
-; NoVLX-NEXT: vpand %xmm1, %xmm0, %xmm0
+; NoVLX-NEXT: vbroadcastsd (%rsi), %ymm1
+; NoVLX-NEXT: vcmpeqpd %zmm1, %zmm0, %k0 {%k1}
+; NoVLX-NEXT: kshiftrw $3, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %eax
+; NoVLX-NEXT: kshiftrw $2, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %ecx
+; NoVLX-NEXT: kshiftrw $1, %k0, %k1
+; NoVLX-NEXT: kmovw %k1, %edx
+; NoVLX-NEXT: kmovw %k0, %esi
; NoVLX-NEXT: kxorw %k0, %k0, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
-; NoVLX-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,4,8,12],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
-; NoVLX-NEXT: vextracti128 $1, %ymm0, %xmm1
-; NoVLX-NEXT: vpmovsxbd %xmm1, %zmm1
-; NoVLX-NEXT: vptestmd %zmm1, %zmm1, %k0
; NoVLX-NEXT: kmovw %k0, {{[0-9]+}}(%rsp)
+; NoVLX-NEXT: vxorpd %xmm0, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $0, %esi, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $1, %edx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
+; NoVLX-NEXT: vpinsrb $3, %eax, %xmm0, %xmm0
; NoVLX-NEXT: vpmovsxbd %xmm0, %zmm0
; NoVLX-NEXT: vpslld $31, %zmm0, %zmm0
; NoVLX-NEXT: vptestmd %zmm0, %zmm0, %k0
@@ -43281,40 +39315,14 @@ define i8 @mask_zero_lower(<4 x i32> %a)
;
; NoVLX-LABEL: mask_zero_lower:
; NoVLX: # %bb.0:
-; NoVLX-NEXT: vpbroadcastd {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648]
-; NoVLX-NEXT: vpxor %xmm1, %xmm0, %xmm0
-; NoVLX-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
-; NoVLX-NEXT: vpextrb $4, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k0
-; NoVLX-NEXT: vpextrb $0, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k1
-; NoVLX-NEXT: kxorw %k0, %k0, %k2
-; NoVLX-NEXT: kshiftrw $4, %k2, %k3
-; NoVLX-NEXT: kxorw %k1, %k3, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $11, %k1, %k1
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftrw $5, %k1, %k2
-; NoVLX-NEXT: kxorw %k0, %k2, %k0
-; NoVLX-NEXT: kshiftlw $15, %k0, %k0
-; NoVLX-NEXT: kshiftrw $10, %k0, %k0
-; NoVLX-NEXT: kxorw %k1, %k0, %k0
-; NoVLX-NEXT: kshiftrw $6, %k0, %k1
-; NoVLX-NEXT: vpextrb $8, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $9, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
-; NoVLX-NEXT: kshiftrw $7, %k0, %k1
-; NoVLX-NEXT: vpextrb $12, %xmm0, %eax
-; NoVLX-NEXT: kmovw %eax, %k2
-; NoVLX-NEXT: kxorw %k2, %k1, %k1
-; NoVLX-NEXT: kshiftlw $15, %k1, %k1
-; NoVLX-NEXT: kshiftrw $8, %k1, %k1
-; NoVLX-NEXT: kxorw %k0, %k1, %k0
+; NoVLX-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; NoVLX-NEXT: vpxor %xmm1, %xmm1, %xmm1
+; NoVLX-NEXT: vpcmpltud %zmm1, %zmm0, %k0
+; NoVLX-NEXT: kshiftlw $12, %k0, %k0
+; NoVLX-NEXT: kshiftrw $8, %k0, %k0
; NoVLX-NEXT: kmovw %k0, %eax
; NoVLX-NEXT: # kill: def %al killed %al killed %eax
+; NoVLX-NEXT: vzeroupper
; NoVLX-NEXT: retq
%cmp = icmp ult <4 x i32> %a, zeroinitializer
%concat = shufflevector <4 x i1> %cmp, <4 x i1> zeroinitializer, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3>
Modified: llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll (original)
+++ llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll Sun Jan 7 10:20:37 2018
@@ -48,7 +48,6 @@ define <2 x i64> @ext_i2_2i64(i2 %a0) {
;
; AVX512F-LABEL: ext_i2_2i64:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: andb $3, %dil
; AVX512F-NEXT: kmovw %edi, %k1
; AVX512F-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}
; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
@@ -98,7 +97,6 @@ define <4 x i32> @ext_i4_4i32(i4 %a0) {
;
; AVX512F-LABEL: ext_i4_4i32:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: andb $15, %dil
; AVX512F-NEXT: kmovw %edi, %k1
; AVX512F-NEXT: vpbroadcastd {{.*}}(%rip), %zmm0 {%k1} {z}
; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
@@ -289,7 +287,6 @@ define <4 x i64> @ext_i4_4i64(i4 %a0) {
;
; AVX512F-LABEL: ext_i4_4i64:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: andb $15, %dil
; AVX512F-NEXT: kmovw %edi, %k1
; AVX512F-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}
; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
Modified: llvm/trunk/test/CodeGen/X86/compress_expand.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/compress_expand.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/compress_expand.ll (original)
+++ llvm/trunk/test/CodeGen/X86/compress_expand.ll Sun Jan 7 10:20:37 2018
@@ -200,11 +200,9 @@ define void @test10(i64* %base, <4 x i64
; KNL: # %bb.0:
; KNL-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; KNL-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL-NEXT: vmovdqa %ymm1, %ymm1
-; KNL-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL-NEXT: kshiftlw $12, %k0, %k0
+; KNL-NEXT: kshiftrw $12, %k0, %k1
; KNL-NEXT: vpcompressq %zmm0, (%rdi) {%k1}
; KNL-NEXT: retq
call void @llvm.masked.compressstore.v4i64(<4 x i64> %V, i64* %base, <4 x i1> %mask)
@@ -223,10 +221,9 @@ define void @test11(i64* %base, <2 x i64
; KNL: # %bb.0:
; KNL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vpsllq $63, %xmm1, %xmm1
-; KNL-NEXT: vpsraq $63, %zmm1, %zmm1
-; KNL-NEXT: vmovdqa %xmm1, %xmm1
-; KNL-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL-NEXT: kshiftlw $14, %k0, %k0
+; KNL-NEXT: kshiftrw $14, %k0, %k1
; KNL-NEXT: vpcompressq %zmm0, (%rdi) {%k1}
; KNL-NEXT: retq
call void @llvm.masked.compressstore.v2i64(<2 x i64> %V, i64* %base, <2 x i1> %mask)
@@ -245,10 +242,9 @@ define void @test12(float* %base, <4 x f
; KNL: # %bb.0:
; KNL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL-NEXT: vmovdqa %xmm1, %xmm1
-; KNL-NEXT: vpslld $31, %zmm1, %zmm1
-; KNL-NEXT: vptestmd %zmm1, %zmm1, %k1
+; KNL-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL-NEXT: kshiftlw $12, %k0, %k0
+; KNL-NEXT: kshiftrw $12, %k0, %k1
; KNL-NEXT: vcompressps %zmm0, (%rdi) {%k1}
; KNL-NEXT: retq
call void @llvm.masked.compressstore.v4f32(<4 x float> %V, float* %base, <4 x i1> %mask)
@@ -269,11 +265,9 @@ define <2 x float> @test13(float* %base,
; KNL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2
; KNL-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]
-; KNL-NEXT: vpcmpeqq %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL-NEXT: vmovaps %xmm1, %xmm1
-; KNL-NEXT: vpslld $31, %zmm1, %zmm1
-; KNL-NEXT: vptestmd %zmm1, %zmm1, %k1
+; KNL-NEXT: vpcmpeqq %zmm2, %zmm1, %k0
+; KNL-NEXT: kshiftlw $14, %k0, %k0
+; KNL-NEXT: kshiftrw $14, %k0, %k1
; KNL-NEXT: vexpandps (%rdi), %zmm0 {%k1}
; KNL-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
; KNL-NEXT: retq
@@ -296,11 +290,9 @@ define void @test14(float* %base, <2 x f
; KNL-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2
; KNL-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]
-; KNL-NEXT: vpcmpeqq %xmm2, %xmm1, %xmm1
-; KNL-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL-NEXT: vmovaps %xmm1, %xmm1
-; KNL-NEXT: vpslld $31, %zmm1, %zmm1
-; KNL-NEXT: vptestmd %zmm1, %zmm1, %k1
+; KNL-NEXT: vpcmpeqq %zmm2, %zmm1, %k0
+; KNL-NEXT: kshiftlw $14, %k0, %k0
+; KNL-NEXT: kshiftrw $14, %k0, %k1
; KNL-NEXT: vcompressps %zmm0, (%rdi) {%k1}
; KNL-NEXT: retq
%mask = icmp eq <2 x i32> %trigger, zeroinitializer
Modified: llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll (original)
+++ llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll Sun Jan 7 10:20:37 2018
@@ -812,11 +812,12 @@ define <4 x float> @test15(float* %base,
; KNL_64-LABEL: test15:
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL_64-NEXT: vmovdqa %xmm1, %xmm1
-; KNL_64-NEXT: vpmovsxdq %ymm0, %zmm2
-; KNL_64-NEXT: vpslld $31, %ymm1, %ymm0
-; KNL_64-NEXT: vptestmd %zmm0, %zmm0, %k1
-; KNL_64-NEXT: vgatherqps (%rdi,%zmm2,4), %ymm0 {%k1}
+; KNL_64-NEXT: vpslld $31, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $12, %k0, %k0
+; KNL_64-NEXT: kshiftrw $12, %k0, %k1
+; KNL_64-NEXT: vpmovsxdq %ymm0, %zmm1
+; KNL_64-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1}
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 killed %ymm0
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -824,12 +825,13 @@ define <4 x float> @test15(float* %base,
; KNL_32-LABEL: test15:
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL_32-NEXT: vmovdqa %xmm1, %xmm1
+; KNL_32-NEXT: vpslld $31, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $12, %k0, %k0
+; KNL_32-NEXT: kshiftrw $12, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm2
-; KNL_32-NEXT: vpslld $31, %ymm1, %ymm0
-; KNL_32-NEXT: vptestmd %zmm0, %zmm0, %k1
-; KNL_32-NEXT: vgatherqps (%eax,%zmm2,4), %ymm0 {%k1}
+; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm1
+; KNL_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1}
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 killed %ymm0
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -864,12 +866,10 @@ define <4 x double> @test16(double* %bas
; KNL_64-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
; KNL_64-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_64-NEXT: vmovdqa %ymm1, %ymm1
+; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $12, %k0, %k0
+; KNL_64-NEXT: kshiftrw $12, %k0, %k1
; KNL_64-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_64-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_64-NEXT: vgatherqpd (%rdi,%zmm0,8), %zmm2 {%k1}
; KNL_64-NEXT: vmovapd %ymm2, %ymm0
; KNL_64-NEXT: retq
@@ -879,13 +879,11 @@ define <4 x double> @test16(double* %bas
; KNL_32-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
; KNL_32-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_32-NEXT: vmovdqa %ymm1, %ymm1
+; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $12, %k0, %k0
+; KNL_32-NEXT: kshiftrw $12, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_32-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_32-NEXT: vgatherqpd (%eax,%zmm0,8), %zmm2 {%k1}
; KNL_32-NEXT: vmovapd %ymm2, %ymm0
; KNL_32-NEXT: retl
@@ -919,9 +917,10 @@ define <2 x double> @test17(double* %bas
; KNL_64-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
; KNL_64-NEXT: vpsllq $32, %xmm0, %xmm0
; KNL_64-NEXT: vpsraq $32, %zmm0, %zmm0
-; KNL_64-NEXT: vmovdqa %xmm1, %xmm1
-; KNL_64-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vgatherqpd (%rdi,%zmm0,8), %zmm2 {%k1}
; KNL_64-NEXT: vmovapd %xmm2, %xmm0
; KNL_64-NEXT: vzeroupper
@@ -932,10 +931,11 @@ define <2 x double> @test17(double* %bas
; KNL_32-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
; KNL_32-NEXT: vpsllq $32, %xmm0, %xmm0
; KNL_32-NEXT: vpsraq $32, %zmm0, %zmm0
-; KNL_32-NEXT: vmovdqa %xmm1, %xmm1
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_32-NEXT: vgatherqpd (%eax,%zmm0,8), %zmm2 {%k1}
; KNL_32-NEXT: vmovapd %xmm2, %xmm0
; KNL_32-NEXT: vzeroupper
@@ -979,9 +979,10 @@ define void @test18(<4 x i32>%a1, <4 x i
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL_64-NEXT: vmovdqa %xmm2, %xmm2
-; KNL_64-NEXT: vpslld $31, %ymm2, %ymm2
-; KNL_64-NEXT: vptestmd %zmm2, %zmm2, %k1
+; KNL_64-NEXT: vpslld $31, %xmm2, %xmm2
+; KNL_64-NEXT: vptestmd %zmm2, %zmm2, %k0
+; KNL_64-NEXT: kshiftlw $12, %k0, %k0
+; KNL_64-NEXT: kshiftrw $12, %k0, %k1
; KNL_64-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -990,10 +991,11 @@ define void @test18(<4 x i32>%a1, <4 x i
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm1 killed %xmm1 def %ymm1
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL_32-NEXT: vmovdqa %xmm2, %xmm2
+; KNL_32-NEXT: vpslld $31, %xmm2, %xmm2
+; KNL_32-NEXT: vptestmd %zmm2, %zmm2, %k0
+; KNL_32-NEXT: kshiftlw $12, %k0, %k0
+; KNL_32-NEXT: kshiftrw $12, %k0, %k1
; KNL_32-NEXT: vpmovsxdq %ymm1, %zmm1
-; KNL_32-NEXT: vpslld $31, %ymm2, %ymm2
-; KNL_32-NEXT: vptestmd %zmm2, %zmm2, %k1
; KNL_32-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -1022,11 +1024,9 @@ define void @test19(<4 x double>%a1, dou
; KNL_64-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
; KNL_64-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; KNL_64-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_64-NEXT: vmovdqa %ymm1, %ymm1
-; KNL_64-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $12, %k0, %k0
+; KNL_64-NEXT: kshiftrw $12, %k0, %k1
; KNL_64-NEXT: vscatterqpd %zmm0, (%rdi,%zmm2,8) {%k1}
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -1036,12 +1036,10 @@ define void @test19(<4 x double>%a1, dou
; KNL_32-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
; KNL_32-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; KNL_32-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_32-NEXT: vmovdqa %ymm1, %ymm1
+; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $12, %k0, %k0
+; KNL_32-NEXT: kshiftrw $12, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_32-NEXT: vscatterqpd %zmm0, (%eax,%zmm2,8) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -1073,10 +1071,10 @@ define void @test20(<2 x float>%a1, <2 x
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm2 = xmm2[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm2, %xmm2
-; KNL_64-NEXT: vpslld $31, %ymm2, %ymm2
-; KNL_64-NEXT: vptestmd %zmm2, %zmm2, %k1
+; KNL_64-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_64-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vscatterqps %ymm0, (,%zmm1) {%k1}
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -1084,12 +1082,12 @@ define void @test20(<2 x float>%a1, <2 x
; KNL_32-LABEL: test20:
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
+; KNL_32-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm2 = xmm2[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm2, %xmm2
; KNL_32-NEXT: vpmovsxdq %ymm1, %zmm1
-; KNL_32-NEXT: vpslld $31, %ymm2, %ymm2
-; KNL_32-NEXT: vptestmd %zmm2, %zmm2, %k1
; KNL_32-NEXT: vscatterqps %ymm0, (,%zmm1) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -1119,10 +1117,11 @@ define void @test21(<2 x i32>%a1, <2 x i
; KNL_64-LABEL: test21:
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
-; KNL_64-NEXT: vmovdqa %xmm2, %xmm2
+; KNL_64-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_64-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; KNL_64-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_64-NEXT: vptestmq %zmm2, %zmm2, %k1
; KNL_64-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -1131,10 +1130,11 @@ define void @test21(<2 x i32>%a1, <2 x i
; KNL_32: # %bb.0:
; KNL_32-NEXT: vpsllq $32, %xmm1, %xmm1
; KNL_32-NEXT: vpsraq $32, %zmm1, %zmm1
-; KNL_32-NEXT: vmovdqa %xmm2, %xmm2
+; KNL_32-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; KNL_32-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k1
; KNL_32-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -1170,12 +1170,12 @@ define <2 x float> @test22(float* %base,
; KNL_64-LABEL: test22:
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm2 killed %xmm2 def %ymm2
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm1, %xmm1
; KNL_64-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_64-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k1
; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k1}
; KNL_64-NEXT: vmovaps %xmm2, %xmm0
; KNL_64-NEXT: vzeroupper
@@ -1184,13 +1184,13 @@ define <2 x float> @test22(float* %base,
; KNL_32-LABEL: test22:
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm2 killed %xmm2 def %ymm2
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm1, %xmm1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_32-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k1
; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k1}
; KNL_32-NEXT: vmovaps %xmm2, %xmm0
; KNL_32-NEXT: vzeroupper
@@ -1225,10 +1225,10 @@ define <2 x float> @test22a(float* %base
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm2 killed %xmm2 def %ymm2
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm1, %xmm1
-; KNL_64-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k1
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k1}
; KNL_64-NEXT: vmovaps %xmm2, %xmm0
; KNL_64-NEXT: vzeroupper
@@ -1238,11 +1238,11 @@ define <2 x float> @test22a(float* %base
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm2 killed %xmm2 def %ymm2
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm1, %xmm1
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k1
; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k1}
; KNL_32-NEXT: vmovaps %xmm2, %xmm0
; KNL_32-NEXT: vzeroupper
@@ -1275,30 +1275,30 @@ declare <2 x i64> @llvm.masked.gather.v2
define <2 x i32> @test23(i32* %base, <2 x i32> %ind, <2 x i1> %mask, <2 x i32> %src0) {
; KNL_64-LABEL: test23:
; KNL_64: # %bb.0:
-; KNL_64-NEXT: vpshufd {{.*#+}} xmm2 = xmm2[0,2,2,3]
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: vpshufd {{.*#+}} xmm1 = xmm2[0,2,2,3]
; KNL_64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
; KNL_64-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm1, %xmm1
-; KNL_64-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k1
-; KNL_64-NEXT: vpgatherqd (%rdi,%zmm0,4), %ymm2 {%k1}
-; KNL_64-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm2[0],zero,xmm2[1],zero
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
+; KNL_64-NEXT: vpgatherqd (%rdi,%zmm0,4), %ymm1 {%k1}
+; KNL_64-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm1[0],zero,xmm1[1],zero
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
;
; KNL_32-LABEL: test23:
; KNL_32: # %bb.0:
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpshufd {{.*#+}} xmm2 = xmm2[0,2,2,3]
+; KNL_32-NEXT: vpshufd {{.*#+}} xmm1 = xmm2[0,2,2,3]
; KNL_32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm1, %xmm1
-; KNL_32-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k1
-; KNL_32-NEXT: vpgatherqd (%eax,%zmm0,4), %ymm2 {%k1}
-; KNL_32-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm2[0],zero,xmm2[1],zero
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
+; KNL_32-NEXT: vpgatherqd (%eax,%zmm0,4), %ymm1 {%k1}
+; KNL_32-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm1[0],zero,xmm1[1],zero
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
;
@@ -1332,27 +1332,27 @@ define <2 x i32> @test23b(i32* %base, <2
; KNL_64-LABEL: test23b:
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
-; KNL_64-NEXT: vpshufd {{.*#+}} xmm2 = xmm2[0,2,2,3]
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm1, %xmm1
-; KNL_64-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k1
-; KNL_64-NEXT: vpgatherqd (%rdi,%zmm0,4), %ymm2 {%k1}
-; KNL_64-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm2[0],zero,xmm2[1],zero
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: vpshufd {{.*#+}} xmm1 = xmm2[0,2,2,3]
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
+; KNL_64-NEXT: vpgatherqd (%rdi,%zmm0,4), %ymm1 {%k1}
+; KNL_64-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm1[0],zero,xmm1[1],zero
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
;
; KNL_32-LABEL: test23b:
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpshufd {{.*#+}} xmm2 = xmm2[0,2,2,3]
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm1, %xmm1
-; KNL_32-NEXT: vpslld $31, %ymm1, %ymm1
-; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k1
-; KNL_32-NEXT: vpgatherqd (%eax,%zmm0,4), %ymm2 {%k1}
-; KNL_32-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm2[0],zero,xmm2[1],zero
+; KNL_32-NEXT: vpshufd {{.*#+}} xmm1 = xmm2[0,2,2,3]
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
+; KNL_32-NEXT: vpgatherqd (%eax,%zmm0,4), %ymm1 {%k1}
+; KNL_32-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm1[0],zero,xmm1[1],zero
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
;
@@ -1433,9 +1433,10 @@ define <2 x i64> @test25(i64* %base, <2
; KNL_64-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
; KNL_64-NEXT: vpsllq $32, %xmm0, %xmm0
; KNL_64-NEXT: vpsraq $32, %zmm0, %zmm0
-; KNL_64-NEXT: vmovdqa %xmm1, %xmm1
-; KNL_64-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL_64-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vpgatherqq (%rdi,%zmm0,8), %zmm2 {%k1}
; KNL_64-NEXT: vmovdqa %xmm2, %xmm0
; KNL_64-NEXT: vzeroupper
@@ -1446,10 +1447,11 @@ define <2 x i64> @test25(i64* %base, <2
; KNL_32-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
; KNL_32-NEXT: vpsllq $32, %xmm0, %xmm0
; KNL_32-NEXT: vpsraq $32, %zmm0, %zmm0
-; KNL_32-NEXT: vmovdqa %xmm1, %xmm1
+; KNL_32-NEXT: vpsllq $63, %xmm1, %xmm1
+; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm2 {%k1}
; KNL_32-NEXT: vmovdqa %xmm2, %xmm0
; KNL_32-NEXT: vzeroupper
@@ -1500,10 +1502,8 @@ define <2 x i64> @test26(i64* %base, <2
; KNL_32-NEXT: vpsllq $32, %xmm0, %xmm0
; KNL_32-NEXT: vpsraq $32, %zmm0, %zmm0
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpcmpeqd %xmm2, %xmm2, %xmm2
-; KNL_32-NEXT: vmovdqa %xmm2, %xmm2
-; KNL_32-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k1
+; KNL_32-NEXT: movb $3, %cl
+; KNL_32-NEXT: kmovw %ecx, %k1
; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm1 {%k1}
; KNL_32-NEXT: vmovdqa %xmm1, %xmm0
; KNL_32-NEXT: vzeroupper
@@ -1597,10 +1597,8 @@ define void @test28(<2 x i32>%a1, <2 x i
; KNL_32-NEXT: vpsllq $32, %xmm1, %xmm1
; KNL_32-NEXT: vpsraq $32, %zmm1, %zmm1
; KNL_32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; KNL_32-NEXT: vpcmpeqd %xmm2, %xmm2, %xmm2
-; KNL_32-NEXT: vmovdqa %xmm2, %xmm2
-; KNL_32-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k1
+; KNL_32-NEXT: movb $3, %al
+; KNL_32-NEXT: kmovw %eax, %k1
; KNL_32-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -1686,83 +1684,80 @@ declare <3 x i32> @llvm.masked.gather.v3
define <3 x i32> @test30(<3 x i32*> %base, <3 x i32> %ind, <3 x i1> %mask, <3 x i32> %src0) {
; KNL_64-LABEL: test30:
; KNL_64: # %bb.0:
+; KNL_64-NEXT: # kill: def %xmm3 killed %xmm3 def %zmm3
+; KNL_64-NEXT: vpslld $31, %xmm2, %xmm2
+; KNL_64-NEXT: vptestmd %zmm2, %zmm2, %k1
+; KNL_64-NEXT: kmovw %k1, %eax
; KNL_64-NEXT: vpmovsxdq %xmm1, %ymm1
; KNL_64-NEXT: vpsllq $2, %ymm1, %ymm1
; KNL_64-NEXT: vpaddq %ymm1, %ymm0, %ymm1
-; KNL_64-NEXT: testb $1, %dil
+; KNL_64-NEXT: testb $1, %al
; KNL_64-NEXT: # implicit-def: %xmm0
-; KNL_64-NEXT: jne .LBB31_1
-; KNL_64-NEXT: # %bb.2: # %else
-; KNL_64-NEXT: testb $1, %sil
-; KNL_64-NEXT: jne .LBB31_3
-; KNL_64-NEXT: .LBB31_4: # %else2
-; KNL_64-NEXT: testb $1, %dl
-; KNL_64-NEXT: jne .LBB31_5
-; KNL_64-NEXT: .LBB31_6: # %else5
-; KNL_64-NEXT: vmovd %edi, %xmm1
-; KNL_64-NEXT: vpinsrb $4, %esi, %xmm1, %xmm1
-; KNL_64-NEXT: vpinsrb $8, %edx, %xmm1, %xmm1
-; KNL_64-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_64-NEXT: vblendvps %xmm1, %xmm0, %xmm2, %xmm0
-; KNL_64-NEXT: vzeroupper
-; KNL_64-NEXT: retq
-; KNL_64-NEXT: .LBB31_1: # %cond.load
+; KNL_64-NEXT: je .LBB31_2
+; KNL_64-NEXT: # %bb.1: # %cond.load
; KNL_64-NEXT: vmovq %xmm1, %rax
; KNL_64-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; KNL_64-NEXT: testb $1, %sil
+; KNL_64-NEXT: .LBB31_2: # %else
+; KNL_64-NEXT: kshiftrw $1, %k1, %k0
+; KNL_64-NEXT: kmovw %k0, %eax
+; KNL_64-NEXT: testb $1, %al
; KNL_64-NEXT: je .LBB31_4
-; KNL_64-NEXT: .LBB31_3: # %cond.load1
+; KNL_64-NEXT: # %bb.3: # %cond.load1
; KNL_64-NEXT: vpextrq $1, %xmm1, %rax
; KNL_64-NEXT: vpinsrd $1, (%rax), %xmm0, %xmm0
-; KNL_64-NEXT: testb $1, %dl
+; KNL_64-NEXT: .LBB31_4: # %else2
+; KNL_64-NEXT: kshiftrw $2, %k1, %k0
+; KNL_64-NEXT: kmovw %k0, %eax
+; KNL_64-NEXT: testb $1, %al
; KNL_64-NEXT: je .LBB31_6
-; KNL_64-NEXT: .LBB31_5: # %cond.load4
+; KNL_64-NEXT: # %bb.5: # %cond.load4
; KNL_64-NEXT: vextracti128 $1, %ymm1, %xmm1
; KNL_64-NEXT: vmovq %xmm1, %rax
; KNL_64-NEXT: vpinsrd $2, (%rax), %xmm0, %xmm0
-; KNL_64-NEXT: jmp .LBB31_6
+; KNL_64-NEXT: .LBB31_6: # %else5
+; KNL_64-NEXT: vmovdqa32 %zmm0, %zmm3 {%k1}
+; KNL_64-NEXT: vmovdqa %xmm3, %xmm0
+; KNL_64-NEXT: vzeroupper
+; KNL_64-NEXT: retq
;
; KNL_32-LABEL: test30:
; KNL_32: # %bb.0:
-; KNL_32-NEXT: pushl %esi
-; KNL_32-NEXT: .cfi_def_cfa_offset 8
-; KNL_32-NEXT: .cfi_offset %esi, -8
-; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %edx
+; KNL_32-NEXT: subl $12, %esp
+; KNL_32-NEXT: .cfi_def_cfa_offset 16
+; KNL_32-NEXT: vpslld $31, %xmm2, %xmm2
+; KNL_32-NEXT: vptestmd %zmm2, %zmm2, %k1
+; KNL_32-NEXT: kmovw %k1, %eax
; KNL_32-NEXT: vpslld $2, %xmm1, %xmm1
-; KNL_32-NEXT: vpaddd %xmm1, %xmm0, %xmm1
-; KNL_32-NEXT: testb $1, %dl
-; KNL_32-NEXT: # implicit-def: %xmm0
-; KNL_32-NEXT: jne .LBB31_1
-; KNL_32-NEXT: # %bb.2: # %else
-; KNL_32-NEXT: testb $1, %cl
-; KNL_32-NEXT: jne .LBB31_3
-; KNL_32-NEXT: .LBB31_4: # %else2
+; KNL_32-NEXT: vpaddd %xmm1, %xmm0, %xmm2
+; KNL_32-NEXT: testb $1, %al
+; KNL_32-NEXT: # implicit-def: %xmm1
+; KNL_32-NEXT: je .LBB31_2
+; KNL_32-NEXT: # %bb.1: # %cond.load
+; KNL_32-NEXT: vmovd %xmm2, %eax
+; KNL_32-NEXT: vmovd {{.*#+}} xmm1 = mem[0],zero,zero,zero
+; KNL_32-NEXT: .LBB31_2: # %else
+; KNL_32-NEXT: kshiftrw $1, %k1, %k0
+; KNL_32-NEXT: kmovw %k0, %eax
; KNL_32-NEXT: testb $1, %al
-; KNL_32-NEXT: jne .LBB31_5
-; KNL_32-NEXT: .LBB31_6: # %else5
-; KNL_32-NEXT: vmovd %edx, %xmm1
-; KNL_32-NEXT: vpinsrb $4, %ecx, %xmm1, %xmm1
-; KNL_32-NEXT: vpinsrb $8, %eax, %xmm1, %xmm1
-; KNL_32-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_32-NEXT: vblendvps %xmm1, %xmm0, %xmm2, %xmm0
-; KNL_32-NEXT: popl %esi
-; KNL_32-NEXT: retl
-; KNL_32-NEXT: .LBB31_1: # %cond.load
-; KNL_32-NEXT: vmovd %xmm1, %esi
-; KNL_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; KNL_32-NEXT: testb $1, %cl
; KNL_32-NEXT: je .LBB31_4
-; KNL_32-NEXT: .LBB31_3: # %cond.load1
-; KNL_32-NEXT: vpextrd $1, %xmm1, %esi
-; KNL_32-NEXT: vpinsrd $1, (%esi), %xmm0, %xmm0
+; KNL_32-NEXT: # %bb.3: # %cond.load1
+; KNL_32-NEXT: vpextrd $1, %xmm2, %eax
+; KNL_32-NEXT: vpinsrd $1, (%eax), %xmm1, %xmm1
+; KNL_32-NEXT: .LBB31_4: # %else2
+; KNL_32-NEXT: vmovdqa {{[0-9]+}}(%esp), %xmm0
+; KNL_32-NEXT: kshiftrw $2, %k1, %k0
+; KNL_32-NEXT: kmovw %k0, %eax
; KNL_32-NEXT: testb $1, %al
; KNL_32-NEXT: je .LBB31_6
-; KNL_32-NEXT: .LBB31_5: # %cond.load4
-; KNL_32-NEXT: vpextrd $2, %xmm1, %esi
-; KNL_32-NEXT: vpinsrd $2, (%esi), %xmm0, %xmm0
-; KNL_32-NEXT: jmp .LBB31_6
+; KNL_32-NEXT: # %bb.5: # %cond.load4
+; KNL_32-NEXT: vpextrd $2, %xmm2, %eax
+; KNL_32-NEXT: vpinsrd $2, (%eax), %xmm1, %xmm1
+; KNL_32-NEXT: .LBB31_6: # %else5
+; KNL_32-NEXT: vmovdqa32 %zmm1, %zmm0 {%k1}
+; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; KNL_32-NEXT: addl $12, %esp
+; KNL_32-NEXT: vzeroupper
+; KNL_32-NEXT: retl
;
; SKX-LABEL: test30:
; SKX: # %bb.0:
@@ -2355,11 +2350,9 @@ define <4 x i64> @test_pr28312(<4 x i64*
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
; KNL_64-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_64-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_64-NEXT: vmovdqa %ymm1, %ymm1
-; KNL_64-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_64-NEXT: vptestmq %zmm1, %zmm1, %k1
+; KNL_64-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_64-NEXT: kshiftlw $12, %k0, %k0
+; KNL_64-NEXT: kshiftrw $12, %k0, %k1
; KNL_64-NEXT: vpgatherqq (,%zmm0), %zmm1 {%k1}
; KNL_64-NEXT: vpaddq %ymm1, %ymm1, %ymm0
; KNL_64-NEXT: vpaddq %ymm0, %ymm1, %ymm0
@@ -2376,12 +2369,10 @@ define <4 x i64> @test_pr28312(<4 x i64*
; KNL_32-NEXT: subl $32, %esp
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %ymm0
; KNL_32-NEXT: vpslld $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpsrad $31, %xmm1, %xmm1
-; KNL_32-NEXT: vpmovsxdq %xmm1, %ymm1
-; KNL_32-NEXT: vmovdqa %ymm1, %ymm1
+; KNL_32-NEXT: vptestmd %zmm1, %zmm1, %k0
+; KNL_32-NEXT: kshiftlw $12, %k0, %k0
+; KNL_32-NEXT: kshiftrw $12, %k0, %k1
; KNL_32-NEXT: vpmovsxdq %ymm0, %zmm0
-; KNL_32-NEXT: vpsllq $63, %zmm1, %zmm1
-; KNL_32-NEXT: vptestmq %zmm1, %zmm1, %k1
; KNL_32-NEXT: vpgatherqq (,%zmm0), %zmm1 {%k1}
; KNL_32-NEXT: vpaddq %ymm1, %ymm1, %ymm0
; KNL_32-NEXT: vpaddq %ymm0, %ymm1, %ymm0
@@ -2547,14 +2538,14 @@ define <2 x float> @large_index(float* %
; KNL_64-LABEL: large_index:
; KNL_64: # %bb.0:
; KNL_64-NEXT: # kill: def %xmm1 killed %xmm1 def %ymm1
-; KNL_64-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; KNL_64-NEXT: vmovaps %xmm0, %xmm0
-; KNL_64-NEXT: vmovq %rcx, %xmm2
-; KNL_64-NEXT: vmovq %rsi, %xmm3
-; KNL_64-NEXT: vpunpcklqdq {{.*#+}} xmm2 = xmm3[0],xmm2[0]
-; KNL_64-NEXT: vpslld $31, %ymm0, %ymm0
-; KNL_64-NEXT: vptestmd %zmm0, %zmm0, %k1
-; KNL_64-NEXT: vgatherqps (%rdi,%zmm2,4), %ymm1 {%k1}
+; KNL_64-NEXT: vpsllq $63, %xmm0, %xmm0
+; KNL_64-NEXT: vptestmq %zmm0, %zmm0, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
+; KNL_64-NEXT: vmovq %rcx, %xmm0
+; KNL_64-NEXT: vmovq %rsi, %xmm2
+; KNL_64-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
+; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm1 {%k1}
; KNL_64-NEXT: vmovaps %xmm1, %xmm0
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -2562,16 +2553,16 @@ define <2 x float> @large_index(float* %
; KNL_32-LABEL: large_index:
; KNL_32: # %bb.0:
; KNL_32-NEXT: # kill: def %xmm1 killed %xmm1 def %ymm1
-; KNL_32-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; KNL_32-NEXT: vmovaps %xmm0, %xmm0
+; KNL_32-NEXT: vpsllq $63, %xmm0, %xmm0
+; KNL_32-NEXT: vptestmq %zmm0, %zmm0, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vmovd {{.*#+}} xmm2 = mem[0],zero,zero,zero
-; KNL_32-NEXT: vpinsrd $1, {{[0-9]+}}(%esp), %xmm2, %xmm2
-; KNL_32-NEXT: vpinsrd $2, {{[0-9]+}}(%esp), %xmm2, %xmm2
-; KNL_32-NEXT: vpinsrd $3, {{[0-9]+}}(%esp), %xmm2, %xmm2
-; KNL_32-NEXT: vpslld $31, %ymm0, %ymm0
-; KNL_32-NEXT: vptestmd %zmm0, %zmm0, %k1
-; KNL_32-NEXT: vgatherqps (%eax,%zmm2,4), %ymm1 {%k1}
+; KNL_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; KNL_32-NEXT: vpinsrd $1, {{[0-9]+}}(%esp), %xmm0, %xmm0
+; KNL_32-NEXT: vpinsrd $2, {{[0-9]+}}(%esp), %xmm0, %xmm0
+; KNL_32-NEXT: vpinsrd $3, {{[0-9]+}}(%esp), %xmm0, %xmm0
+; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm1 {%k1}
; KNL_32-NEXT: vmovaps %xmm1, %xmm0
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
@@ -2700,9 +2691,10 @@ define void @test_scatter_2i32_index(<2
; KNL_64-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL_64-NEXT: vpsllq $32, %xmm1, %xmm1
; KNL_64-NEXT: vpsraq $32, %zmm1, %zmm1
-; KNL_64-NEXT: vmovdqa %xmm2, %xmm2
-; KNL_64-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_64-NEXT: vptestmq %zmm2, %zmm2, %k1
+; KNL_64-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_64-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_64-NEXT: kshiftlw $14, %k0, %k0
+; KNL_64-NEXT: kshiftrw $14, %k0, %k1
; KNL_64-NEXT: vscatterqpd %zmm0, (%rdi,%zmm1,8) {%k1}
; KNL_64-NEXT: vzeroupper
; KNL_64-NEXT: retq
@@ -2712,10 +2704,11 @@ define void @test_scatter_2i32_index(<2
; KNL_32-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
; KNL_32-NEXT: vpsllq $32, %xmm1, %xmm1
; KNL_32-NEXT: vpsraq $32, %zmm1, %zmm1
-; KNL_32-NEXT: vmovdqa %xmm2, %xmm2
+; KNL_32-NEXT: vpsllq $63, %xmm2, %xmm2
+; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k0
+; KNL_32-NEXT: kshiftlw $14, %k0, %k0
+; KNL_32-NEXT: kshiftrw $14, %k0, %k1
; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax
-; KNL_32-NEXT: vpsllq $63, %zmm2, %zmm2
-; KNL_32-NEXT: vptestmq %zmm2, %zmm2, %k1
; KNL_32-NEXT: vscatterqpd %zmm0, (%eax,%zmm1,8) {%k1}
; KNL_32-NEXT: vzeroupper
; KNL_32-NEXT: retl
Modified: llvm/trunk/test/CodeGen/X86/masked_memop.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_memop.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/masked_memop.ll (original)
+++ llvm/trunk/test/CodeGen/X86/masked_memop.ll Sun Jan 7 10:20:37 2018
@@ -99,10 +99,15 @@ define <2 x double> @test6(<2 x i64> %tr
;
; AVX512F-LABEL: test6:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vmaskmovpd (%rdi), %xmm0, %xmm2
-; AVX512F-NEXT: vblendvpd %xmm0, %xmm2, %xmm1, %xmm0
+; AVX512F-NEXT: vpcmpeqq %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vblendmpd (%rdi), %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test6:
@@ -127,10 +132,15 @@ define <4 x float> @test7(<4 x i32> %tri
;
; AVX512F-LABEL: test7:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2
-; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
+; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vblendmps (%rdi), %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test7:
@@ -163,10 +173,15 @@ define <4 x i32> @test8(<4 x i32> %trigg
;
; AVX512F-LABEL: test8:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2
-; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
+; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vpblendmd (%rdi), %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test8:
@@ -197,9 +212,14 @@ define void @test9(<4 x i32> %trigger, <
;
; AVX512F-LABEL: test9:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)
+; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vmovdqu32 %zmm1, (%rdi) {%k1}
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test9:
@@ -237,11 +257,14 @@ define <4 x double> @test10(<4 x i32> %t
;
; AVX512F-LABEL: test10:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0
-; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm2
-; AVX512F-NEXT: vblendvpd %ymm0, %ymm2, %ymm1, %ymm0
+; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vblendmpd (%rdi), %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: test10:
@@ -277,10 +300,13 @@ define <4 x double> @test10b(<4 x i32> %
;
; AVX512F-LABEL: test10b:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
-; AVX512F-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
-; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0
-; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm0
+; AVX512F-NEXT: vpcmpeqd %zmm1, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vmovupd (%rdi), %zmm0 {%k1} {z}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: test10b:
@@ -525,11 +551,14 @@ define void @test14(<2 x i32> %trigger,
;
; AVX512F-LABEL: test14:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
-; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; AVX512F-NEXT: vmaskmovps %xmm1, %xmm0, (%rdi)
+; AVX512F-NEXT: vpcmpeqq %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vmovups %zmm1, (%rdi) {%k1}
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test14:
@@ -569,10 +598,12 @@ define void @test15(<2 x i32> %trigger,
; AVX512F: ## %bb.0:
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
-; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; AVX512F-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
-; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)
+; AVX512F-NEXT: vpcmpeqq %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm1[0,2,2,3]
+; AVX512F-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1}
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test15:
@@ -610,12 +641,15 @@ define <2 x float> @test16(<2 x i32> %tr
;
; AVX512F-LABEL: test16:
; AVX512F: ## %bb.0:
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
-; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2
-; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
+; AVX512F-NEXT: vpcmpeqq %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vblendmps (%rdi), %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test16:
@@ -659,12 +693,13 @@ define <2 x i32> @test17(<2 x i32> %trig
; AVX512F: ## %bb.0:
; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
-; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
-; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2
-; AVX512F-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3]
-; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
+; AVX512F-NEXT: vpcmpeqq %zmm2, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm1[0,2,2,3]
+; AVX512F-NEXT: vmovdqu32 (%rdi), %zmm0 {%k1}
; AVX512F-NEXT: vpmovsxdq %xmm0, %xmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test17:
@@ -704,9 +739,12 @@ define <2 x float> @test18(<2 x i32> %tr
; AVX512F: ## %bb.0:
; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
-; AVX512F-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
-; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
-; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm0
+; AVX512F-NEXT: vpcmpeqq %zmm1, %zmm0, %k0
+; AVX512F-NEXT: kshiftlw $14, %k0, %k0
+; AVX512F-NEXT: kshiftrw $14, %k0, %k1
+; AVX512F-NEXT: vmovups (%rdi), %zmm0 {%k1} {z}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test18:
@@ -729,8 +767,11 @@ define <4 x float> @load_all(<4 x i32> %
;
; AVX512F-LABEL: load_all:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vpcmpeqd %xmm0, %xmm0, %xmm0
-; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm0
+; AVX512F-NEXT: movw $15, %ax
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovups (%rdi), %zmm0 {%k1} {z}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: load_all:
@@ -755,9 +796,12 @@ define <4 x float> @mload_constmask_v4f3
;
; AVX512F-LABEL: mload_constmask_v4f32:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovaps {{.*#+}} xmm1 = [4294967295,0,4294967295,4294967295]
-; AVX512F-NEXT: vmaskmovps (%rdi), %xmm1, %xmm2
-; AVX512F-NEXT: vblendvps %xmm1, %xmm2, %xmm0, %xmm0
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: movw $13, %ax
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovups (%rdi), %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4f32:
@@ -789,9 +833,12 @@ define <4 x i32> @mload_constmask_v4i32(
;
; AVX512F-LABEL: mload_constmask_v4i32:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [0,4294967295,4294967295,4294967295]
-; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm1, %xmm2
-; AVX512F-NEXT: vblendvps %xmm1, %xmm2, %xmm0, %xmm0
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: movw $14, %ax
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovdqu32 (%rdi), %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4i32:
@@ -843,9 +890,11 @@ define <4 x double> @mload_constmask_v4f
;
; AVX512F-LABEL: mload_constmask_v4f64:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovapd {{.*#+}} ymm1 = [18446744073709551615,18446744073709551615,18446744073709551615,0]
-; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm1, %ymm2
-; AVX512F-NEXT: vblendvpd %ymm1, %ymm2, %ymm0, %ymm0
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512F-NEXT: movb $7, %al
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovupd (%rdi), %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4f64:
@@ -898,9 +947,11 @@ define <4 x i64> @mload_constmask_v4i64(
;
; AVX512F-LABEL: mload_constmask_v4i64:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovdqa {{.*#+}} ymm1 = [18446744073709551615,0,0,18446744073709551615]
-; AVX512F-NEXT: vpmaskmovq (%rdi), %ymm1, %ymm2
-; AVX512F-NEXT: vblendvpd %ymm1, %ymm2, %ymm0, %ymm0
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512F-NEXT: movb $9, %al
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovdqu64 (%rdi), %zmm0 {%k1}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4i64:
@@ -950,8 +1001,10 @@ define <4 x double> @mload_constmask_v4f
;
; AVX512F-LABEL: mload_constmask_v4f64_undef_passthrough:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovapd {{.*#+}} ymm0 = [18446744073709551615,18446744073709551615,18446744073709551615,0]
-; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm0
+; AVX512F-NEXT: movb $7, %al
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovupd (%rdi), %zmm0 {%k1} {z}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4f64_undef_passthrough:
@@ -979,8 +1032,10 @@ define <4 x i64> @mload_constmask_v4i64_
;
; AVX512F-LABEL: mload_constmask_v4i64_undef_passthrough:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmovdqa {{.*#+}} ymm0 = [0,18446744073709551615,18446744073709551615,0]
-; AVX512F-NEXT: vpmaskmovq (%rdi), %ymm0, %ymm0
+; AVX512F-NEXT: movb $6, %al
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovdqu64 (%rdi), %zmm0 {%k1} {z}
+; AVX512F-NEXT: ## kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; SKX-LABEL: mload_constmask_v4i64_undef_passthrough:
@@ -1008,8 +1063,11 @@ define void @test21(<4 x i32> %trigger,
;
; AVX512F-LABEL: test21:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vpcmpeqd %xmm0, %xmm0, %xmm0
-; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)
+; AVX512F-NEXT: ## kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: movw $15, %ax
+; AVX512F-NEXT: kmovw %eax, %k1
+; AVX512F-NEXT: vmovdqu32 %zmm1, (%rdi) {%k1}
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: test21:
@@ -1225,7 +1283,14 @@ define void @trunc_mask(<4 x float> %x,
;
; AVX512F-LABEL: trunc_mask:
; AVX512F: ## %bb.0:
-; AVX512F-NEXT: vmaskmovps %xmm0, %xmm2, (%rdi)
+; AVX512F-NEXT: ## kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: ## kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
+; AVX512F-NEXT: vpcmpgtd %zmm2, %zmm1, %k0
+; AVX512F-NEXT: kshiftlw $12, %k0, %k0
+; AVX512F-NEXT: kshiftrw $12, %k0, %k1
+; AVX512F-NEXT: vmovups %zmm0, (%rdi) {%k1}
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; SKX-LABEL: trunc_mask:
Modified: llvm/trunk/test/CodeGen/X86/pr33349.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pr33349.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/pr33349.ll (original)
+++ llvm/trunk/test/CodeGen/X86/pr33349.ll Sun Jan 7 10:20:37 2018
@@ -8,32 +8,38 @@ target triple = "x86_64-unknown-linux-gn
define void @test(<4 x i1> %m, <4 x x86_fp80> %v, <4 x x86_fp80>*%p) local_unnamed_addr {
; KNL-LABEL: test:
; KNL: # %bb.0: # %bb
-; KNL-NEXT: vpextrb $0, %xmm0, %eax
+; KNL-NEXT: vpslld $31, %xmm0, %xmm0
+; KNL-NEXT: vptestmd %zmm0, %zmm0, %k0
+; KNL-NEXT: kshiftrw $1, %k0, %k1
+; KNL-NEXT: kmovw %k1, %eax
+; KNL-NEXT: kshiftrw $2, %k0, %k1
+; KNL-NEXT: kshiftrw $1, %k1, %k2
+; KNL-NEXT: kmovw %k1, %ecx
; KNL-NEXT: testb $1, %al
; KNL-NEXT: fld1
; KNL-NEXT: fldz
; KNL-NEXT: fld %st(0)
; KNL-NEXT: fcmovne %st(2), %st(0)
-; KNL-NEXT: vpextrb $4, %xmm0, %eax
-; KNL-NEXT: testb $1, %al
+; KNL-NEXT: testb $1, %cl
; KNL-NEXT: fld %st(1)
; KNL-NEXT: fcmovne %st(3), %st(0)
-; KNL-NEXT: vpextrb $8, %xmm0, %eax
+; KNL-NEXT: kmovw %k2, %eax
; KNL-NEXT: testb $1, %al
; KNL-NEXT: fld %st(2)
; KNL-NEXT: fcmovne %st(4), %st(0)
-; KNL-NEXT: vpextrb $12, %xmm0, %eax
+; KNL-NEXT: kmovw %k0, %eax
; KNL-NEXT: testb $1, %al
; KNL-NEXT: fxch %st(3)
; KNL-NEXT: fcmovne %st(4), %st(0)
; KNL-NEXT: fstp %st(4)
; KNL-NEXT: fxch %st(3)
+; KNL-NEXT: fstpt (%rdi)
+; KNL-NEXT: fxch %st(1)
; KNL-NEXT: fstpt 30(%rdi)
; KNL-NEXT: fxch %st(1)
; KNL-NEXT: fstpt 20(%rdi)
-; KNL-NEXT: fxch %st(1)
; KNL-NEXT: fstpt 10(%rdi)
-; KNL-NEXT: fstpt (%rdi)
+; KNL-NEXT: vzeroupper
; KNL-NEXT: retq
;
; SKX-LABEL: test:
Modified: llvm/trunk/test/CodeGen/X86/sse-fsignum.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse-fsignum.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/sse-fsignum.ll (original)
+++ llvm/trunk/test/CodeGen/X86/sse-fsignum.ll Sun Jan 7 10:20:37 2018
@@ -10,17 +10,44 @@
;
define void @signum32a(<4 x float>*) {
-; AVX-LABEL: signum32a:
-; AVX: # %bb.0: # %entry
-; AVX-NEXT: vmovaps (%rdi), %xmm0
-; AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1
-; AVX-NEXT: vcmpltps %xmm1, %xmm0, %xmm2
-; AVX-NEXT: vcvtdq2ps %xmm2, %xmm2
-; AVX-NEXT: vcmpltps %xmm0, %xmm1, %xmm0
-; AVX-NEXT: vcvtdq2ps %xmm0, %xmm0
-; AVX-NEXT: vsubps %xmm0, %xmm2, %xmm0
-; AVX-NEXT: vmovaps %xmm0, (%rdi)
-; AVX-NEXT: retq
+; AVX1-LABEL: signum32a:
+; AVX1: # %bb.0: # %entry
+; AVX1-NEXT: vmovaps (%rdi), %xmm0
+; AVX1-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX1-NEXT: vcmpltps %xmm1, %xmm0, %xmm2
+; AVX1-NEXT: vcvtdq2ps %xmm2, %xmm2
+; AVX1-NEXT: vcmpltps %xmm0, %xmm1, %xmm0
+; AVX1-NEXT: vcvtdq2ps %xmm0, %xmm0
+; AVX1-NEXT: vsubps %xmm0, %xmm2, %xmm0
+; AVX1-NEXT: vmovaps %xmm0, (%rdi)
+; AVX1-NEXT: retq
+;
+; AVX2-LABEL: signum32a:
+; AVX2: # %bb.0: # %entry
+; AVX2-NEXT: vmovaps (%rdi), %xmm0
+; AVX2-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX2-NEXT: vcmpltps %xmm1, %xmm0, %xmm2
+; AVX2-NEXT: vcvtdq2ps %xmm2, %xmm2
+; AVX2-NEXT: vcmpltps %xmm0, %xmm1, %xmm0
+; AVX2-NEXT: vcvtdq2ps %xmm0, %xmm0
+; AVX2-NEXT: vsubps %xmm0, %xmm2, %xmm0
+; AVX2-NEXT: vmovaps %xmm0, (%rdi)
+; AVX2-NEXT: retq
+;
+; AVX512F-LABEL: signum32a:
+; AVX512F: # %bb.0: # %entry
+; AVX512F-NEXT: vmovaps (%rdi), %xmm0
+; AVX512F-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX512F-NEXT: vcmpltps %zmm1, %zmm0, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
+; AVX512F-NEXT: vcvtdq2ps %xmm2, %xmm2
+; AVX512F-NEXT: vcmpltps %zmm0, %zmm1, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: vcvtdq2ps %xmm0, %xmm0
+; AVX512F-NEXT: vsubps %xmm0, %xmm2, %xmm0
+; AVX512F-NEXT: vmovaps %xmm0, (%rdi)
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
entry:
%1 = load <4 x float>, <4 x float>* %0
%2 = fcmp olt <4 x float> %1, zeroinitializer
@@ -33,19 +60,48 @@ entry:
}
define void @signum64a(<2 x double>*) {
-; AVX-LABEL: signum64a:
-; AVX: # %bb.0: # %entry
-; AVX-NEXT: vmovapd (%rdi), %xmm0
-; AVX-NEXT: vxorpd %xmm1, %xmm1, %xmm1
-; AVX-NEXT: vcmpltpd %xmm1, %xmm0, %xmm2
-; AVX-NEXT: vpermilps {{.*#+}} xmm2 = xmm2[0,2,2,3]
-; AVX-NEXT: vcvtdq2pd %xmm2, %xmm2
-; AVX-NEXT: vcmpltpd %xmm0, %xmm1, %xmm0
-; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3]
-; AVX-NEXT: vcvtdq2pd %xmm0, %xmm0
-; AVX-NEXT: vsubpd %xmm0, %xmm2, %xmm0
-; AVX-NEXT: vmovapd %xmm0, (%rdi)
-; AVX-NEXT: retq
+; AVX1-LABEL: signum64a:
+; AVX1: # %bb.0: # %entry
+; AVX1-NEXT: vmovapd (%rdi), %xmm0
+; AVX1-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; AVX1-NEXT: vcmpltpd %xmm1, %xmm0, %xmm2
+; AVX1-NEXT: vpermilps {{.*#+}} xmm2 = xmm2[0,2,2,3]
+; AVX1-NEXT: vcvtdq2pd %xmm2, %xmm2
+; AVX1-NEXT: vcmpltpd %xmm0, %xmm1, %xmm0
+; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3]
+; AVX1-NEXT: vcvtdq2pd %xmm0, %xmm0
+; AVX1-NEXT: vsubpd %xmm0, %xmm2, %xmm0
+; AVX1-NEXT: vmovapd %xmm0, (%rdi)
+; AVX1-NEXT: retq
+;
+; AVX2-LABEL: signum64a:
+; AVX2: # %bb.0: # %entry
+; AVX2-NEXT: vmovapd (%rdi), %xmm0
+; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; AVX2-NEXT: vcmpltpd %xmm1, %xmm0, %xmm2
+; AVX2-NEXT: vpermilps {{.*#+}} xmm2 = xmm2[0,2,2,3]
+; AVX2-NEXT: vcvtdq2pd %xmm2, %xmm2
+; AVX2-NEXT: vcmpltpd %xmm0, %xmm1, %xmm0
+; AVX2-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3]
+; AVX2-NEXT: vcvtdq2pd %xmm0, %xmm0
+; AVX2-NEXT: vsubpd %xmm0, %xmm2, %xmm0
+; AVX2-NEXT: vmovapd %xmm0, (%rdi)
+; AVX2-NEXT: retq
+;
+; AVX512F-LABEL: signum64a:
+; AVX512F: # %bb.0: # %entry
+; AVX512F-NEXT: vmovapd (%rdi), %xmm0
+; AVX512F-NEXT: vxorpd %xmm1, %xmm1, %xmm1
+; AVX512F-NEXT: vcmpltpd %zmm1, %zmm0, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
+; AVX512F-NEXT: vcvtdq2pd %xmm2, %xmm2
+; AVX512F-NEXT: vcmpltpd %zmm0, %zmm1, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: vcvtdq2pd %xmm0, %xmm0
+; AVX512F-NEXT: vsubpd %xmm0, %xmm2, %xmm0
+; AVX512F-NEXT: vmovapd %xmm0, (%rdi)
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
entry:
%1 = load <2 x double>, <2 x double>* %0
%2 = fcmp olt <2 x double> %1, zeroinitializer
@@ -152,11 +208,11 @@ define void @signum64b(<4 x double>*) {
; AVX512F: # %bb.0: # %entry
; AVX512F-NEXT: vmovapd (%rdi), %ymm0
; AVX512F-NEXT: vxorpd %xmm1, %xmm1, %xmm1
-; AVX512F-NEXT: vcmpltpd %ymm1, %ymm0, %ymm2
-; AVX512F-NEXT: vpmovqd %zmm2, %ymm2
+; AVX512F-NEXT: vcmpltpd %zmm1, %zmm0, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm2, %zmm2, %zmm2 {%k1} {z}
; AVX512F-NEXT: vcvtdq2pd %xmm2, %ymm2
-; AVX512F-NEXT: vcmpltpd %ymm0, %ymm1, %ymm0
-; AVX512F-NEXT: vpmovqd %zmm0, %ymm0
+; AVX512F-NEXT: vcmpltpd %zmm0, %zmm1, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
; AVX512F-NEXT: vcvtdq2pd %xmm0, %ymm0
; AVX512F-NEXT: vsubpd %ymm0, %ymm2, %ymm0
; AVX512F-NEXT: vmovapd %ymm0, (%rdi)
Modified: llvm/trunk/test/CodeGen/X86/vector-shuffle-v1.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-shuffle-v1.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/vector-shuffle-v1.ll (original)
+++ llvm/trunk/test/CodeGen/X86/vector-shuffle-v1.ll Sun Jan 7 10:20:37 2018
@@ -6,7 +6,14 @@
define <2 x i1> @shuf2i1_1_0(<2 x i1> %a) {
; AVX512F-LABEL: shuf2i1_1_0:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[2,3,0,1]
+; AVX512F-NEXT: vpsllq $63, %xmm0, %xmm0
+; AVX512F-NEXT: vptestmq %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]
+; AVX512F-NEXT: vptestmq %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: shuf2i1_1_0:
@@ -36,9 +43,16 @@ define <2 x i1> @shuf2i1_1_0(<2 x i1> %a
define <2 x i1> @shuf2i1_1_2(<2 x i1> %a) {
; AVX512F-LABEL: shuf2i1_1_2:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: movl $1, %eax
+; AVX512F-NEXT: vpsllq $63, %xmm0, %xmm0
+; AVX512F-NEXT: vptestmq %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: movq $-1, %rax
; AVX512F-NEXT: vmovq %rax, %xmm1
; AVX512F-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[8,9,10,11,12,13,14,15],xmm1[0,1,2,3,4,5,6,7]
+; AVX512F-NEXT: vptestmq %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogq $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: shuf2i1_1_2:
@@ -73,7 +87,14 @@ define <2 x i1> @shuf2i1_1_2(<2 x i1> %a
define <4 x i1> @shuf4i1_3_2_10(<4 x i1> %a) {
; AVX512F-LABEL: shuf4i1_3_2_10:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,2,1,0]
+; AVX512F-NEXT: vpslld $31, %xmm0, %xmm0
+; AVX512F-NEXT: vptestmd %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[3,2,1,0]
+; AVX512F-NEXT: vptestmd %zmm0, %zmm0, %k1
+; AVX512F-NEXT: vpternlogd $255, %zmm0, %zmm0, %zmm0 {%k1} {z}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: shuf4i1_3_2_10:
Modified: llvm/trunk/test/CodeGen/X86/vselect-pcmp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vselect-pcmp.ll?rev=321967&r1=321966&r2=321967&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/vselect-pcmp.ll (original)
+++ llvm/trunk/test/CodeGen/X86/vselect-pcmp.ll Sun Jan 7 10:20:37 2018
@@ -43,10 +43,22 @@ define <8 x i16> @signbit_sel_v8i16(<8 x
}
define <4 x i32> @signbit_sel_v4i32(<4 x i32> %x, <4 x i32> %y, <4 x i32> %mask) {
-; AVX12F-LABEL: signbit_sel_v4i32:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v4i32:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v4i32:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtd %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vpblendmd %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4i32:
; AVX512VL: # %bb.0:
@@ -60,10 +72,22 @@ define <4 x i32> @signbit_sel_v4i32(<4 x
}
define <2 x i64> @signbit_sel_v2i64(<2 x i64> %x, <2 x i64> %y, <2 x i64> %mask) {
-; AVX12F-LABEL: signbit_sel_v2i64:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v2i64:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v2i64:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtq %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v2i64:
; AVX512VL: # %bb.0:
@@ -77,10 +101,22 @@ define <2 x i64> @signbit_sel_v2i64(<2 x
}
define <4 x float> @signbit_sel_v4f32(<4 x float> %x, <4 x float> %y, <4 x i32> %mask) {
-; AVX12F-LABEL: signbit_sel_v4f32:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v4f32:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v4f32:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtd %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vblendmps %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4f32:
; AVX512VL: # %bb.0:
@@ -94,10 +130,22 @@ define <4 x float> @signbit_sel_v4f32(<4
}
define <2 x double> @signbit_sel_v2f64(<2 x double> %x, <2 x double> %y, <2 x i64> %mask) {
-; AVX12F-LABEL: signbit_sel_v2f64:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v2f64:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v2f64:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtq %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v2f64:
; AVX512VL: # %bb.0:
@@ -203,10 +251,21 @@ define <8 x i32> @signbit_sel_v8i32(<8 x
}
define <4 x i64> @signbit_sel_v4i64(<4 x i64> %x, <4 x i64> %y, <4 x i64> %mask) {
-; AVX12F-LABEL: signbit_sel_v4i64:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v4i64:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v4i64:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; AVX512F-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtq %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vpblendmq %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4i64:
; AVX512VL: # %bb.0:
@@ -220,10 +279,21 @@ define <4 x i64> @signbit_sel_v4i64(<4 x
}
define <4 x double> @signbit_sel_v4f64(<4 x double> %x, <4 x double> %y, <4 x i64> %mask) {
-; AVX12F-LABEL: signbit_sel_v4f64:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v4f64:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v4f64:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %ymm2 killed %ymm2 def %zmm2
+; AVX512F-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtq %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4f64:
; AVX512VL: # %bb.0:
@@ -256,8 +326,13 @@ define <4 x double> @signbit_sel_v4f64_s
;
; AVX512F-LABEL: signbit_sel_v4f64_small_mask:
; AVX512F: # %bb.0:
-; AVX512F-NEXT: vpmovsxdq %xmm2, %ymm2
-; AVX512F-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0
+; AVX512F-NEXT: # kill: def %xmm2 killed %xmm2 def %zmm2
+; AVX512F-NEXT: # kill: def %ymm1 killed %ymm1 def %zmm1
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 def %zmm0
+; AVX512F-NEXT: vpxor %xmm3, %xmm3, %xmm3
+; AVX512F-NEXT: vpcmpgtd %zmm2, %zmm3, %k1
+; AVX512F-NEXT: vblendmpd %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %ymm0 killed %ymm0 killed %zmm0
; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4f64_small_mask:
@@ -296,12 +371,23 @@ define <8 x double> @signbit_sel_v8f64(<
; (2) FIXME: If we don't care about signed-zero (and NaN?), the compare should still get folded.
define <4 x float> @signbit_sel_v4f32_fcmp(<4 x float> %x, <4 x float> %y, <4 x float> %mask) #0 {
-; AVX12F-LABEL: signbit_sel_v4f32_fcmp:
-; AVX12F: # %bb.0:
-; AVX12F-NEXT: vxorps %xmm2, %xmm2, %xmm2
-; AVX12F-NEXT: vcmpltps %xmm2, %xmm0, %xmm2
-; AVX12F-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
-; AVX12F-NEXT: retq
+; AVX12-LABEL: signbit_sel_v4f32_fcmp:
+; AVX12: # %bb.0:
+; AVX12-NEXT: vxorps %xmm2, %xmm2, %xmm2
+; AVX12-NEXT: vcmpltps %xmm2, %xmm0, %xmm2
+; AVX12-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; AVX12-NEXT: retq
+;
+; AVX512F-LABEL: signbit_sel_v4f32_fcmp:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: # kill: def %xmm1 killed %xmm1 def %zmm1
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 def %zmm0
+; AVX512F-NEXT: vxorps %xmm2, %xmm2, %xmm2
+; AVX512F-NEXT: vcmpltps %zmm2, %zmm0, %k1
+; AVX512F-NEXT: vblendmps %zmm0, %zmm1, %zmm0 {%k1}
+; AVX512F-NEXT: # kill: def %xmm0 killed %xmm0 killed %zmm0
+; AVX512F-NEXT: vzeroupper
+; AVX512F-NEXT: retq
;
; AVX512VL-LABEL: signbit_sel_v4f32_fcmp:
; AVX512VL: # %bb.0:
More information about the llvm-commits
mailing list