<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Simon,<div class=""><br class=""></div><div class="">I think this commit is causing timeouts on our LTO builds on green dragon:</div><div class=""><a href="http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/" class="">http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/</a></div><div class=""><br class=""></div><div class="">A build usually takes 45 min, and we kill any build that runs for more than 90 min.</div><div class=""><br class=""></div><div class="">—Juergen</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 27, 2016, at 7:29 AM, Simon Pilgrim via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">Author: rksimon<br class="">Date: Thu Oct 27 09:29:28 2016<br class="">New Revision: 285296<br class=""><br class="">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=285296&view=rev" class="">http://llvm.org/viewvc/llvm-project?rev=285296&view=rev</a><br class="">Log:<br class="">[DAGCombiner] Add vector demanded elements support to computeKnownBits<br class=""><br class="">Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.<br class=""><br class="">This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. 
Scalar types set this to 1.<br class=""><br class="">The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.<br class=""><br class="">I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.<br class=""><br class="">DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.<br class=""><br class="">Differential Revision: <a href="https://reviews.llvm.org/D25691" class="">https://reviews.llvm.org/D25691</a><br class=""><br class="">Modified:<br class=""> llvm/trunk/include/llvm/CodeGen/SelectionDAG.h<br class=""> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp<br class=""> llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll<br class=""> llvm/trunk/test/CodeGen/X86/known-bits-vector.ll<br class=""><br class="">Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=285296&r1=285295&r2=285296&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=285296&r1=285295&r2=285296&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original)<br class="">+++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Thu Oct 27 09:29:28 2016<br class="">@@ -1256,12 +1256,22 @@ public:<br class=""> const;<br class=""><br class=""> /// Determine which bits of Op are known to be either zero or one and return<br class="">- /// them in the KnownZero/KnownOne bitsets. 
Targets can implement the<br class="">- /// computeKnownBitsForTargetNode method in the TargetLowering class to allow<br class="">- /// target nodes to be understood.<br class="">+ /// them in the KnownZero/KnownOne bitsets. For vectors, the known bits are<br class="">+ /// those that are shared by every vector element.<br class="">+ /// Targets can implement the computeKnownBitsForTargetNode method in the<br class="">+ /// TargetLowering class to allow target nodes to be understood.<br class=""> void computeKnownBits(SDValue Op, APInt &KnownZero, APInt &KnownOne,<br class=""> unsigned Depth = 0) const;<br class=""><br class="">+ /// Determine which bits of Op are known to be either zero or one and return<br class="">+ /// them in the KnownZero/KnownOne bitsets. The DemandedElts argument allows<br class="">+ /// us to only collect the known bits that are shared by the requested vector<br class="">+ /// elements.<br class="">+ /// Targets can implement the computeKnownBitsForTargetNode method in the<br class="">+ /// TargetLowering class to allow target nodes to be understood.<br class="">+ void computeKnownBits(SDValue Op, APInt &KnownZero, APInt &KnownOne,<br class="">+ const APInt &DemandedElts, unsigned Depth = 0) const;<br class="">+<br class=""> /// Test if the given value is known to have exactly one bit set. 
This differs<br class=""> /// from computeKnownBits in that it doesn't necessarily determine which bit<br class=""> /// is set.<br class=""><br class="">Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=285296&r1=285295&r2=285296&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=285296&r1=285295&r2=285296&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)<br class="">+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Thu Oct 27 09:29:28 2016<br class="">@@ -2007,9 +2007,26 @@ static const APInt *getValidShiftAmountC<br class=""> }<br class=""><br class=""> /// Determine which bits of Op are known to be either zero or one and return<br class="">-/// them in the KnownZero/KnownOne bitsets.<br class="">+/// them in the KnownZero/KnownOne bitsets. For vectors, the known bits are<br class="">+/// those that are shared by every vector element.<br class=""> void SelectionDAG::computeKnownBits(SDValue Op, APInt &KnownZero,<br class=""> APInt &KnownOne, unsigned Depth) const {<br class="">+ EVT VT = Op.getValueType();<br class="">+ APInt DemandedElts = VT.isVector()<br class="">+ ? APInt::getAllOnesValue(VT.getVectorNumElements())<br class="">+ : APInt(1, 1);<br class="">+ computeKnownBits(Op, KnownZero, KnownOne, DemandedElts, Depth);<br class="">+}<br class="">+<br class="">+/// Determine which bits of Op are known to be either zero or one and return<br class="">+/// them in the KnownZero/KnownOne bitsets. 
The DemandedElts argument allows<br class="">+/// us to only collect the known bits that are shared by the requested vector<br class="">+/// elements.<br class="">+/// TODO: We only support DemandedElts on a few opcodes so far, the remainder<br class="">+/// should be added when they become necessary.<br class="">+void SelectionDAG::computeKnownBits(SDValue Op, APInt &KnownZero,<br class="">+ APInt &KnownOne, const APInt &DemandedElts,<br class="">+ unsigned Depth) const {<br class=""> unsigned BitWidth = Op.getScalarValueSizeInBits();<br class=""><br class=""> KnownZero = KnownOne = APInt(BitWidth, 0); // Don't know anything.<br class="">@@ -2017,6 +2034,10 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> return; // Limit search depth.<br class=""><br class=""> APInt KnownZero2, KnownOne2;<br class="">+ unsigned NumElts = DemandedElts.getBitWidth();<br class="">+<br class="">+ if (DemandedElts == APInt(NumElts, 0))<br class="">+ return; // No demanded elts, better to assume we don't know anything.<br class=""><br class=""> switch (Op.getOpcode()) {<br class=""> case ISD::Constant:<br class="">@@ -2025,9 +2046,15 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> KnownZero = ~KnownOne;<br class=""> break;<br class=""> case ISD::BUILD_VECTOR:<br class="">- // Collect the known bits that are shared by every vector element.<br class="">+ // Collect the known bits that are shared by every demanded vector element.<br class="">+ assert(NumElts == Op.getValueType().getVectorNumElements() &&<br class="">+ "Unexpected vector size");<br class=""> KnownZero = KnownOne = APInt::getAllOnesValue(BitWidth);<br class="">- for (SDValue SrcOp : Op->ops()) {<br class="">+ for (unsigned i = 0, e = Op.getNumOperands(); i != e; ++i) {<br class="">+ if (!DemandedElts[i])<br class="">+ continue;<br class="">+<br class="">+ SDValue SrcOp = Op.getOperand(i);<br class=""> computeKnownBits(SrcOp, KnownZero2, KnownOne2, Depth + 1);<br class=""><br class=""> // BUILD_VECTOR can 
implicitly truncate sources, we must handle this.<br class="">@@ -2038,16 +2065,72 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> KnownZero2 = KnownZero2.trunc(BitWidth);<br class=""> }<br class=""><br class="">- // Known bits are the values that are shared by every element.<br class="">- // TODO: support per-element known bits.<br class="">+ // Known bits are the values that are shared by every demanded element.<br class="">+ KnownOne &= KnownOne2;<br class="">+ KnownZero &= KnownZero2;<br class="">+ }<br class="">+ break;<br class="">+ case ISD::VECTOR_SHUFFLE: {<br class="">+ // Collect the known bits that are shared by every vector element referenced<br class="">+ // by the shuffle.<br class="">+ APInt DemandedLHS(NumElts, 0), DemandedRHS(NumElts, 0);<br class="">+ KnownZero = KnownOne = APInt::getAllOnesValue(BitWidth);<br class="">+ const ShuffleVectorSDNode *SVN = cast<ShuffleVectorSDNode>(Op);<br class="">+ assert(NumElts == SVN->getMask().size() && "Unexpected vector size");<br class="">+ for (unsigned i = 0; i != NumElts; ++i) {<br class="">+ int M = SVN->getMaskElt(i);<br class="">+ if (M < 0) {<br class="">+ // For UNDEF elements, we don't know anything about the common state of<br class="">+ // the shuffle result.<br class="">+ // FIXME: Is this too pessimistic?<br class="">+ KnownZero = KnownOne = APInt(BitWidth, 0);<br class="">+ break;<br class="">+ }<br class="">+ if (!DemandedElts[i])<br class="">+ continue;<br class="">+<br class="">+ if ((unsigned)M < NumElts)<br class="">+ DemandedLHS.setBit((unsigned)M % NumElts);<br class="">+ else<br class="">+ DemandedRHS.setBit((unsigned)M % NumElts);<br class="">+ }<br class="">+ // Known bits are the values that are shared by every demanded element.<br class="">+ if (DemandedLHS != APInt(NumElts, 0)) {<br class="">+ SDValue LHS = Op.getOperand(0);<br class="">+ computeKnownBits(LHS, KnownZero2, KnownOne2, DemandedLHS, Depth + 1);<br class="">+ KnownOne &= KnownOne2;<br class="">+ KnownZero &= 
KnownZero2;<br class="">+ }<br class="">+ if (DemandedRHS != APInt(NumElts, 0)) {<br class="">+ SDValue RHS = Op.getOperand(1);<br class="">+ computeKnownBits(RHS, KnownZero2, KnownOne2, DemandedRHS, Depth + 1);<br class=""> KnownOne &= KnownOne2;<br class=""> KnownZero &= KnownZero2;<br class=""> }<br class=""> break;<br class="">+ }<br class="">+ case ISD::EXTRACT_SUBVECTOR: {<br class="">+ // If we know the element index, just demand that subvector elements,<br class="">+ // otherwise demand them all.<br class="">+ SDValue Src = Op.getOperand(0);<br class="">+ ConstantSDNode *SubIdx = dyn_cast<ConstantSDNode>(Op.getOperand(1));<br class="">+ unsigned NumSrcElts = Src.getValueType().getVectorNumElements();<br class="">+ if (SubIdx && SubIdx->getAPIntValue().ule(NumSrcElts - NumElts)) {<br class="">+ // Offset the demanded elts by the subvector index.<br class="">+ uint64_t Idx = SubIdx->getZExtValue();<br class="">+ APInt DemandedSrc = DemandedElts.zext(NumSrcElts).shl(Idx);<br class="">+ computeKnownBits(Src, KnownZero, KnownOne, DemandedSrc, Depth + 1);<br class="">+ } else {<br class="">+ computeKnownBits(Src, KnownZero, KnownOne, Depth + 1);<br class="">+ }<br class="">+ break;<br class="">+ }<br class=""> case ISD::AND:<br class=""> // If either the LHS or the RHS are Zero, the result is zero.<br class="">- computeKnownBits(Op.getOperand(1), KnownZero, KnownOne, Depth+1);<br class="">- computeKnownBits(Op.getOperand(0), KnownZero2, KnownOne2, Depth+1);<br class="">+ computeKnownBits(Op.getOperand(1), KnownZero, KnownOne, DemandedElts,<br class="">+ Depth + 1);<br class="">+ computeKnownBits(Op.getOperand(0), KnownZero2, KnownOne2, DemandedElts,<br class="">+ Depth + 1);<br class=""><br class=""> // Output known-1 bits are only known if set in both the LHS & RHS.<br class=""> KnownOne &= KnownOne2;<br class="">@@ -2206,7 +2289,8 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> if (NewBits.getBoolValue())<br class=""> InputDemandedBits |= InSignBit;<br 
class=""><br class="">- computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);<br class="">+ computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,<br class="">+ Depth + 1);<br class=""> KnownOne &= InputDemandedBits;<br class=""> KnownZero &= InputDemandedBits;<br class=""><br class="">@@ -2253,7 +2337,8 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> APInt NewBits = APInt::getHighBitsSet(BitWidth, BitWidth - InBits);<br class=""> KnownZero = KnownZero.trunc(InBits);<br class=""> KnownOne = KnownOne.trunc(InBits);<br class="">- computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);<br class="">+ computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,<br class="">+ Depth + 1);<br class=""> KnownZero = KnownZero.zext(BitWidth);<br class=""> KnownOne = KnownOne.zext(BitWidth);<br class=""> KnownZero |= NewBits;<br class="">@@ -2266,7 +2351,8 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""><br class=""> KnownZero = KnownZero.trunc(InBits);<br class=""> KnownOne = KnownOne.trunc(InBits);<br class="">- computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);<br class="">+ computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,<br class="">+ Depth + 1);<br class=""><br class=""> // Note if the sign bit is known to be zero or one.<br class=""> bool SignBitKnownZero = KnownZero.isNegative();<br class="">@@ -2439,16 +2525,28 @@ void SelectionDAG::computeKnownBits(SDVa<br class=""> // At the moment we keep this simple and skip tracking the specific<br class=""> // element. 
This way we get the lowest common denominator for all elements<br class=""> // of the vector.<br class="">- // TODO: get information for given vector element<br class="">+ SDValue InVec = Op.getOperand(0);<br class="">+ SDValue EltNo = Op.getOperand(1);<br class="">+ EVT VecVT = InVec.getValueType();<br class=""> const unsigned BitWidth = Op.getValueSizeInBits();<br class="">- const unsigned EltBitWidth = Op.getOperand(0).getScalarValueSizeInBits();<br class="">+ const unsigned EltBitWidth = VecVT.getScalarSizeInBits();<br class="">+ const unsigned NumSrcElts = VecVT.getVectorNumElements();<br class=""> // If BitWidth > EltBitWidth the value is anyext:ed. So we do not know<br class=""> // anything about the extended bits.<br class=""> if (BitWidth > EltBitWidth) {<br class=""> KnownZero = KnownZero.trunc(EltBitWidth);<br class=""> KnownOne = KnownOne.trunc(EltBitWidth);<br class=""> }<br class="">- computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);<br class="">+ ConstantSDNode *ConstEltNo = dyn_cast<ConstantSDNode>(EltNo);<br class="">+ if (ConstEltNo && ConstEltNo->getAPIntValue().ult(NumSrcElts)) {<br class="">+ // If we know the element index, just demand that vector element.<br class="">+ unsigned Idx = ConstEltNo->getZExtValue();<br class="">+ APInt DemandedElt = APInt::getOneBitSet(NumSrcElts, Idx);<br class="">+ computeKnownBits(InVec, KnownZero, KnownOne, DemandedElt, Depth + 1);<br class="">+ } else {<br class="">+ // Unknown element index, so ignore DemandedElts and demand them all.<br class="">+ computeKnownBits(InVec, KnownZero, KnownOne, Depth + 1);<br class="">+ }<br class=""> if (BitWidth > EltBitWidth) {<br class=""> KnownZero = KnownZero.zext(BitWidth);<br class=""> KnownOne = KnownOne.zext(BitWidth);<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll?rev=285296&r1=285295&r2=285296&view=diff" 
class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll?rev=285296&r1=285295&r2=285296&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll (original)<br class="">+++ llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll Thu Oct 27 09:29:28 2016<br class="">@@ -465,11 +465,9 @@ define <4 x i64> @shuffle_v4i64_67zz(<4<br class=""> ; AVX1-LABEL: shuffle_v4i64_67zz:<br class=""> ; AVX1: ## BB#0:<br class=""> ; AVX1-NEXT: vperm2f128 {{.*#+}} ymm0 = ymm0[2,3],zero,zero<br class="">-; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2<br class="">-; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3<br class="">-; AVX1-NEXT: vpaddq %xmm2, %xmm3, %xmm2<br class=""> ; AVX1-NEXT: vpaddq %xmm0, %xmm1, %xmm0<br class="">-; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0<br class="">+; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm1<br class="">+; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0<br class=""> ; AVX1-NEXT: retq<br class=""> ;<br class=""> ; AVX2-LABEL: shuffle_v4i64_67zz:<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/X86/known-bits-vector.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/known-bits-vector.ll?rev=285296&r1=285295&r2=285296&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/known-bits-vector.ll?rev=285296&r1=285295&r2=285296&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/test/CodeGen/X86/known-bits-vector.ll (original)<br class="">+++ llvm/trunk/test/CodeGen/X86/known-bits-vector.ll Thu Oct 27 09:29:28 2016<br class="">@@ -5,16 +5,14 @@<br class=""> define i32 @knownbits_mask_extract_sext(<8 x i16> %a0) nounwind {<br class=""> ; X32-LABEL: knownbits_mask_extract_sext:<br class=""> ; X32: # BB#0:<br class="">-; X32-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0<br class="">-; 
X32-NEXT: vmovd %xmm0, %eax<br class="">-; X32-NEXT: cwtl<br class="">+; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0<br class="">+; X32-NEXT: vpextrw $0, %xmm0, %eax<br class=""> ; X32-NEXT: retl<br class=""> ;<br class=""> ; X64-LABEL: knownbits_mask_extract_sext:<br class=""> ; X64: # BB#0:<br class="">-; X64-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0<br class="">-; X64-NEXT: vmovd %xmm0, %eax<br class="">-; X64-NEXT: cwtl<br class="">+; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0<br class="">+; X64-NEXT: vpextrw $0, %xmm0, %eax<br class=""> ; X64-NEXT: retq<br class=""> %1 = and <8 x i16> %a0, <i16 15, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1><br class=""> %2 = extractelement <8 x i16> %1, i32 0<br class="">@@ -31,17 +29,10 @@ define float @knownbits_mask_extract_uit<br class=""> ; X32-NEXT: subl $16, %esp<br class=""> ; X32-NEXT: vpxor %xmm1, %xmm1, %xmm1<br class=""> ; X32-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3],xmm0[4,5,6,7]<br class="">-; X32-NEXT: vpextrd $1, %xmm0, %eax<br class=""> ; X32-NEXT: vmovq %xmm0, {{[0-9]+}}(%esp)<br class="">-; X32-NEXT: xorl %ecx, %ecx<br class="">-; X32-NEXT: testl %eax, %eax<br class="">-; X32-NEXT: setns %cl<br class=""> ; X32-NEXT: fildll {{[0-9]+}}(%esp)<br class="">-; X32-NEXT: fadds {{\.LCPI.*}}(,%ecx,4)<br class=""> ; X32-NEXT: fstps {{[0-9]+}}(%esp)<br class="">-; X32-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero<br class="">-; X32-NEXT: vmovss %xmm0, (%esp)<br class="">-; X32-NEXT: flds (%esp)<br class="">+; X32-NEXT: flds {{[0-9]+}}(%esp)<br class=""> ; X32-NEXT: movl %ebp, %esp<br class=""> ; X32-NEXT: popl %ebp<br class=""> ; X32-NEXT: retl<br class="">@@ -51,18 +42,7 @@ define float @knownbits_mask_extract_uit<br class=""> ; X64-NEXT: vpxor %xmm1, %xmm1, %xmm1<br class=""> ; X64-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3],xmm0[4,5,6,7]<br class=""> ; X64-NEXT: vmovq %xmm0, %rax<br class="">-; X64-NEXT: testq %rax, %rax<br class="">-; X64-NEXT: js .LBB1_1<br class="">-; X64-NEXT: # 
BB#2:<br class="">-; X64-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm0<br class="">-; X64-NEXT: retq<br class="">-; X64-NEXT: .LBB1_1:<br class="">-; X64-NEXT: movq %rax, %rcx<br class="">-; X64-NEXT: shrq %rcx<br class="">-; X64-NEXT: andl $1, %eax<br class="">-; X64-NEXT: orq %rcx, %rax<br class=""> ; X64-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm0<br class="">-; X64-NEXT: vaddss %xmm0, %xmm0, %xmm0<br class=""> ; X64-NEXT: retq<br class=""> %1 = and <2 x i64> %a0, <i64 65535, i64 -1><br class=""> %2 = extractelement <2 x i64> %1, i32 0<br class="">@@ -74,15 +54,15 @@ define <4 x i32> @knownbits_mask_shuffle<br class=""> ; X32-LABEL: knownbits_mask_shuffle_sext:<br class=""> ; X32: # BB#0:<br class=""> ; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0<br class="">-; X32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]<br class="">-; X32-NEXT: vpmovsxwd %xmm0, %xmm0<br class="">+; X32-NEXT: vpxor %xmm1, %xmm1, %xmm1<br class="">+; X32-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]<br class=""> ; X32-NEXT: retl<br class=""> ;<br class=""> ; X64-LABEL: knownbits_mask_shuffle_sext:<br class=""> ; X64: # BB#0:<br class=""> ; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0<br class="">-; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]<br class="">-; X64-NEXT: vpmovsxwd %xmm0, %xmm0<br class="">+; X64-NEXT: vpxor %xmm1, %xmm1, %xmm1<br class="">+; X64-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]<br class=""> ; X64-NEXT: retq<br class=""> %1 = and <8 x i16> %a0, <i16 -1, i16 -1, i16 -1, i16 -1, i16 15, i16 15, i16 15, i16 15><br class=""> %2 = shufflevector <8 x i16> %1, <8 x i16> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7><br class="">@@ -95,22 +75,14 @@ define <4 x float> @knownbits_mask_shuff<br class=""> ; X32: # BB#0:<br class=""> ; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0<br class=""> ; X32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]<br class="">-; X32-NEXT: vpblendw {{.*#+}} xmm1 = 
xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]<br class="">-; X32-NEXT: vpsrld $16, %xmm0, %xmm0<br class="">-; X32-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]<br class="">-; X32-NEXT: vaddps {{\.LCPI.*}}, %xmm0, %xmm0<br class="">-; X32-NEXT: vaddps %xmm0, %xmm1, %xmm0<br class="">+; X32-NEXT: vcvtdq2ps %xmm0, %xmm0<br class=""> ; X32-NEXT: retl<br class=""> ;<br class=""> ; X64-LABEL: knownbits_mask_shuffle_uitofp:<br class=""> ; X64: # BB#0:<br class=""> ; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0<br class=""> ; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]<br class="">-; X64-NEXT: vpblendw {{.*#+}} xmm1 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]<br class="">-; X64-NEXT: vpsrld $16, %xmm0, %xmm0<br class="">-; X64-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]<br class="">-; X64-NEXT: vaddps {{.*}}(%rip), %xmm0, %xmm0<br class="">-; X64-NEXT: vaddps %xmm0, %xmm1, %xmm0<br class="">+; X64-NEXT: vcvtdq2ps %xmm0, %xmm0<br class=""> ; X64-NEXT: retq<br class=""> %1 = and <4 x i32> %a0, <i32 -1, i32 -1, i32 255, i32 4085><br class=""> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3><br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class=""><a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a><br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits<br class=""></div></div></blockquote></div><br class=""></div></body></html>