[llvm] r285296 - [DAGCombiner] Add vector demanded elements support to computeKnownBits
Juergen Ributzka via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 27 21:11:42 PDT 2016
Reverted in r285381
—Juergen
> On Oct 27, 2016, at 3:36 PM, Juergen Ributzka <juergen at ributzka.de> wrote:
>
> Hi Simon,
>
> I think this commit is causing timeouts on our LTO builds on green dragon:
> http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/
>
> A build usually takes 45 minutes, and we kill any build that runs longer than 90 minutes.
>
> —Juergen
>
>> On Oct 27, 2016, at 7:29 AM, Simon Pilgrim via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>
>> Author: rksimon
>> Date: Thu Oct 27 09:29:28 2016
>> New Revision: 285296
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=285296&view=rev
>> Log:
>> [DAGCombiner] Add vector demanded elements support to computeKnownBits
>>
>> Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.
>>
>> This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
>>
>> This approach was found to be easier than adding a per-element known-bits solution, and it is similarly useful for the combines where computeKnownBits is typically used.
>>
>> I've only added support for a few opcodes so far (the ones that have proven straightforward to test); all others default to demanding all elements but can be updated in due course.
>>
>> DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.
>>
>> Differential Revision: https://reviews.llvm.org/D25691
>>
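To make the DemandedElts idea from the log above concrete, here is a minimal standalone sketch. It is a toy model rather than the LLVM implementation: KnownBits16 and knownBitsOfConstantVector are hypothetical names, lanes are fixed at 16 bits, and the vector is assumed to be a constant with at most 32 lanes. It intersects the known bits of only the demanded lanes, and reports nothing known when no lane is demanded, as the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for the KnownZero/KnownOne pair (hypothetical, not an LLVM type).
struct KnownBits16 {
  uint16_t Zero; // bits known to be 0 in every demanded lane
  uint16_t One;  // bits known to be 1 in every demanded lane
};

// Intersect the known bits of the constant lanes selected by DemandedElts.
// Bit I of DemandedElts set means lane I is demanded.
KnownBits16 knownBitsOfConstantVector(const std::vector<uint16_t> &Lanes,
                                      uint32_t DemandedElts) {
  assert(Lanes.size() <= 32 && "toy model supports at most 32 lanes");
  KnownBits16 Known{0xFFFF, 0xFFFF}; // start from "everything known"
  bool AnyDemanded = false;
  for (std::size_t I = 0; I != Lanes.size(); ++I) {
    if (!((DemandedElts >> I) & 1))
      continue; // skip lanes nobody asked about
    AnyDemanded = true;
    Known.One &= Lanes[I];
    Known.Zero &= static_cast<uint16_t>(~Lanes[I]);
  }
  if (!AnyDemanded)
    return KnownBits16{0, 0}; // no demanded lanes: assume nothing is known
  return Known;
}
```

With the mask constant from knownbits_mask_extract_sext (<15, -1, -1, -1, -1, -1, -1, -1>), demanding all eight lanes leaves no known-zero bits, while demanding only lane 0 reports the upper 12 bits as known zero. Since an AND result is zero wherever either operand is known zero, this is the kind of information that lets the sign extension after the extract be dropped in the updated test.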
>> Modified:
>> llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
>> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll
>> llvm/trunk/test/CodeGen/X86/known-bits-vector.ll
>>
>> Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=285296&r1=285295&r2=285296&view=diff
>> ==============================================================================
>> --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original)
>> +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Thu Oct 27 09:29:28 2016
>> @@ -1256,12 +1256,22 @@ public:
>> const;
>>
>> /// Determine which bits of Op are known to be either zero or one and return
>> - /// them in the KnownZero/KnownOne bitsets. Targets can implement the
>> - /// computeKnownBitsForTargetNode method in the TargetLowering class to allow
>> - /// target nodes to be understood.
>> + /// them in the KnownZero/KnownOne bitsets. For vectors, the known bits are
>> + /// those that are shared by every vector element.
>> + /// Targets can implement the computeKnownBitsForTargetNode method in the
>> + /// TargetLowering class to allow target nodes to be understood.
>> void computeKnownBits(SDValue Op, APInt &KnownZero, APInt &KnownOne,
>> unsigned Depth = 0) const;
>>
>> + /// Determine which bits of Op are known to be either zero or one and return
>> + /// them in the KnownZero/KnownOne bitsets. The DemandedElts argument allows
>> + /// us to only collect the known bits that are shared by the requested vector
>> + /// elements.
>> + /// Targets can implement the computeKnownBitsForTargetNode method in the
>> + /// TargetLowering class to allow target nodes to be understood.
>> + void computeKnownBits(SDValue Op, APInt &KnownZero, APInt &KnownOne,
>> + const APInt &DemandedElts, unsigned Depth = 0) const;
>> +
>> /// Test if the given value is known to have exactly one bit set. This differs
>> /// from computeKnownBits in that it doesn't necessarily determine which bit
>> /// is set.
>>
>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=285296&r1=285295&r2=285296&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Thu Oct 27 09:29:28 2016
>> @@ -2007,9 +2007,26 @@ static const APInt *getValidShiftAmountC
>> }
>>
>> /// Determine which bits of Op are known to be either zero or one and return
>> -/// them in the KnownZero/KnownOne bitsets.
>> +/// them in the KnownZero/KnownOne bitsets. For vectors, the known bits are
>> +/// those that are shared by every vector element.
>> void SelectionDAG::computeKnownBits(SDValue Op, APInt &KnownZero,
>> APInt &KnownOne, unsigned Depth) const {
>> + EVT VT = Op.getValueType();
>> + APInt DemandedElts = VT.isVector()
>> + ? APInt::getAllOnesValue(VT.getVectorNumElements())
>> + : APInt(1, 1);
>> + computeKnownBits(Op, KnownZero, KnownOne, DemandedElts, Depth);
>> +}
>> +
>> +/// Determine which bits of Op are known to be either zero or one and return
>> +/// them in the KnownZero/KnownOne bitsets. The DemandedElts argument allows
>> +/// us to only collect the known bits that are shared by the requested vector
>> +/// elements.
>> +/// TODO: We only support DemandedElts on a few opcodes so far, the remainder
>> +/// should be added when they become necessary.
>> +void SelectionDAG::computeKnownBits(SDValue Op, APInt &KnownZero,
>> + APInt &KnownOne, const APInt &DemandedElts,
>> + unsigned Depth) const {
>> unsigned BitWidth = Op.getScalarValueSizeInBits();
>>
>> KnownZero = KnownOne = APInt(BitWidth, 0); // Don't know anything.
>> @@ -2017,6 +2034,10 @@ void SelectionDAG::computeKnownBits(SDVa
>> return; // Limit search depth.
>>
>> APInt KnownZero2, KnownOne2;
>> + unsigned NumElts = DemandedElts.getBitWidth();
>> +
>> + if (DemandedElts == APInt(NumElts, 0))
>> + return; // No demanded elts, better to assume we don't know anything.
>>
>> switch (Op.getOpcode()) {
>> case ISD::Constant:
>> @@ -2025,9 +2046,15 @@ void SelectionDAG::computeKnownBits(SDVa
>> KnownZero = ~KnownOne;
>> break;
>> case ISD::BUILD_VECTOR:
>> - // Collect the known bits that are shared by every vector element.
>> + // Collect the known bits that are shared by every demanded vector element.
>> + assert(NumElts == Op.getValueType().getVectorNumElements() &&
>> + "Unexpected vector size");
>> KnownZero = KnownOne = APInt::getAllOnesValue(BitWidth);
>> - for (SDValue SrcOp : Op->ops()) {
>> + for (unsigned i = 0, e = Op.getNumOperands(); i != e; ++i) {
>> + if (!DemandedElts[i])
>> + continue;
>> +
>> + SDValue SrcOp = Op.getOperand(i);
>> computeKnownBits(SrcOp, KnownZero2, KnownOne2, Depth + 1);
>>
>> // BUILD_VECTOR can implicitly truncate sources, we must handle this.
>> @@ -2038,16 +2065,72 @@ void SelectionDAG::computeKnownBits(SDVa
>> KnownZero2 = KnownZero2.trunc(BitWidth);
>> }
>>
>> - // Known bits are the values that are shared by every element.
>> - // TODO: support per-element known bits.
>> + // Known bits are the values that are shared by every demanded element.
>> + KnownOne &= KnownOne2;
>> + KnownZero &= KnownZero2;
>> + }
>> + break;
>> + case ISD::VECTOR_SHUFFLE: {
>> + // Collect the known bits that are shared by every vector element referenced
>> + // by the shuffle.
>> + APInt DemandedLHS(NumElts, 0), DemandedRHS(NumElts, 0);
>> + KnownZero = KnownOne = APInt::getAllOnesValue(BitWidth);
>> + const ShuffleVectorSDNode *SVN = cast<ShuffleVectorSDNode>(Op);
>> + assert(NumElts == SVN->getMask().size() && "Unexpected vector size");
>> + for (unsigned i = 0; i != NumElts; ++i) {
>> + int M = SVN->getMaskElt(i);
>> + if (M < 0) {
>> + // For UNDEF elements, we don't know anything about the common state of
>> + // the shuffle result.
>> + // FIXME: Is this too pessimistic?
>> + KnownZero = KnownOne = APInt(BitWidth, 0);
>> + break;
>> + }
>> + if (!DemandedElts[i])
>> + continue;
>> +
>> + if ((unsigned)M < NumElts)
>> + DemandedLHS.setBit((unsigned)M % NumElts);
>> + else
>> + DemandedRHS.setBit((unsigned)M % NumElts);
>> + }
>> + // Known bits are the values that are shared by every demanded element.
>> + if (DemandedLHS != APInt(NumElts, 0)) {
>> + SDValue LHS = Op.getOperand(0);
>> + computeKnownBits(LHS, KnownZero2, KnownOne2, DemandedLHS, Depth + 1);
>> + KnownOne &= KnownOne2;
>> + KnownZero &= KnownZero2;
>> + }
>> + if (DemandedRHS != APInt(NumElts, 0)) {
>> + SDValue RHS = Op.getOperand(1);
>> + computeKnownBits(RHS, KnownZero2, KnownOne2, DemandedRHS, Depth + 1);
>> KnownOne &= KnownOne2;
>> KnownZero &= KnownZero2;
>> }
>> break;
>> + }
>> + case ISD::EXTRACT_SUBVECTOR: {
>> + // If we know the element index, just demand that subvector elements,
>> + // otherwise demand them all.
>> + SDValue Src = Op.getOperand(0);
>> + ConstantSDNode *SubIdx = dyn_cast<ConstantSDNode>(Op.getOperand(1));
>> + unsigned NumSrcElts = Src.getValueType().getVectorNumElements();
>> + if (SubIdx && SubIdx->getAPIntValue().ule(NumSrcElts - NumElts)) {
>> + // Offset the demanded elts by the subvector index.
>> + uint64_t Idx = SubIdx->getZExtValue();
>> + APInt DemandedSrc = DemandedElts.zext(NumSrcElts).shl(Idx);
>> + computeKnownBits(Src, KnownZero, KnownOne, DemandedSrc, Depth + 1);
>> + } else {
>> + computeKnownBits(Src, KnownZero, KnownOne, Depth + 1);
>> + }
>> + break;
>> + }
>> case ISD::AND:
>> // If either the LHS or the RHS are Zero, the result is zero.
>> - computeKnownBits(Op.getOperand(1), KnownZero, KnownOne, Depth+1);
>> - computeKnownBits(Op.getOperand(0), KnownZero2, KnownOne2, Depth+1);
>> + computeKnownBits(Op.getOperand(1), KnownZero, KnownOne, DemandedElts,
>> + Depth + 1);
>> + computeKnownBits(Op.getOperand(0), KnownZero2, KnownOne2, DemandedElts,
>> + Depth + 1);
>>
>> // Output known-1 bits are only known if set in both the LHS & RHS.
>> KnownOne &= KnownOne2;
>> @@ -2206,7 +2289,8 @@ void SelectionDAG::computeKnownBits(SDVa
>> if (NewBits.getBoolValue())
>> InputDemandedBits |= InSignBit;
>>
>> - computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);
>> + computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,
>> + Depth + 1);
>> KnownOne &= InputDemandedBits;
>> KnownZero &= InputDemandedBits;
>>
>> @@ -2253,7 +2337,8 @@ void SelectionDAG::computeKnownBits(SDVa
>> APInt NewBits = APInt::getHighBitsSet(BitWidth, BitWidth - InBits);
>> KnownZero = KnownZero.trunc(InBits);
>> KnownOne = KnownOne.trunc(InBits);
>> - computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);
>> + computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,
>> + Depth + 1);
>> KnownZero = KnownZero.zext(BitWidth);
>> KnownOne = KnownOne.zext(BitWidth);
>> KnownZero |= NewBits;
>> @@ -2266,7 +2351,8 @@ void SelectionDAG::computeKnownBits(SDVa
>>
>> KnownZero = KnownZero.trunc(InBits);
>> KnownOne = KnownOne.trunc(InBits);
>> - computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);
>> + computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, DemandedElts,
>> + Depth + 1);
>>
>> // Note if the sign bit is known to be zero or one.
>> bool SignBitKnownZero = KnownZero.isNegative();
>> @@ -2439,16 +2525,28 @@ void SelectionDAG::computeKnownBits(SDVa
>> // At the moment we keep this simple and skip tracking the specific
>> // element. This way we get the lowest common denominator for all elements
>> // of the vector.
>> - // TODO: get information for given vector element
>> + SDValue InVec = Op.getOperand(0);
>> + SDValue EltNo = Op.getOperand(1);
>> + EVT VecVT = InVec.getValueType();
>> const unsigned BitWidth = Op.getValueSizeInBits();
>> - const unsigned EltBitWidth = Op.getOperand(0).getScalarValueSizeInBits();
>> + const unsigned EltBitWidth = VecVT.getScalarSizeInBits();
>> + const unsigned NumSrcElts = VecVT.getVectorNumElements();
>> // If BitWidth > EltBitWidth the value is anyext:ed. So we do not know
>> // anything about the extended bits.
>> if (BitWidth > EltBitWidth) {
>> KnownZero = KnownZero.trunc(EltBitWidth);
>> KnownOne = KnownOne.trunc(EltBitWidth);
>> }
>> - computeKnownBits(Op.getOperand(0), KnownZero, KnownOne, Depth+1);
>> + ConstantSDNode *ConstEltNo = dyn_cast<ConstantSDNode>(EltNo);
>> + if (ConstEltNo && ConstEltNo->getAPIntValue().ult(NumSrcElts)) {
>> + // If we know the element index, just demand that vector element.
>> + unsigned Idx = ConstEltNo->getZExtValue();
>> + APInt DemandedElt = APInt::getOneBitSet(NumSrcElts, Idx);
>> + computeKnownBits(InVec, KnownZero, KnownOne, DemandedElt, Depth + 1);
>> + } else {
>> + // Unknown element index, so ignore DemandedElts and demand them all.
>> + computeKnownBits(InVec, KnownZero, KnownOne, Depth + 1);
>> + }
>> if (BitWidth > EltBitWidth) {
>> KnownZero = KnownZero.zext(BitWidth);
>> KnownOne = KnownOne.zext(BitWidth);
>>
>> Modified: llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll?rev=285296&r1=285295&r2=285296&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/avx-vperm2x128.ll Thu Oct 27 09:29:28 2016
>> @@ -465,11 +465,9 @@ define <4 x i64> @shuffle_v4i64_67zz(<4
>> ; AVX1-LABEL: shuffle_v4i64_67zz:
>> ; AVX1: ## BB#0:
>> ; AVX1-NEXT: vperm2f128 {{.*#+}} ymm0 = ymm0[2,3],zero,zero
>> -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2
>> -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3
>> -; AVX1-NEXT: vpaddq %xmm2, %xmm3, %xmm2
>> ; AVX1-NEXT: vpaddq %xmm0, %xmm1, %xmm0
>> -; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
>> +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm1
>> +; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
>> ; AVX1-NEXT: retq
>> ;
>> ; AVX2-LABEL: shuffle_v4i64_67zz:
>>
>> Modified: llvm/trunk/test/CodeGen/X86/known-bits-vector.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/known-bits-vector.ll?rev=285296&r1=285295&r2=285296&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/known-bits-vector.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/known-bits-vector.ll Thu Oct 27 09:29:28 2016
>> @@ -5,16 +5,14 @@
>> define i32 @knownbits_mask_extract_sext(<8 x i16> %a0) nounwind {
>> ; X32-LABEL: knownbits_mask_extract_sext:
>> ; X32: # BB#0:
>> -; X32-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0
>> -; X32-NEXT: vmovd %xmm0, %eax
>> -; X32-NEXT: cwtl
>> +; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0
>> +; X32-NEXT: vpextrw $0, %xmm0, %eax
>> ; X32-NEXT: retl
>> ;
>> ; X64-LABEL: knownbits_mask_extract_sext:
>> ; X64: # BB#0:
>> -; X64-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0
>> -; X64-NEXT: vmovd %xmm0, %eax
>> -; X64-NEXT: cwtl
>> +; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
>> +; X64-NEXT: vpextrw $0, %xmm0, %eax
>> ; X64-NEXT: retq
>> %1 = and <8 x i16> %a0, <i16 15, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>
>> %2 = extractelement <8 x i16> %1, i32 0
>> @@ -31,17 +29,10 @@ define float @knownbits_mask_extract_uit
>> ; X32-NEXT: subl $16, %esp
>> ; X32-NEXT: vpxor %xmm1, %xmm1, %xmm1
>> ; X32-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3],xmm0[4,5,6,7]
>> -; X32-NEXT: vpextrd $1, %xmm0, %eax
>> ; X32-NEXT: vmovq %xmm0, {{[0-9]+}}(%esp)
>> -; X32-NEXT: xorl %ecx, %ecx
>> -; X32-NEXT: testl %eax, %eax
>> -; X32-NEXT: setns %cl
>> ; X32-NEXT: fildll {{[0-9]+}}(%esp)
>> -; X32-NEXT: fadds {{\.LCPI.*}}(,%ecx,4)
>> ; X32-NEXT: fstps {{[0-9]+}}(%esp)
>> -; X32-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
>> -; X32-NEXT: vmovss %xmm0, (%esp)
>> -; X32-NEXT: flds (%esp)
>> +; X32-NEXT: flds {{[0-9]+}}(%esp)
>> ; X32-NEXT: movl %ebp, %esp
>> ; X32-NEXT: popl %ebp
>> ; X32-NEXT: retl
>> @@ -51,18 +42,7 @@ define float @knownbits_mask_extract_uit
>> ; X64-NEXT: vpxor %xmm1, %xmm1, %xmm1
>> ; X64-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3],xmm0[4,5,6,7]
>> ; X64-NEXT: vmovq %xmm0, %rax
>> -; X64-NEXT: testq %rax, %rax
>> -; X64-NEXT: js .LBB1_1
>> -; X64-NEXT: # BB#2:
>> -; X64-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm0
>> -; X64-NEXT: retq
>> -; X64-NEXT: .LBB1_1:
>> -; X64-NEXT: movq %rax, %rcx
>> -; X64-NEXT: shrq %rcx
>> -; X64-NEXT: andl $1, %eax
>> -; X64-NEXT: orq %rcx, %rax
>> ; X64-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm0
>> -; X64-NEXT: vaddss %xmm0, %xmm0, %xmm0
>> ; X64-NEXT: retq
>> %1 = and <2 x i64> %a0, <i64 65535, i64 -1>
>> %2 = extractelement <2 x i64> %1, i32 0
>> @@ -74,15 +54,15 @@ define <4 x i32> @knownbits_mask_shuffle
>> ; X32-LABEL: knownbits_mask_shuffle_sext:
>> ; X32: # BB#0:
>> ; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0
>> -; X32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]
>> -; X32-NEXT: vpmovsxwd %xmm0, %xmm0
>> +; X32-NEXT: vpxor %xmm1, %xmm1, %xmm1
>> +; X32-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
>> ; X32-NEXT: retl
>> ;
>> ; X64-LABEL: knownbits_mask_shuffle_sext:
>> ; X64: # BB#0:
>> ; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
>> -; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]
>> -; X64-NEXT: vpmovsxwd %xmm0, %xmm0
>> +; X64-NEXT: vpxor %xmm1, %xmm1, %xmm1
>> +; X64-NEXT: vpunpckhwd {{.*#+}} xmm0 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
>> ; X64-NEXT: retq
>> %1 = and <8 x i16> %a0, <i16 -1, i16 -1, i16 -1, i16 -1, i16 15, i16 15, i16 15, i16 15>
>> %2 = shufflevector <8 x i16> %1, <8 x i16> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
>> @@ -95,22 +75,14 @@ define <4 x float> @knownbits_mask_shuff
>> ; X32: # BB#0:
>> ; X32-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0
>> ; X32-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
>> -; X32-NEXT: vpblendw {{.*#+}} xmm1 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]
>> -; X32-NEXT: vpsrld $16, %xmm0, %xmm0
>> -; X32-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]
>> -; X32-NEXT: vaddps {{\.LCPI.*}}, %xmm0, %xmm0
>> -; X32-NEXT: vaddps %xmm0, %xmm1, %xmm0
>> +; X32-NEXT: vcvtdq2ps %xmm0, %xmm0
>> ; X32-NEXT: retl
>> ;
>> ; X64-LABEL: knownbits_mask_shuffle_uitofp:
>> ; X64: # BB#0:
>> ; X64-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
>> ; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
>> -; X64-NEXT: vpblendw {{.*#+}} xmm1 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]
>> -; X64-NEXT: vpsrld $16, %xmm0, %xmm0
>> -; X64-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],mem[1],xmm0[2],mem[3],xmm0[4],mem[5],xmm0[6],mem[7]
>> -; X64-NEXT: vaddps {{.*}}(%rip), %xmm0, %xmm0
>> -; X64-NEXT: vaddps %xmm0, %xmm1, %xmm0
>> +; X64-NEXT: vcvtdq2ps %xmm0, %xmm0
>> ; X64-NEXT: retq
>> %1 = and <4 x i32> %a0, <i32 -1, i32 -1, i32 255, i32 4085>
>> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
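For readers following the VECTOR_SHUFFLE and EXTRACT_SUBVECTOR cases in the quoted diff, the sketch below models how the demanded-element mask is translated onto the source operands. It is a standalone approximation, not LLVM code: DemandedOperands, splitShuffleDemand and offsetDemandForSubvector are hypothetical names, and plain 32-bit integers stand in for APInt, so vectors are assumed to have at most 32 elements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Demanded lanes of the two shuffle inputs (hypothetical helper type).
struct DemandedOperands {
  uint32_t LHS;  // demanded lanes of operand 0
  uint32_t RHS;  // demanded lanes of operand 1
  bool SawUndef; // an undef mask entry was seen, so nothing is known
};

// Map demanded result lanes through a shuffle mask onto the two inputs.
// Mask entries in [0, N) select from LHS, [N, 2N) from RHS, negative is undef.
DemandedOperands splitShuffleDemand(const std::vector<int> &Mask,
                                    uint32_t DemandedElts) {
  const unsigned NumElts = static_cast<unsigned>(Mask.size());
  assert(NumElts > 0 && NumElts <= 32 && "toy model supports 1..32 lanes");
  DemandedOperands D{0, 0, false};
  for (unsigned I = 0; I != NumElts; ++I) {
    int M = Mask[I];
    if (M < 0)
      return DemandedOperands{0, 0, true}; // mirrors the patch: give up on undef
    if (!((DemandedElts >> I) & 1))
      continue; // result lane I is not demanded
    assert(static_cast<unsigned>(M) < 2 * NumElts && "mask index out of range");
    if (static_cast<unsigned>(M) < NumElts)
      D.LHS |= 1u << M;
    else
      D.RHS |= 1u << (static_cast<unsigned>(M) - NumElts);
  }
  return D;
}

// Shift demanded subvector lanes up to their position in the source vector.
uint32_t offsetDemandForSubvector(uint32_t DemandedElts, unsigned NumElts,
                                  unsigned NumSrcElts, unsigned Idx) {
  assert(NumElts >= 1 && NumElts <= NumSrcElts && NumSrcElts <= 32);
  assert(Idx <= NumSrcElts - NumElts && "subvector must fit in the source");
  return DemandedElts << Idx;
}
```

For the mask <2, 2, 3, 3> from knownbits_mask_shuffle_uitofp with all four result lanes demanded, only lanes 2 and 3 of the first operand end up demanded. Note that, like the patch, the sketch gives up on any undef mask entry, even one in a lane that is not demanded.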