[llvm] r212610 - [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
Tom Stellard
tom at stellard.net
Wed Jul 9 07:28:48 PDT 2014
On Wed, Jul 09, 2014 at 10:58:19AM -0000, Chandler Carruth wrote:
> Author: chandlerc
> Date: Wed Jul 9 05:58:18 2014
> New Revision: 212610
>
> URL: http://llvm.org/viewvc/llvm-project?rev=212610&view=rev
> Log:
> [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
> vector types to be legal and a ZERO_EXTEND node is encountered.
>
Is this node legal by default? If so, you should alert the target
maintainers so they can update their backends.
-Tom
> When we use widening to legalize vector types, extend nodes are a real
> challenge. Either the input or output is likely to be legal, but in many
> cases not both. As a consequence, we don't really have any way to
> represent this situation and the prior code in the widening legalization
> framework would just scalarize the extend operation completely.
>
> This patch introduces a new DAG node to represent doing a zero extend of
> a vector "in register". The core of the idea is to allow legal but
> different vector types in the input and output. The output vector must
> have fewer lanes but wider elements. The operation is defined to zero
> extend the low elements of the input to the size of the output elements,
> and drop all of the high elements which don't have a corresponding lane
> in the output vector.
>
> It also includes generic expansion of this node in terms of blending
> a zero vector into the high elements of the vector and bitcasting
> across. This in turn yields extremely nice code for x86 SSE2 when we use
> the new widening legalization logic in conjunction with the new shuffle
> lowering logic.
>
> There is still more to do here. We need to support sign extension, any
> extension, and potentially int-to-float conversions. My current plan is
> to continue using similar synthetic nodes to model each of these
> transitions with generic lowering code for each one.
>
> However, with this patch LLVM already reaches performance parity with
> GCC for the core C loops of the x264 code (assuming you disable the
> hand-written assembly versions) when compiling for SSE2 and SSE3
> architectures and enabling the new widening and lowering logic for
> vectors.
>
> Differential Revision: http://reviews.llvm.org/D4405
>
> Added:
> llvm/trunk/test/CodeGen/X86/widen_conversions.ll
> Modified:
> llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
> llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
> llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h
> llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>
> Modified: llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h Wed Jul 9 05:58:18 2014
> @@ -379,6 +379,15 @@ namespace ISD {
> /// operand, a ValueType node.
> SIGN_EXTEND_INREG,
>
> + /// ZERO_EXTEND_VECTOR_INREG(Vector) - This operator represents an
> + /// in-register zero-extension of the low lanes of an integer vector. The
> + /// result type must have fewer elements than the operand type, and those
> + /// elements must be larger integer types such that the total size of the
> + /// operand type and the result type match. Each of the low operand
> + /// elements is zero-extended into the corresponding, wider result
> + /// elements.
> + ZERO_EXTEND_VECTOR_INREG,
> +
> /// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned
> /// integer.
> FP_TO_SINT,
>
> Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Wed Jul 9 05:58:18 2014
> @@ -562,6 +562,12 @@ public:
> /// value assuming it was the smaller SrcTy value.
> SDValue getZeroExtendInReg(SDValue Op, SDLoc DL, EVT SrcTy);
>
> + /// getZeroExtendVectorInReg - Return an operation which will zero extend the
> + /// low lanes of the operand into the specified vector type. For example,
> + /// this can convert a v16i8 into a v4i32 by zero extending the low four
> + /// lanes of the operand from i8 to i32.
> + SDValue getZeroExtendVectorInReg(SDValue Op, SDLoc DL, EVT VT);
> +
> /// getBoolExtOrTrunc - Convert Op, which must be of integer type, to the
> /// integer type VT, by using an extension appropriate for the target's
> /// BooleanContent or truncating it.
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h Wed Jul 9 05:58:18 2014
> @@ -649,6 +649,7 @@ private:
> SDValue WidenVecOp_EXTRACT_SUBVECTOR(SDNode *N);
> SDValue WidenVecOp_STORE(SDNode* N);
> SDValue WidenVecOp_SETCC(SDNode* N);
> + SDValue WidenVecOp_ZERO_EXTEND(SDNode *N);
>
> SDValue WidenVecOp_Convert(SDNode *N);
>
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp Wed Jul 9 05:58:18 2014
> @@ -75,6 +75,12 @@ class VectorLegalizer {
> /// \brief Implement expansion for SIGN_EXTEND_INREG using SRL and SRA.
> SDValue ExpandSEXTINREG(SDValue Op);
>
> + /// \brief Implement expansion for ZERO_EXTEND_VECTOR_INREG.
> + ///
> + /// Shuffles the low lanes of the operand into place and blends zeros into
> + /// the remaining lanes, finally bitcasting to the proper type.
> + SDValue ExpandZERO_EXTEND_VECTOR_INREG(SDValue Op);
> +
> /// \brief Expand bswap of vectors into a shuffle if legal.
> SDValue ExpandBSWAP(SDValue Op);
>
> @@ -274,6 +280,7 @@ SDValue VectorLegalizer::LegalizeOp(SDVa
> case ISD::FP_EXTEND:
> case ISD::FMA:
> case ISD::SIGN_EXTEND_INREG:
> + case ISD::ZERO_EXTEND_VECTOR_INREG:
> QueryType = Node->getValueType(0);
> break;
> case ISD::FP_ROUND_INREG:
> @@ -614,6 +621,8 @@ SDValue VectorLegalizer::Expand(SDValue
> switch (Op->getOpcode()) {
> case ISD::SIGN_EXTEND_INREG:
> return ExpandSEXTINREG(Op);
> + case ISD::ZERO_EXTEND_VECTOR_INREG:
> + return ExpandZERO_EXTEND_VECTOR_INREG(Op);
> case ISD::BSWAP:
> return ExpandBSWAP(Op);
> case ISD::VSELECT:
> @@ -708,6 +717,39 @@ SDValue VectorLegalizer::ExpandSEXTINREG
> return DAG.getNode(ISD::SRA, DL, VT, Op, ShiftSz);
> }
>
> +// Generically expand a vector zext in register to a shuffle of the relevant
> +// lanes into the appropriate locations, a blend of zero into the high bits,
> +// and a bitcast to the wider element type.
> +SDValue VectorLegalizer::ExpandZERO_EXTEND_VECTOR_INREG(SDValue Op) {
> + SDLoc DL(Op);
> + EVT VT = Op.getValueType();
> + int NumElements = VT.getVectorNumElements();
> + SDValue Src = Op.getOperand(0);
> + EVT SrcVT = Src.getValueType();
> + int NumSrcElements = SrcVT.getVectorNumElements();
> +
> + // Build up a zero vector to blend into this one.
> + EVT SrcScalarVT = SrcVT.getScalarType();
> + SDValue ScalarZero = DAG.getTargetConstant(0, SrcScalarVT);
> + SmallVector<SDValue, 4> BuildVectorOperands(NumSrcElements, ScalarZero);
> + SDValue Zero = DAG.getNode(ISD::BUILD_VECTOR, DL, SrcVT, BuildVectorOperands);
> +
> + // Shuffle the incoming lanes into the correct position, and pull all other
> + // lanes from the zero vector.
> + SmallVector<int, 16> ShuffleMask;
> + ShuffleMask.reserve(NumSrcElements);
> + for (int i = 0; i < NumSrcElements; ++i)
> + ShuffleMask.push_back(i);
> +
> + int ExtLaneScale = NumSrcElements / NumElements;
> + int EndianOffset = TLI.isBigEndian() ? ExtLaneScale - 1 : 0;
> + for (int i = 0; i < NumElements; ++i)
> + ShuffleMask[i * ExtLaneScale + EndianOffset] = NumSrcElements + i;
> +
> + return DAG.getNode(ISD::BITCAST, DL, VT,
> + DAG.getVectorShuffle(SrcVT, DL, Zero, Src, ShuffleMask));
> +}
> +
> SDValue VectorLegalizer::ExpandBSWAP(SDValue Op) {
> EVT VT = Op.getValueType();
>
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Wed Jul 9 05:58:18 2014
> @@ -2380,6 +2380,7 @@ bool DAGTypeLegalizer::WidenVectorOperan
> case ISD::EXTRACT_VECTOR_ELT: Res = WidenVecOp_EXTRACT_VECTOR_ELT(N); break;
> case ISD::STORE: Res = WidenVecOp_STORE(N); break;
> case ISD::SETCC: Res = WidenVecOp_SETCC(N); break;
> + case ISD::ZERO_EXTEND: Res = WidenVecOp_ZERO_EXTEND(N); break;
>
> case ISD::FP_EXTEND:
> case ISD::FP_TO_SINT:
> @@ -2388,7 +2389,6 @@ bool DAGTypeLegalizer::WidenVectorOperan
> case ISD::UINT_TO_FP:
> case ISD::TRUNCATE:
> case ISD::SIGN_EXTEND:
> - case ISD::ZERO_EXTEND:
> case ISD::ANY_EXTEND:
> Res = WidenVecOp_Convert(N);
> break;
> @@ -2410,6 +2410,26 @@ bool DAGTypeLegalizer::WidenVectorOperan
> return false;
> }
>
> +SDValue DAGTypeLegalizer::WidenVecOp_ZERO_EXTEND(SDNode *N) {
> + SDLoc DL(N);
> + EVT VT = N->getValueType(0);
> + unsigned NumElts = VT.getVectorNumElements();
> +
> + SDValue InOp = N->getOperand(0);
> + // If some legalization strategy other than widening is used on the operand,
> + // we can't safely assume that just zero-extending the low lanes is the
> + // correct transformation.
> + if (getTypeAction(InOp.getValueType()) != TargetLowering::TypeWidenVector)
> + return WidenVecOp_Convert(N);
> + InOp = GetWidenedVector(InOp);
> + EVT InVT = InOp.getValueType();
> + assert(NumElts < InVT.getVectorNumElements() && "Input wasn't widened!");
> +
> + // Use a special DAG node to represent the operation of zero extending the
> + // low lanes.
> + return DAG.getZeroExtendVectorInReg(InOp, DL, VT);
> +}
> +
> SDValue DAGTypeLegalizer::WidenVecOp_Convert(SDNode *N) {
> // Since the result is legal and the input is illegal, it is unlikely
> // that we can fix the input to a legal type so unroll the convert
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Jul 9 05:58:18 2014
> @@ -1032,6 +1032,13 @@ SDValue SelectionDAG::getZeroExtendInReg
> getConstant(Imm, Op.getValueType()));
> }
>
> +SDValue SelectionDAG::getZeroExtendVectorInReg(SDValue Op, SDLoc DL, EVT VT) {
> + assert(VT.isVector() && "This DAG node is restricted to vector types.");
> + assert(VT.getVectorNumElements() < Op.getValueType().getVectorNumElements() &&
> + "The destination vector type must have fewer lanes than the input.");
> + return getNode(ISD::ZERO_EXTEND_VECTOR_INREG, DL, VT, Op);
> +}
> +
> /// getNOT - Create a bitwise NOT operation as (XOR Val, -1).
> ///
> SDValue SelectionDAG::getNOT(SDLoc DL, SDValue Val, EVT VT) {
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp Wed Jul 9 05:58:18 2014
> @@ -221,6 +221,7 @@ std::string SDNode::getOperationName(con
> case ISD::ZERO_EXTEND: return "zero_extend";
> case ISD::ANY_EXTEND: return "any_extend";
> case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
> + case ISD::ZERO_EXTEND_VECTOR_INREG: return "zero_extend_vector_inreg";
> case ISD::TRUNCATE: return "truncate";
> case ISD::FP_ROUND: return "fp_round";
> case ISD::FLT_ROUNDS_: return "flt_rounds";
>
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Jul 9 05:58:18 2014
> @@ -869,6 +869,7 @@ void X86TargetLowering::resetOperationAc
> setOperationAction(ISD::TRUNCATE, VT, Expand);
> setOperationAction(ISD::SIGN_EXTEND, VT, Expand);
> setOperationAction(ISD::ZERO_EXTEND, VT, Expand);
> + setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);
> setOperationAction(ISD::ANY_EXTEND, VT, Expand);
> setOperationAction(ISD::VSELECT, VT, Expand);
> setOperationAction(ISD::SELECT_CC, VT, Expand);
>
> Added: llvm/trunk/test/CodeGen/X86/widen_conversions.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_conversions.ll?rev=212610&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/widen_conversions.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/widen_conversions.ll Wed Jul 9 05:58:18 2014
> @@ -0,0 +1,18 @@
> +; RUN: llc < %s -mcpu=x86-64 -x86-experimental-vector-widening-legalization -x86-experimental-vector-shuffle-lowering | FileCheck %s
> +
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> +target triple = "x86_64-unknown-unknown"
> +
> +define <4 x i32> @zext_v4i8_to_v4i32(<4 x i8>* %ptr) {
> +; CHECK-LABEL: zext_v4i8_to_v4i32:
> +;
> +; CHECK: movd (%{{.*}}), %[[X:xmm[0-9]+]]
> +; CHECK-NEXT: pxor %[[Z:xmm[0-9]+]], %[[Z]]
> +; CHECK-NEXT: punpcklbw %[[Z]], %[[X]]
> +; CHECK-NEXT: punpcklbw %[[Z]], %[[X]]
> +; CHECK-NEXT: ret
> +
> + %val = load <4 x i8>* %ptr
> + %ext = zext <4 x i8> %val to <4 x i32>
> + ret <4 x i32> %ext
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list