[llvm] r212610 - [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening

Wed Jul 9 07:28:48 PDT 2014

On Wed, Jul 09, 2014 at 10:58:19AM -0000, Chandler Carruth wrote:
> Author: chandlerc
> Date: Wed Jul  9 05:58:18 2014
> New Revision: 212610
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=212610&view=rev
> Log:
> [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
> vector types to be legal and a ZERO_EXTEND node is encountered.
>

Is this node legal by default?  If so, you should alert the target
maintainers so they can update their backends.

-Tom

> When we use widening to legalize vector types, extend nodes are a real
> challenge. Either the input or output is likely to be legal, but in many
> cases not both. As a consequence, we don't really have any way to
> represent this situation and the prior code in the widening legalization
> framework would just scalarize the extend operation completely.
> 
> This patch introduces a new DAG node to represent doing a zero extend of
> a vector "in register". The core of the idea is to allow legal but
> different vector types in the input and output. The output vector must
> have fewer lanes but wider elements. The operation is defined to zero
> extend the low elements of the input to the size of the output elements,
> and drop all of the high elements which don't have a corresponding lane
> in the output vector.
> 
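
To make the semantics described above concrete, here is a minimal scalar model of the v16i8 -> v4i32 case. It is a sketch only, not LLVM code: the function name zextVectorInRegModel and the use of std::array are assumptions made for illustration. Only the low four input lanes contribute to the result; the twelve high lanes are dropped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of ZERO_EXTEND_VECTOR_INREG for v16i8 -> v4i32.
// Illustrative only; this is not part of the LLVM API.
std::array<std::uint32_t, 4>
zextVectorInRegModel(const std::array<std::uint8_t, 16> &Src) {
  std::array<std::uint32_t, 4> Result{};

  // The node requires fewer, wider lanes with the same total bit width:
  // 16 lanes x 8 bits == 4 lanes x 32 bits.
  static_assert(16 * 8 == 4 * 32, "operand and result sizes must match");
  assert(Result.size() < Src.size() && "result must have fewer lanes");

  // Zero-extend each of the low input lanes into the corresponding result
  // lane. Input lanes 4..15 have no corresponding result lane and are dropped.
  for (std::size_t i = 0; i < Result.size(); ++i)
    Result[i] = Src[i]; // uint8_t -> uint32_t conversion zero-extends

  return Result;
}
```

Calling this with an input whose low four lanes are 0xFF, 0x01, 0x80, 0x07 gives those same values widened to 32 bits, whatever the remaining lanes hold.
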
> It also includes generic expansion of this node in terms of blending
> a zero vector into the high elements of the vector and bitcasting
> across. This in turn yields extremely nice code for x86 SSE2 when we use
> the new widening legalization logic in conjunction with the new shuffle
> lowering logic.
> 
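
The generic expansion described above comes down to computing one shuffle mask. The sketch below lifts the mask arithmetic from ExpandZERO_EXTEND_VECTOR_INREG in the patch into a standalone function so it can be run without LLVM. The name buildZExtInRegShuffleMask and the std::vector return type are assumptions for illustration. As in the patch, the zero vector is the first shuffle operand, so mask values below NumSrcElements select zero lanes and NumSrcElements + i selects source lane i.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone version of the mask computation used by the
// generic expansion. Illustrative only; this is not part of the LLVM API.
std::vector<int> buildZExtInRegShuffleMask(int NumSrcElements, int NumElements,
                                           bool IsBigEndian) {
  assert(NumElements > 0 && NumElements < NumSrcElements &&
         "result must have fewer lanes than the operand");
  assert(NumSrcElements % NumElements == 0 &&
         "lane counts must divide evenly");

  // Start with every lane taken from the zero vector (the first operand).
  std::vector<int> Mask(NumSrcElements);
  for (int i = 0; i < NumSrcElements; ++i)
    Mask[i] = i;

  // Each wide result lane spans ExtLaneScale narrow lanes. Place source lane
  // i at the least significant narrow lane of wide lane i, which is the first
  // narrow lane on little-endian targets and the last on big-endian ones.
  int ExtLaneScale = NumSrcElements / NumElements;
  int EndianOffset = IsBigEndian ? ExtLaneScale - 1 : 0;
  for (int i = 0; i < NumElements; ++i)
    Mask[i * ExtLaneScale + EndianOffset] = NumSrcElements + i;

  return Mask;
}
```

For v16i8 -> v4i32 on a little-endian target the mask is 16,1,2,3, 17,5,6,7, 18,9,10,11, 19,13,14,15: each source byte is followed by three zero bytes, and the bitcast to v4i32 then reads each group of four bytes as one zero-extended lane.
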
> There is still more to do here. We need to support sign extension, any
> extension, and potentially int-to-float conversions. My current plan is
> to continue using similar synthetic nodes to model each of these
> transitions with generic lowering code for each one.
> 
> However, with this patch LLVM already reaches performance parity with
> GCC for the core C loops of the x264 code (assuming you disable the
> hand-written assembly versions) when compiling for SSE2 and SSE3
> architectures and enabling the new widening and lowering logic for
> vectors.
> 
> Differential Revision: http://reviews.llvm.org/D4405
> 
> Added:
>     llvm/trunk/test/CodeGen/X86/widen_conversions.ll
> Modified:
>     llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
>     llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
>     llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h
>     llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
>     llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
>     llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>     llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
>     llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> 
> Modified: llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h Wed Jul  9 05:58:18 2014
> @@ -379,6 +379,15 @@ namespace ISD {
>      /// operand, a ValueType node.
>      SIGN_EXTEND_INREG,
>  
> +    /// ZERO_EXTEND_VECTOR_INREG(Vector) - This operator represents an
> +    /// in-register zero-extension of the low lanes of an integer vector. The
> +    /// result type must have fewer elements than the operand type, and those
> +    /// elements must be larger integer types such that the total size of the
> +    /// operand type and the result type match. Each of the low operand
> +    /// elements is zero-extended into the corresponding, wider result
> +    /// elements.
> +    ZERO_EXTEND_VECTOR_INREG,
> +
>      /// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned
>      /// integer.
>      FP_TO_SINT,
> 
> Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Wed Jul  9 05:58:18 2014
> @@ -562,6 +562,12 @@ public:
>    /// value assuming it was the smaller SrcTy value.
>    SDValue getZeroExtendInReg(SDValue Op, SDLoc DL, EVT SrcTy);
>  
> +  /// getZeroExtendVectorInReg - Return an operation which will zero extend the
> +  /// low lanes of the operand into the specified vector type. For example,
> +  /// this can convert a v16i8 into a v4i32 by zero extending the low four
> +  /// lanes of the operand from i8 to i32.
> +  SDValue getZeroExtendVectorInReg(SDValue Op, SDLoc DL, EVT VT);
> +
>    /// getBoolExtOrTrunc - Convert Op, which must be of integer type, to the
>    /// integer type VT, by using an extension appropriate for the target's
>    /// BooleanContent or truncating it.
> 
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h Wed Jul  9 05:58:18 2014
> @@ -649,6 +649,7 @@ private:
>    SDValue WidenVecOp_EXTRACT_SUBVECTOR(SDNode *N);
>    SDValue WidenVecOp_STORE(SDNode* N);
>    SDValue WidenVecOp_SETCC(SDNode* N);
> +  SDValue WidenVecOp_ZERO_EXTEND(SDNode *N);
>  
>    SDValue WidenVecOp_Convert(SDNode *N);
>  
> 
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp Wed Jul  9 05:58:18 2014
> @@ -75,6 +75,12 @@ class VectorLegalizer {
>    /// \brief Implement expansion for SIGN_EXTEND_INREG using SRL and SRA.
>    SDValue ExpandSEXTINREG(SDValue Op);
>  
> +  /// \brief Implement expansion for ZERO_EXTEND_VECTOR_INREG.
> +  ///
> +  /// Shuffles the low lanes of the operand into place and blends zeros into
> +  /// the remaining lanes, finally bitcasting to the proper type.
> +  SDValue ExpandZERO_EXTEND_VECTOR_INREG(SDValue Op);
> +
>    /// \brief Expand bswap of vectors into a shuffle if legal.
>    SDValue ExpandBSWAP(SDValue Op);
>  
> @@ -274,6 +280,7 @@ SDValue VectorLegalizer::LegalizeOp(SDVa
>    case ISD::FP_EXTEND:
>    case ISD::FMA:
>    case ISD::SIGN_EXTEND_INREG:
> +  case ISD::ZERO_EXTEND_VECTOR_INREG:
>      QueryType = Node->getValueType(0);
>      break;
>    case ISD::FP_ROUND_INREG:
> @@ -614,6 +621,8 @@ SDValue VectorLegalizer::Expand(SDValue
>    switch (Op->getOpcode()) {
>    case ISD::SIGN_EXTEND_INREG:
>      return ExpandSEXTINREG(Op);
> +  case ISD::ZERO_EXTEND_VECTOR_INREG:
> +    return ExpandZERO_EXTEND_VECTOR_INREG(Op);
>    case ISD::BSWAP:
>      return ExpandBSWAP(Op);
>    case ISD::VSELECT:
> @@ -708,6 +717,39 @@ SDValue VectorLegalizer::ExpandSEXTINREG
>    return DAG.getNode(ISD::SRA, DL, VT, Op, ShiftSz);
>  }
>  
> +// Generically expand a vector zext in register to a shuffle of the relevant
> +// lanes into the appropriate locations, a blend of zero into the high bits,
> +// and a bitcast to the wider element type.
> +SDValue VectorLegalizer::ExpandZERO_EXTEND_VECTOR_INREG(SDValue Op) {
> +  SDLoc DL(Op);
> +  EVT VT = Op.getValueType();
> +  int NumElements = VT.getVectorNumElements();
> +  SDValue Src = Op.getOperand(0);
> +  EVT SrcVT = Src.getValueType();
> +  int NumSrcElements = SrcVT.getVectorNumElements();
> +
> +  // Build up a zero vector to blend into this one.
> +  EVT SrcScalarVT = SrcVT.getScalarType();
> +  SDValue ScalarZero = DAG.getTargetConstant(0, SrcScalarVT);
> +  SmallVector<SDValue, 4> BuildVectorOperands(NumSrcElements, ScalarZero);
> +  SDValue Zero = DAG.getNode(ISD::BUILD_VECTOR, DL, SrcVT, BuildVectorOperands);
> +
> +  // Shuffle the incoming lanes into the correct position, and pull all other
> +  // lanes from the zero vector.
> +  SmallVector<int, 16> ShuffleMask;
> +  ShuffleMask.reserve(NumSrcElements);
> +  for (int i = 0; i < NumSrcElements; ++i)
> +    ShuffleMask.push_back(i);
> +
> +  int ExtLaneScale = NumSrcElements / NumElements;
> +  int EndianOffset = TLI.isBigEndian() ? ExtLaneScale - 1 : 0;
> +  for (int i = 0; i < NumElements; ++i)
> +    ShuffleMask[i * ExtLaneScale + EndianOffset] = NumSrcElements + i;
> +
> +  return DAG.getNode(ISD::BITCAST, DL, VT,
> +                     DAG.getVectorShuffle(SrcVT, DL, Zero, Src, ShuffleMask));
> +}
> +
>  SDValue VectorLegalizer::ExpandBSWAP(SDValue Op) {
>    EVT VT = Op.getValueType();
>  
> 
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Wed Jul  9 05:58:18 2014
> @@ -2380,6 +2380,7 @@ bool DAGTypeLegalizer::WidenVectorOperan
>    case ISD::EXTRACT_VECTOR_ELT: Res = WidenVecOp_EXTRACT_VECTOR_ELT(N); break;
>    case ISD::STORE:              Res = WidenVecOp_STORE(N); break;
>    case ISD::SETCC:              Res = WidenVecOp_SETCC(N); break;
> +  case ISD::ZERO_EXTEND:        Res = WidenVecOp_ZERO_EXTEND(N); break;
>  
>    case ISD::FP_EXTEND:
>    case ISD::FP_TO_SINT:
> @@ -2388,7 +2389,6 @@ bool DAGTypeLegalizer::WidenVectorOperan
>    case ISD::UINT_TO_FP:
>    case ISD::TRUNCATE:
>    case ISD::SIGN_EXTEND:
> -  case ISD::ZERO_EXTEND:
>    case ISD::ANY_EXTEND:
>      Res = WidenVecOp_Convert(N);
>      break;
> @@ -2410,6 +2410,26 @@ bool DAGTypeLegalizer::WidenVectorOperan
>    return false;
>  }
>  
> +SDValue DAGTypeLegalizer::WidenVecOp_ZERO_EXTEND(SDNode *N) {
> +  SDLoc DL(N);
> +  EVT VT = N->getValueType(0);
> +  unsigned NumElts = VT.getVectorNumElements();
> +
> +  SDValue InOp = N->getOperand(0);
> +  // If some legalization strategy other than widening is used on the operand,
> +  // we can't safely assume that just zero-extending the low lanes is the
> +  // correct transformation.
> +  if (getTypeAction(InOp.getValueType()) != TargetLowering::TypeWidenVector)
> +    return WidenVecOp_Convert(N);
> +  InOp = GetWidenedVector(InOp);
> +  EVT InVT = InOp.getValueType();
> +  assert(NumElts < InVT.getVectorNumElements() && "Input wasn't widened!");
> +
> +  // Use a special DAG node to represent the operation of zero extending the
> +  // low lanes.
> +  return DAG.getZeroExtendVectorInReg(InOp, DL, VT);
> +}
> +
>  SDValue DAGTypeLegalizer::WidenVecOp_Convert(SDNode *N) {
>    // Since the result is legal and the input is illegal, it is unlikely
>    // that we can fix the input to a legal type so unroll the convert
> 
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Jul  9 05:58:18 2014
> @@ -1032,6 +1032,13 @@ SDValue SelectionDAG::getZeroExtendInReg
>                   getConstant(Imm, Op.getValueType()));
>  }
>  
> +SDValue SelectionDAG::getZeroExtendVectorInReg(SDValue Op, SDLoc DL, EVT VT) {
> +  assert(VT.isVector() && "This DAG node is restricted to vector types.");
> +  assert(VT.getVectorNumElements() < Op.getValueType().getVectorNumElements() &&
> +         "The destination vector type must have fewer lanes than the input.");
> +  return getNode(ISD::ZERO_EXTEND_VECTOR_INREG, DL, VT, Op);
> +}
> +
>  /// getNOT - Create a bitwise NOT operation as (XOR Val, -1).
>  ///
>  SDValue SelectionDAG::getNOT(SDLoc DL, SDValue Val, EVT VT) {
> 
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp Wed Jul  9 05:58:18 2014
> @@ -221,6 +221,7 @@ std::string SDNode::getOperationName(con
>    case ISD::ZERO_EXTEND:                return "zero_extend";
>    case ISD::ANY_EXTEND:                 return "any_extend";
>    case ISD::SIGN_EXTEND_INREG:          return "sign_extend_inreg";
> +  case ISD::ZERO_EXTEND_VECTOR_INREG:   return "zero_extend_vector_inreg";
>    case ISD::TRUNCATE:                   return "truncate";
>    case ISD::FP_ROUND:                   return "fp_round";
>    case ISD::FLT_ROUNDS_:                return "flt_rounds";
> 
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=212610&r1=212609&r2=212610&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Jul  9 05:58:18 2014
> @@ -869,6 +869,7 @@ void X86TargetLowering::resetOperationAc
>      setOperationAction(ISD::TRUNCATE, VT, Expand);
>      setOperationAction(ISD::SIGN_EXTEND, VT, Expand);
>      setOperationAction(ISD::ZERO_EXTEND, VT, Expand);
> +    setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);
>      setOperationAction(ISD::ANY_EXTEND, VT, Expand);
>      setOperationAction(ISD::VSELECT, VT, Expand);
>      setOperationAction(ISD::SELECT_CC, VT, Expand);
> 
> Added: llvm/trunk/test/CodeGen/X86/widen_conversions.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_conversions.ll?rev=212610&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/widen_conversions.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/widen_conversions.ll Wed Jul  9 05:58:18 2014
> @@ -0,0 +1,18 @@
> +; RUN: llc < %s -mcpu=x86-64 -x86-experimental-vector-widening-legalization -x86-experimental-vector-shuffle-lowering | FileCheck %s
> +
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> +target triple = "x86_64-unknown-unknown"
> +
> +define <4 x i32> @zext_v4i8_to_v4i32(<4 x i8>* %ptr) {
> +; CHECK-LABEL: zext_v4i8_to_v4i32:
> +; 
> +; CHECK:      movd (%{{.*}}), %[[X:xmm[0-9]+]]
> +; CHECK-NEXT: pxor %[[Z:xmm[0-9]+]], %[[Z]]
> +; CHECK-NEXT: punpcklbw %[[Z]], %[[X]]
> +; CHECK-NEXT: punpcklbw %[[Z]], %[[X]]
> +; CHECK-NEXT: ret
> +
> +  %val = load <4 x i8>* %ptr
> +  %ext = zext <4 x i8> %val to <4 x i32>
> +  ret <4 x i32> %ext
> +}
> 
> 