[llvm] r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

Wed Jul 22 01:54:29 PDT 2015

Hi Mikhail,

Thanks for the comments. Please see my responses inline.

Regards,
Shahid

> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
> bounces at cs.uiuc.edu] On Behalf Of Mikhail Zolotukhin
> Sent: Tuesday, July 21, 2015 12:02 AM
> To: James Molloy
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm] r242409 - [Codegen] Add intrinsics 'absdiff' and
> corresponding SDNodes for absolute difference operation
> 
> 
> > On Jul 16, 2015, at 8:22 AM, James Molloy <James.Molloy at arm.com>
> wrote:
> >
> > Author: jamesm
> > Date: Thu Jul 16 10:22:46 2015
> > New Revision: 242409
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=242409&view=rev
> > Log:
> > [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for
> > absolute difference operation
> >
> > This adds new intrinsics "*absdiff" for absolute difference ops to facilitate
> efficient code generation for "sum of absolute differences" operation.
> > The patch also contains the introduction of corresponding SDNodes and
> basic legalization support.Sanity of the generated code is tested on X86.
> >
> > This is 1st of the three patches.
> >
> > Patch by Shahid Asghar-ahmad!
> >
> > Added:
> >    llvm/trunk/test/CodeGen/X86/absdiff_expand.ll
> > Modified:
> >    llvm/trunk/docs/LangRef.rst
> >    llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
> >    llvm/trunk/include/llvm/IR/Intrinsics.td
> >    llvm/trunk/include/llvm/Target/TargetSelectionDAG.td
> >    llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
> >    llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> >    llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> >    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> >    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> >    llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp
> >
> > Modified: llvm/trunk/docs/LangRef.rst
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=24
> > 2409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/docs/LangRef.rst (original)
> > +++ llvm/trunk/docs/LangRef.rst Thu Jul 16 10:22:46 2015
> > @@ -10328,6 +10328,65 @@ Examples:
> >
> >       %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
> > ; yields float:r2 = (a * b) + c
> >
> > +
> > +'``llvm.uabsdiff.*``' and '``llvm.sabsdiff.*``' Intrinsics
> >
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ^
> > +
> > +Syntax:
> > +"""""""
> > +This is an overloaded intrinsic. The loaded data is a vector of any integer
> bit width.
> > +
> > +.. code-block:: llvm
> > +
> > +      declare <4 x integer> @llvm.uabsdiff.v4i32(<4 x integer> %a, <4
> > + x integer> %b)
> > +
> > +
> > +Overview:
> > +"""""""""
> > +
> > +The ``llvm.uabsdiff`` intrinsic returns a vector result of the
> > +absolute difference of the two operands, treating them both as unsigned
> integers.
> > +
> > +The ``llvm.sabsdiff`` intrinsic returns  a vector result of the
> > +absolute difference of the two operands, treating them both as signed
> integers.
> > +
> > +.. note::
> > +
> > +    These intrinsics are primarily used during the code generation stage of
> compilation.
> > +    They are generated by compiler passes such as the Loop and SLP
> vectorizers.it is not
> > +    recommended for users to create them manually.
> > +
> > +Arguments:
> > +""""""""""
> > +
> > +Both intrinsics take two integer of the same bitwidth.
> > +
> > +Semantics:
> > +""""""""""
> > +
> > +The expression::
> > +
> > +    call <4 x i32> @llvm.uabsdiff.v4i32(<4 x i32> %a, <4 x i32> %b)
> > +
> > +is equivalent to::
> > +
> > +    %sub = sub <4 x i32> %a, %b
> > +    %ispos = icmp ugt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32
> > + -1>
> Isn't it always 'false'?
Oh yes, this will always be false, since the comparison is unsigned.
Because the subtraction of two unsigned numbers can wrap to what is effectively a negative value, how about changing this comparison to
"icmp sge <4 x i32> %sub, zeroinitializer"?

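To double-check the point about the unsigned compare, here is a small one-lane scalar model in C++. It is only a sketch: the helper names (ugt_all_ones, uabsdiff_via_sge, uabsdiff_reference) are made up for illustration and are not part of the patch. It shows that an unsigned "greater than -1" test can never hold, and it puts the "sge 0" form next to a reference that compares the operands directly.

```cpp
#include <cstdint>

// One-lane scalar model; all names here are hypothetical, for illustration only.

// Models `icmp ugt i32 %sub, -1`: -1 is the largest unsigned 32-bit value,
// so no 32-bit value can be strictly greater than it.
bool ugt_all_ones(uint32_t sub) { return sub > UINT32_MAX; }

// Models the expansion with `icmp sge i32 %sub, 0`: sub, compare, neg, select.
uint32_t uabsdiff_via_sge(uint32_t a, uint32_t b) {
  uint32_t sub = a - b;            // wraps modulo 2^32
  bool ispos = (sub >> 31) == 0;   // sign bit clear, i.e. sub >= 0 as a signed value
  uint32_t neg = 0u - sub;         // also wraps modulo 2^32
  return ispos ? sub : neg;
}

// Reference: compare the operands themselves, then subtract the smaller one.
uint32_t uabsdiff_reference(uint32_t a, uint32_t b) {
  return a > b ? a - b : b - a;
}
```

In this model the two functions agree whenever the true difference fits in the signed 32-bit range, for example (3, 5) and (5, 3) both give 2. They differ for larger differences, for example (UINT32_MAX, 0), so the documented semantics may need to state that assumption.
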
> > +    %neg = sub <4 x i32> zeroinitializer, %sub
> > +    %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
> > +
> > +Similarly the expression::
> > +
> > +    call <4 x i32> @llvm.sabsdiff.v4i32(<4 x i32> %a, <4 x i32> %b)
> > +
> > +is equivalent to::
> > +
> > +    %sub = sub nsw <4 x i32> %a, %b
> > +    %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32
> > + -1>
> Wouldn't it be more readable if we use "icmp sge <4 x i32> %sub,
> zeroinitializer"?
Yes, that will do.
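
For reference, a one-lane scalar model in C++ of the two signed predicates. This is only a sketch with made-up helper names (ispos_sgt_minus_one, ispos_sge_zero, sabsdiff_model), none of which are in the patch. It is meant to show that "sgt -1" and "sge 0" select the same lanes, so the change affects readability only.

```cpp
#include <cstdint>

// One-lane scalar model; all names here are hypothetical, for illustration only.

// Models `icmp sgt i32 %sub, -1`.
bool ispos_sgt_minus_one(int32_t sub) { return sub > -1; }

// Models `icmp sge i32 %sub, 0`.
bool ispos_sge_zero(int32_t sub) { return sub >= 0; }

// Models the documented expansion using the `sge 0` form. Like `sub nsw`,
// it assumes that neither a - b nor 0 - sub overflows.
int32_t sabsdiff_model(int32_t a, int32_t b) {
  int32_t sub = a - b;
  int32_t neg = 0 - sub;
  return ispos_sge_zero(sub) ? sub : neg;
}
```
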

> > +    %neg = sub nsw <4 x i32> zeroinitializer, %sub
> > +    %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
> > +
> > +
> > Half Precision Floating Point Intrinsics
> > ----------------------------------------
> >
> >
> > Modified: llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/IS
> > DOpcodes.h?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h (original)
> > +++ llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h Thu Jul 16 10:22:46
> > +++ 2015
> > @@ -334,6 +334,10 @@ namespace ISD {
> >     /// Byte Swap and Counting operators.
> >     BSWAP, CTTZ, CTLZ, CTPOP,
> >
> > +    /// [SU]ABSDIFF - Signed/Unsigned absolute difference of two input
> integer
> > +    /// vector. These nodes are generated from llvm.*absdiff* intrinsics.
> > +    SABSDIFF, UABSDIFF,
> > +
> >     /// Bit counting operators with an undefined result for zero inputs.
> >     CTTZ_ZERO_UNDEF, CTLZ_ZERO_UNDEF,
> >
> >
> > Modified: llvm/trunk/include/llvm/IR/Intrinsics.td
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrins
> > ics.td?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/include/llvm/IR/Intrinsics.td (original)
> > +++ llvm/trunk/include/llvm/IR/Intrinsics.td Thu Jul 16 10:22:46 2015
> > @@ -605,6 +605,12 @@ def int_convertuu  : Intrinsic<[llvm_any def
> > int_clear_cache : Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty],
> >                                 [], "llvm.clear_cache">;
> >
> > +// Calculate the Absolute Differences of the two input vectors.
> > +def int_sabsdiff : Intrinsic<[llvm_anyvector_ty],
> > +                        [ LLVMMatchType<0>, LLVMMatchType<0> ],
> > +[IntrNoMem]>; def int_uabsdiff : Intrinsic<[llvm_anyvector_ty],
> > +                        [ LLVMMatchType<0>, LLVMMatchType<0> ],
> > +[IntrNoMem]>;
> > +
> > //===-------------------------- Masked Intrinsics
> > -------------------------===// // def int_masked_store : Intrinsic<[],
> > [llvm_anyvector_ty, LLVMPointerTo<0>,
> >
> > Modified: llvm/trunk/include/llvm/Target/TargetSelectionDAG.td
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/Tar
> > getSelectionDAG.td?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/include/llvm/Target/TargetSelectionDAG.td (original)
> > +++ llvm/trunk/include/llvm/Target/TargetSelectionDAG.td Thu Jul 16
> > +++ 10:22:46 2015
> > @@ -386,6 +386,8 @@ def smax       : SDNode<"ISD::SMAX"
> > def umin       : SDNode<"ISD::UMIN"      , SDTIntBinOp>;
> > def umax       : SDNode<"ISD::UMAX"      , SDTIntBinOp>;
> >
> > +def sabsdiff   : SDNode<"ISD::SABSDIFF"   , SDTIntBinOp>;
> > +def uabsdiff   : SDNode<"ISD::UABSDIFF"   , SDTIntBinOp>;
> > def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
> > def bswap      : SDNode<"ISD::BSWAP"      , SDTIntUnaryOp>;
> > def ctlz       : SDNode<"ISD::CTLZ"       , SDTIntUnaryOp>;
> >
> > Modified:
> llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDA
> >
> G/LegalizeIntegerTypes.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
> > (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Thu
> > +++ Jul 16 10:22:46 2015
> > @@ -146,6 +146,10 @@ void DAGTypeLegalizer::PromoteIntegerRes
> >   case ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS:
> >     Res = PromoteIntRes_AtomicCmpSwap(cast<AtomicSDNode>(N),
> ResNo);
> >     break;
> > +  case ISD::UABSDIFF:
> > +  case ISD::SABSDIFF:
> > +    Res = PromoteIntRes_SimpleIntBinOp(N);
> > +    break;
> >   }
> >
> >   // If the result is null then the sub-method took care of registering it.
> >
> > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDA
> > G/LegalizeVectorOps.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
> > (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp Thu Jul
> > +++ 16 10:22:46 2015
> > @@ -105,6 +105,7 @@ class VectorLegalizer {
> >   SDValue ExpandLoad(SDValue Op);
> >   SDValue ExpandStore(SDValue Op);
> >   SDValue ExpandFNEG(SDValue Op);
> > +  SDValue ExpandABSDIFF(SDValue Op);
> >
> >   /// \brief Implements vector promotion.
> >   ///
> > @@ -326,6 +327,8 @@ SDValue VectorLegalizer::LegalizeOp(SDVa
> >   case ISD::SMAX:
> >   case ISD::UMIN:
> >   case ISD::UMAX:
> > +  case ISD::UABSDIFF:
> > +  case ISD::SABSDIFF:
> >     QueryType = Node->getValueType(0);
> >     break;
> >   case ISD::FP_ROUND_INREG:
> > @@ -708,11 +711,36 @@ SDValue VectorLegalizer::Expand(SDValue
> >     return ExpandFNEG(Op);
> >   case ISD::SETCC:
> >     return UnrollVSETCC(Op);
> > +  case ISD::UABSDIFF:
> > +  case ISD::SABSDIFF:
> > +    return ExpandABSDIFF(Op);
> >   default:
> >     return DAG.UnrollVectorOp(Op.getNode());
> >   }
> > }
> >
> > +SDValue VectorLegalizer::ExpandABSDIFF(SDValue Op) {
> > +  SDLoc dl(Op);
> > +  SDValue Tmp1, Tmp2, Tmp3, Tmp4;
> > +  EVT VT = Op.getValueType();
> > +  SDNodeFlags Flags;
> > +  Flags.setNoSignedWrap(Op->getOpcode() == ISD::SABSDIFF);
> > +
> > +  Tmp2 = Op.getOperand(0);
> > +  Tmp3 = Op.getOperand(1);
> > +  Tmp1 = DAG.getNode(ISD::SUB, dl, VT, Tmp2, Tmp3, &Flags);
> > +  Tmp2 =
> > +      DAG.getNode(ISD::SUB, dl, VT, DAG.getConstant(0, dl, VT), Tmp1,
> > +&Flags);
> > +  Tmp4 = DAG.getNode(
> > +      ISD::SETCC, dl,
> > +      TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(),
> VT), Tmp2,
> > +      DAG.getConstant(0, dl, VT),
> > +      DAG.getCondCode(Op->getOpcode() == ISD::SABSDIFF ? ISD::SETLT
> > +                                                       :
> > +ISD::SETULT));
> > +  Tmp1 = DAG.getNode(ISD::VSELECT, dl, VT, Tmp4, Tmp1, Tmp2);
> > +  return Tmp1;
> > +}
> > +
> > SDValue VectorLegalizer::ExpandSELECT(SDValue Op) {
> >   // Lower a select instruction where the condition is a scalar and the
> >   // operands are vectors. Lower this select to VSELECT and implement
> > it
> >
> > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDA
> > G/LegalizeVectorTypes.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> > (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Thu
> > +++ Jul 16 10:22:46 2015
> > @@ -678,6 +678,8 @@ void DAGTypeLegalizer::SplitVectorResult
> >   case ISD::SMAX:
> >   case ISD::UMIN:
> >   case ISD::UMAX:
> > +  case ISD::UABSDIFF:
> > +  case ISD::SABSDIFF:
> >     SplitVecRes_BinOp(N, Lo, Hi);
> >     break;
> >   case ISD::FMA:
> >
> > Modified:
> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDA
> > G/SelectionDAGBuilder.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> > (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Thu
> > +++ Jul 16 10:22:46 2015
> > @@ -4646,6 +4646,18 @@ SelectionDAGBuilder::visitIntrinsicCall(
> >                              getValue(I.getArgOperand(0)).getValueType(),
> >                              getValue(I.getArgOperand(0))));
> >     return nullptr;
> > +  case Intrinsic::uabsdiff:
> > +    setValue(&I, DAG.getNode(ISD::UABSDIFF, sdl,
> > +                             getValue(I.getArgOperand(0)).getValueType(),
> > +                             getValue(I.getArgOperand(0)),
> > +                             getValue(I.getArgOperand(1))));
> > +    return nullptr;
> > +  case Intrinsic::sabsdiff:
> > +    setValue(&I, DAG.getNode(ISD::SABSDIFF, sdl,
> > +                             getValue(I.getArgOperand(0)).getValueType(),
> > +                             getValue(I.getArgOperand(0)),
> > +                             getValue(I.getArgOperand(1))));
> > +    return nullptr;
> >   case Intrinsic::cttz: {
> >     SDValue Arg = getValue(I.getArgOperand(0));
> >     ConstantInt *CI = cast<ConstantInt>(I.getArgOperand(1));
> >
> > Modified:
> llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDA
> >
> G/SelectionDAGDumper.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> > (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp Thu
> Jul
> > +++ 16 10:22:46 2015
> > @@ -225,6 +225,8 @@ std::string SDNode::getOperationName(con
> >   case ISD::SHL_PARTS:                  return "shl_parts";
> >   case ISD::SRA_PARTS:                  return "sra_parts";
> >   case ISD::SRL_PARTS:                  return "srl_parts";
> > +  case ISD::UABSDIFF:                   return "uabsdiff";
> > +  case ISD::SABSDIFF:                   return "sabsdiff";
> >
> >   // Conversion operators.
> >   case ISD::SIGN_EXTEND:                return "sign_extend";
> >
> > Modified: llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetLower
> > ingBase.cpp?rev=242409&r1=242408&r2=242409&view=diff
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp (original)
> > +++ llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp Thu Jul 16 10:22:46
> > +++ 2015
> > @@ -827,6 +827,8 @@ void TargetLoweringBase::initActions() {
> >     setOperationAction(ISD::USUBO, VT, Expand);
> >     setOperationAction(ISD::SMULO, VT, Expand);
> >     setOperationAction(ISD::UMULO, VT, Expand);
> > +    setOperationAction(ISD::UABSDIFF, VT, Expand);
> > +    setOperationAction(ISD::SABSDIFF, VT, Expand);
> >
> >     // These library functions default to expand.
> >     setOperationAction(ISD::FROUND, VT, Expand);
> >
> > Added: llvm/trunk/test/CodeGen/X86/absdiff_expand.ll
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/absdif
> > f_expand.ll?rev=242409&view=auto
> >
> ==========================================================
> ============
> > ========
> > --- llvm/trunk/test/CodeGen/X86/absdiff_expand.ll (added)
> > +++ llvm/trunk/test/CodeGen/X86/absdiff_expand.ll Thu Jul 16 10:22:46
> > +++ 2015
> > @@ -0,0 +1,242 @@
> > +; RUN: llc -mtriple=x86_64-unknown-linux-gnu  < %s | FileCheck %s
> > +-check-prefix=CHECK
> > +
> > +declare <4 x i8> @llvm.uabsdiff.v4i8(<4 x i8>, <4 x i8>)
> > +
> > +define <4 x i8> @test_uabsdiff_v4i8_expand(<4 x i8> %a1, <4 x i8>
> > +%a2) { ; CHECK-LABEL: test_uabsdiff_v4i8_expand
> > +; CHECK:             psubd  %xmm1, %xmm0
> > +; CHECK-NEXT:        pxor   %xmm1, %xmm1
> > +; CHECK-NEXT:        psubd  %xmm0, %xmm1
> > +; CHECK-NEXT:        movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:        movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:        pxor   %xmm2, %xmm3
> > +; CHECK-NEXT:        pcmpgtd        %xmm3, %xmm2
> > +; CHECK-NEXT:        pand    %xmm2, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm1, %xmm2
> > +; CHECK-NEXT:        por     %xmm2, %xmm0
> > +; CHECK-NEXT:        retq
> > +
> > +  %1 = call <4 x i8> @llvm.uabsdiff.v4i8(<4 x i8> %a1, <4 x i8> %a2)
> > +  ret <4 x i8> %1
> > +}
> > +
> > +declare <4 x i8> @llvm.sabsdiff.v4i8(<4 x i8>, <4 x i8>)
> > +
> > +define <4 x i8> @test_sabsdiff_v4i8_expand(<4 x i8> %a1, <4 x i8>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v4i8_expand
> > +; CHECK:      psubd  %xmm1, %xmm0
> > +; CHECK-NEXT: pxor   %xmm1, %xmm1
> > +; CHECK-NEXT: pxor    %xmm2, %xmm2
> > +; CHECK-NEXT: psubd  %xmm0, %xmm2
> > +; CHECK-NEXT: pcmpgtd  %xmm2, %xmm1
> > +; CHECK-NEXT: pand    %xmm1, %xmm0
> > +; CHECK-NEXT: pandn   %xmm2, %xmm1
> > +; CHECK-NEXT: por     %xmm1, %xmm0
> > +; CHECK-NEXT: retq
> > +
> > +  %1 = call <4 x i8> @llvm.sabsdiff.v4i8(<4 x i8> %a1, <4 x i8> %a2)
> > +  ret <4 x i8> %1
> > +}
> > +
> > +
> > +declare <8 x i8> @llvm.sabsdiff.v8i8(<8 x i8>, <8 x i8>)
> > +
> > +define <8 x i8> @test_sabsdiff_v8i8_expand(<8 x i8> %a1, <8 x i8>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v8i8_expand
> > +; CHECK:      psubw  %xmm1, %xmm0
> > +; CHECK-NEXT: pxor   %xmm1, %xmm1
> > +; CHECK-NEXT: pxor   %xmm2, %xmm2
> > +; CHECK-NEXT: psubw  %xmm0, %xmm2
> > +; CHECK-NEXT: pcmpgtw        %xmm2, %xmm1
> > +; CHECK-NEXT: pand  %xmm1, %xmm0
> > +; CHECK-NEXT: pandn %xmm2, %xmm1
> > +; CHECK-NEXT: por  %xmm1, %xmm0
> > +; CHECK-NEXT: retq
> > +  %1 = call <8 x i8> @llvm.sabsdiff.v8i8(<8 x i8> %a1, <8 x i8> %a2)
> > +  ret <8 x i8> %1
> > +}
> > +
> > +declare <16 x i8> @llvm.uabsdiff.v16i8(<16 x i8>, <16 x i8>)
> > +
> > +define <16 x i8> @test_uabsdiff_v16i8_expand(<16 x i8> %a1, <16 x i8>
> > +%a2) { ; CHECK-LABEL: test_uabsdiff_v16i8_expand
> > +; CHECK:             psubb  %xmm1, %xmm0
> > +; CHECK-NEXT:        pxor   %xmm1, %xmm1
> > +; CHECK-NEXT:        psubb  %xmm0, %xmm1
> > +; CHECK-NEXT:        movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:        movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:        pxor   %xmm2, %xmm3
> > +; CHECK-NEXT:        pcmpgtb        %xmm3, %xmm2
> > +; CHECK-NEXT:        pand    %xmm2, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm1, %xmm2
> > +; CHECK-NEXT:        por     %xmm2, %xmm0
> > +; CHECK-NEXT:        retq
> > +  %1 = call <16 x i8> @llvm.uabsdiff.v16i8(<16 x i8> %a1, <16 x i8>
> > +%a2)
> > +  ret <16 x i8> %1
> > +}
> > +
> > +declare <8 x i16> @llvm.uabsdiff.v8i16(<8 x i16>, <8 x i16>)
> > +
> > +define <8 x i16> @test_uabsdiff_v8i16_expand(<8 x i16> %a1, <8 x i16>
> > +%a2) { ; CHECK-LABEL: test_uabsdiff_v8i16_expand
> > +; CHECK:             psubw  %xmm1, %xmm0
> > +; CHECK-NEXT:        pxor   %xmm1, %xmm1
> > +; CHECK-NEXT:        psubw  %xmm0, %xmm1
> > +; CHECK-NEXT:        movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:        movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:        pxor   %xmm2, %xmm3
> > +; CHECK-NEXT:        pcmpgtw        %xmm3, %xmm2
> > +; CHECK-NEXT:        pand    %xmm2, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm1, %xmm2
> > +; CHECK-NEXT:        por     %xmm2, %xmm0
> > +; CHECK-NEXT:        retq
> > +  %1 = call <8 x i16> @llvm.uabsdiff.v8i16(<8 x i16> %a1, <8 x i16>
> > +%a2)
> > +  ret <8 x i16> %1
> > +}
> > +
> > +declare <8 x i16> @llvm.sabsdiff.v8i16(<8 x i16>, <8 x i16>)
> > +
> > +define <8 x i16> @test_sabsdiff_v8i16_expand(<8 x i16> %a1, <8 x i16>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v8i16_expand
> > +; CHECK:      psubw  %xmm1, %xmm0
> > +; CHECK-NEXT: pxor   %xmm1, %xmm1
> > +; CHECK-NEXT: pxor   %xmm2, %xmm2
> > +; CHECK-NEXT: psubw  %xmm0, %xmm2
> > +; CHECK-NEXT: pcmpgtw        %xmm2, %xmm1
> > +; CHECK-NEXT: pand  %xmm1, %xmm0
> > +; CHECK-NEXT: pandn %xmm2, %xmm1
> > +; CHECK-NEXT: por  %xmm1, %xmm0
> > +; CHECK-NEXT: retq
> > +  %1 = call <8 x i16> @llvm.sabsdiff.v8i16(<8 x i16> %a1, <8 x i16>
> > +%a2)
> > +  ret <8 x i16> %1
> > +}
> > +
> > +declare <4 x i32> @llvm.sabsdiff.v4i32(<4 x i32>, <4 x i32>)
> > +
> > +define <4 x i32> @test_sabsdiff_v4i32_expand(<4 x i32> %a1, <4 x i32>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v4i32_expand
> > +; CHECK:             psubd  %xmm1, %xmm0
> > +; CHECK-NEXT:        pxor  %xmm1, %xmm1
> > +; CHECK-NEXT:        pxor  %xmm2, %xmm2
> > +; CHECK-NEXT:        psubd  %xmm0, %xmm2
> > +; CHECK-NEXT:        pcmpgtd        %xmm2, %xmm1
> > +; CHECK-NEXT:        pand    %xmm1, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm2, %xmm1
> > +; CHECK-NEXT:        por    %xmm1, %xmm0
> > +; CHECK-NEXT:        retq
> > +  %1 = call <4 x i32> @llvm.sabsdiff.v4i32(<4 x i32> %a1, <4 x i32>
> > +%a2)
> > +  ret <4 x i32> %1
> > +}
> > +
> > +declare <4 x i32> @llvm.uabsdiff.v4i32(<4 x i32>, <4 x i32>)
> > +
> > +define <4 x i32> @test_uabsdiff_v4i32_expand(<4 x i32> %a1, <4 x i32>
> > +%a2) { ; CHECK-LABEL: test_uabsdiff_v4i32_expand
> > +; CHECK:             psubd  %xmm1, %xmm0
> > +; CHECK-NEXT:        pxor   %xmm1, %xmm1
> > +; CHECK-NEXT:        psubd  %xmm0, %xmm1
> > +; CHECK-NEXT:        movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:        movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:        pxor   %xmm2, %xmm3
> > +; CHECK-NEXT:        pcmpgtd        %xmm3, %xmm2
> > +; CHECK-NEXT:        pand    %xmm2, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm1, %xmm2
> > +; CHECK-NEXT:        por     %xmm2, %xmm0
> > +; CHECK-NEXT:        retq
> > +  %1 = call <4 x i32> @llvm.uabsdiff.v4i32(<4 x i32> %a1, <4 x i32>
> > +%a2)
> > +  ret <4 x i32> %1
> > +}
> > +
> > +declare <2 x i32> @llvm.sabsdiff.v2i32(<2 x i32>, <2 x i32>)
> > +
> > +define <2 x i32> @test_sabsdiff_v2i32_expand(<2 x i32> %a1, <2 x i32>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v2i32_expand
> > +; CHECK:        psubq   %xmm1, %xmm0
> > +; CHECK-NEXT:   pxor    %xmm1, %xmm1
> > +; CHECK-NEXT:   psubq   %xmm0, %xmm1
> > +; CHECK-NEXT:   movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:   movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:   pxor    %xmm2, %xmm3
> > +; CHECK-NEXT:   movdqa  %xmm2, %xmm4
> > +; CHECK-NEXT:   pcmpgtd %xmm3, %xmm4
> > +; CHECK-NEXT:   pshufd  $160, %xmm4, %xmm5      # xmm5 =
> xmm4[0,0,2,2]
> > +; CHECK-NEXT:   pcmpeqd %xmm2, %xmm3
> > +; CHECK-NEXT:   pshufd  $245, %xmm3, %xmm2      # xmm2 =
> xmm3[1,1,3,3]
> > +; CHECK-NEXT:   pand    %xmm5, %xmm2
> > +; CHECK-NEXT:   pshufd  $245, %xmm4, %xmm3      # xmm3 =
> xmm4[1,1,3,3]
> > +; CHECK-NEXT:   por     %xmm2, %xmm3
> > +; CHECK-NEXT:   pand    %xmm3, %xmm0
> > +; CHECK-NEXT:   pandn   %xmm1, %xmm3
> > +; CHECK-NEXT:   por     %xmm3, %xmm0
> > +; CHECK-NEXT:   retq
> > +  %1 = call <2 x i32> @llvm.sabsdiff.v2i32(<2 x i32> %a1, <2 x i32>
> > +%a2)
> > +  ret <2 x i32> %1
> > +}
> > +
> > +declare <2 x i64> @llvm.sabsdiff.v2i64(<2 x i64>, <2 x i64>)
> > +
> > +define <2 x i64> @test_sabsdiff_v2i64_expand(<2 x i64> %a1, <2 x i64>
> > +%a2) { ; CHECK-LABEL: test_sabsdiff_v2i64_expand
> > +; CHECK:        psubq   %xmm1, %xmm0
> > +; CHECK-NEXT:   pxor    %xmm1, %xmm1
> > +; CHECK-NEXT:   psubq   %xmm0, %xmm1
> > +; CHECK-NEXT:   movdqa  .LCPI{{[0-9_]*}}
> > +; CHECK-NEXT:   movdqa  %xmm1, %xmm3
> > +; CHECK-NEXT:   pxor    %xmm2, %xmm3
> > +; CHECK-NEXT:   movdqa  %xmm2, %xmm4
> > +; CHECK-NEXT:   pcmpgtd %xmm3, %xmm4
> > +; CHECK-NEXT:   pshufd  $160, %xmm4, %xmm5      # xmm5 =
> xmm4[0,0,2,2]
> > +; CHECK-NEXT:   pcmpeqd %xmm2, %xmm3
> > +; CHECK-NEXT:   pshufd  $245, %xmm3, %xmm2      # xmm2 =
> xmm3[1,1,3,3]
> > +; CHECK-NEXT:   pand    %xmm5, %xmm2
> > +; CHECK-NEXT:   pshufd  $245, %xmm4, %xmm3      # xmm3 =
> xmm4[1,1,3,3]
> > +; CHECK-NEXT:   por     %xmm2, %xmm3
> > +; CHECK-NEXT:   pand    %xmm3, %xmm0
> > +; CHECK-NEXT:   pandn   %xmm1, %xmm3
> > +; CHECK-NEXT:   por     %xmm3, %xmm0
> > +; CHECK-NEXT:   retq
> > +  %1 = call <2 x i64> @llvm.sabsdiff.v2i64(<2 x i64> %a1, <2 x i64>
> > +%a2)
> > +  ret <2 x i64> %1
> > +}
> > +
> > +declare <16 x i32> @llvm.sabsdiff.v16i32(<16 x i32>, <16 x i32>)
> > +
> > +define <16 x i32> @test_sabsdiff_v16i32_expand(<16 x i32> %a1, <16 x
> > +i32> %a2) { ; CHECK-LABEL: test_sabsdiff_v16i32_expand
> > +; CHECK:             psubd  %xmm4, %xmm0
> > +; CHECK-NEXT:        pxor    %xmm8, %xmm8
> > +; CHECK-NEXT:        pxor    %xmm9, %xmm9
> > +; CHECK-NEXT:        psubd   %xmm0, %xmm9
> > +; CHECK-NEXT:        pxor    %xmm4, %xmm4
> > +; CHECK-NEXT:        pcmpgtd %xmm9, %xmm4
> > +; CHECK-NEXT:        pand    %xmm4, %xmm0
> > +; CHECK-NEXT:        pandn   %xmm9, %xmm4
> > +; CHECK-NEXT:        por     %xmm4, %xmm0
> > +; CHECK-NEXT:        psubd   %xmm5, %xmm1
> > +; CHECK-NEXT:        pxor    %xmm4, %xmm4
> > +; CHECK-NEXT:        psubd   %xmm1, %xmm4
> > +; CHECK-NEXT:        pxor    %xmm5, %xmm5
> > +; CHECK-NEXT:        pcmpgtd %xmm4, %xmm5
> > +; CHECK-NEXT:        pand    %xmm5, %xmm1
> > +; CHECK-NEXT:        pandn   %xmm4, %xmm5
> > +; CHECK-NEXT:        por     %xmm5, %xmm1
> > +; CHECK-NEXT:        psubd   %xmm6, %xmm2
> > +; CHECK-NEXT:        pxor    %xmm4, %xmm4
> > +; CHECK-NEXT:        psubd   %xmm2, %xmm4
> > +; CHECK-NEXT:        pxor    %xmm5, %xmm5
> > +; CHECK-NEXT:        pcmpgtd %xmm4, %xmm5
> > +; CHECK-NEXT:        pand    %xmm5, %xmm2
> > +; CHECK-NEXT:        pandn   %xmm4, %xmm5
> > +; CHECK-NEXT:        por     %xmm5, %xmm2
> > +; CHECK-NEXT:        psubd   %xmm7, %xmm3
> > +; CHECK-NEXT:        pxor    %xmm4, %xmm4
> > +; CHECK-NEXT:        psubd   %xmm3, %xmm4
> > +; CHECK-NEXT:        pcmpgtd %xmm4, %xmm8
> > +; CHECK-NEXT:        pand    %xmm8, %xmm3
> > +; CHECK-NEXT:        pandn   %xmm4, %xmm8
> > +; CHECK-NEXT:        por     %xmm8, %xmm3
> > +; CHECK-NEXT:        retq
> The tests look very fragile, should we make them more relaxed in terms of
> register names?
Do you mean using regular expressions for the register names?

> > +  %1 = call <16 x i32> @llvm.sabsdiff.v16i32(<16 x i32> %a1, <16 x
> > +i32> %a2)
> > +  ret <16 x i32> %1
> > +}
> > +
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits