[llvm] r191286 - [mips][msa] Added support for matching comparisons from normal IR (i.e. not intrinsics)

Aaron Ballman aaron at aaronballman.com
Mon Oct 7 09:46:28 PDT 2013


These tests are failing on the build bot, not on my machine, so the
relevant configuration is the bot's, which uses the default cmake options.
As for the cl version, both MSVC 10 and MSVC 11 are failing, but perhaps
Takumi can give further information.

In my own tests (Win7 x64, building a 32-bit clang with MSVC 11, without
ninja), I am not seeing the failure.

~Aaron

On Mon, Oct 7, 2013 at 12:43 PM, Daniel Sanders
<Daniel.Sanders at imgtec.com> wrote:
> I haven't run the full 'ninja check' yet, but those three tests pass in my MSVC build (using cmake and ninja) of r191286.
>
> Which compiler versions are you using? I'm using cl.exe version 16.00.30319.01. Also, have you set any non-default options in CMakeCache.txt?
>
>> -----Original Message-----
>> From: Daniel Sanders
>> Sent: 07 October 2013 14:27
>> To: 'Aaron Ballman'
>> Cc: llvm-commits
>> Subject: RE: [llvm] r191286 - [mips][msa] Added support for matching
>> comparisons from normal IR (i.e. not intrinsics)
>>
>> I rely on the buildbot at lab.llvm.org to inform me that my changes have
>> broken builds. This failure has come from outside that buildbot so I
>> wouldn't have known something had broken. Given that the lab.llvm.org
>> builds were all ok, I don't think reverting this change (and all my subsequent
>> changes) is a reasonable course of action.
>>
>> I'm setting up a MSVC host as we speak. Hopefully I should have some idea
>> what's going on soon. I notice that Matt Arsenault has submitted a patch
>> that might be related to the solution. His patch works around an MSVC
>> miscompilation that is likely to affect
>> CodeGen__Mips__msa__compare_float.ll.
>>
>> > -----Original Message-----
>> > From: aaron.ballman at gmail.com [mailto:aaron.ballman at gmail.com] On
>> > Behalf Of Aaron Ballman
>> > Sent: 07 October 2013 13:52
>> > To: Daniel Sanders
>> > Cc: llvm-commits
>> > Subject: Re: [llvm] r191286 - [mips][msa] Added support for matching
>> > comparisons from normal IR (i.e. not intrinsics)
>> >
>> > On Mon, Oct 7, 2013 at 5:11 AM, Daniel Sanders
>> > <Daniel.Sanders at imgtec.com> wrote:
>> > > Hi,
>> > >
>> > > Thanks for the report. I'll look into this but it might take a while to fix it
>> > > since I'll have to set up a suitable development environment to reproduce
>> > > the problem.
>> >
>> > That's a bit worrisome since the build has been broken for two weeks
>> > now.  Can you please revert your changes (and r191290) until you are
>> > able to reproduce and resolve the problem?
>> >
>> > ~Aaron
>> >
>> > >
>> > >> -----Original Message-----
>> > >> From: aaron.ballman at gmail.com [mailto:aaron.ballman at gmail.com] On
>> > >> Behalf Of Aaron Ballman
>> > >> Sent: 06 October 2013 02:02
>> > >> To: Daniel Sanders
>> > >> Cc: llvm-commits
>> > >> Subject: Re: [llvm] r191286 - [mips][msa] Added support for matching
>> > >> comparisons from normal IR (i.e. not intrinsics)
>> > >>
>> > >> The MSVC builders have been broken since this commit. Specifically:
>> > >>
>> > >> LLVM :: CodeGen__Mips__msa__3rf_int_float.ll
>> > >> LLVM :: CodeGen__Mips__msa__compare_float.ll
>> > >>
>> > >> http://bb.pgr.jp/builders/ninja-clang-i686-msc17-R/builds/5000
>> > >>
>> > >> r191290 caused LLVM :: CodeGen__Mips__msa__compare.ll to fail as well.
>> > >>
>> > >> This is failing on MSVC 10 and MSVC 11.
>> > >>
>> > >> ~Aaron
>> > >>
>> > >> On Tue, Sep 24, 2013 at 6:46 AM, Daniel Sanders
>> > >> <daniel.sanders at imgtec.com> wrote:
>> > >> > Author: dsanders
>> > >> > Date: Tue Sep 24 05:46:19 2013
>> > >> > New Revision: 191286
>> > >> >
>> > >> > URL: http://llvm.org/viewvc/llvm-project?rev=191286&view=rev
>> > >> > Log:
>> > >> > [mips][msa] Added support for matching comparisons from normal IR (i.e. not intrinsics)
>> > >> >
>> > >> > MIPS SelectionDAG changes:
>> > >> > * Added VCEQ, VCL[ET]_[SU] nodes to represent vector comparisons that produce a bitmask.
>> > >> >
>> > >> > Added:
>> > >> >     llvm/trunk/test/CodeGen/Mips/msa/compare.ll
>> > >> >     llvm/trunk/test/CodeGen/Mips/msa/compare_float.ll
>> > >> > Modified:
>> > >> >     llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp
>> > >> >     llvm/trunk/lib/Target/Mips/MipsISelLowering.h
>> > >> >     llvm/trunk/lib/Target/Mips/MipsMSAInstrInfo.td
>> > >> >     llvm/trunk/lib/Target/Mips/MipsSEISelLowering.cpp
>> > >> >
>> > >> > Modified: llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp?rev=191286&r1=191285&r2=191286&view=diff
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp (original)
>> > >> > +++ llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp Tue Sep 24 05:46:19 2013
>> > >> > @@ -212,6 +212,11 @@ const char *MipsTargetLowering::getTarge
>> > >> >    case MipsISD::VANY_ZERO:         return "MipsISD::VANY_ZERO";
>> > >> >    case MipsISD::VALL_NONZERO:      return "MipsISD::VALL_NONZERO";
>> > >> >    case MipsISD::VANY_NONZERO:      return "MipsISD::VANY_NONZERO";
>> > >> > +  case MipsISD::VCEQ:              return "MipsISD::VCEQ";
>> > >> > +  case MipsISD::VCLE_S:            return "MipsISD::VCLE_S";
>> > >> > +  case MipsISD::VCLE_U:            return "MipsISD::VCLE_U";
>> > >> > +  case MipsISD::VCLT_S:            return "MipsISD::VCLT_S";
>> > >> > +  case MipsISD::VCLT_U:            return "MipsISD::VCLT_U";
>> > >> >    case MipsISD::VSPLAT:            return "MipsISD::VSPLAT";
>> > >> >    case MipsISD::VSPLATD:           return "MipsISD::VSPLATD";
>> > >> >    case MipsISD::VEXTRACT_SEXT_ELT: return "MipsISD::VEXTRACT_SEXT_ELT";
>> > >> >
>> > >> > Modified: llvm/trunk/lib/Target/Mips/MipsISelLowering.h
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelLowering.h?rev=191286&r1=191285&r2=191286&view=diff
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/lib/Target/Mips/MipsISelLowering.h (original)
>> > >> > +++ llvm/trunk/lib/Target/Mips/MipsISelLowering.h Tue Sep 24 05:46:19 2013
>> > >> > @@ -153,11 +153,19 @@ namespace llvm {
>> > >> >        SELECT_CC_DSP,
>> > >> >
>> > >> >        // Vector comparisons.
>> > >> > +      // These take a vector and return a boolean.
>> > >> >        VALL_ZERO,
>> > >> >        VANY_ZERO,
>> > >> >        VALL_NONZERO,
>> > >> >        VANY_NONZERO,
>> > >> >
>> > >> > +      // These take a vector and return a vector bitmask.
>> > >> > +      VCEQ,
>> > >> > +      VCLE_S,
>> > >> > +      VCLE_U,
>> > >> > +      VCLT_S,
>> > >> > +      VCLT_U,
>> > >> > +
>> > >> >        // Special case of BUILD_VECTOR where all elements are the same.
>> > >> >        VSPLAT,
>> > >> >        // Special case of VSPLAT where the result is v2i64, the operand is
>> > >> >
>> > >> > Modified: llvm/trunk/lib/Target/Mips/MipsMSAInstrInfo.td
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsMSAInstrInfo.td?rev=191286&r1=191285&r2=191286&view=diff
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/lib/Target/Mips/MipsMSAInstrInfo.td (original)
>> > >> > +++ llvm/trunk/lib/Target/Mips/MipsMSAInstrInfo.td Tue Sep 24 05:46:19 2013
>> > >> > @@ -13,6 +13,14 @@
>> > >> >
>> > >> >  def SDT_MipsSplat : SDTypeProfile<1, 1, [SDTCisVec<0>, SDTCisInt<1>]>;
>> > >> >  def SDT_MipsVecCond : SDTypeProfile<1, 1, [SDTCisInt<0>, SDTCisVec<1>]>;
>> > >> > +def SDT_VSetCC : SDTypeProfile<1, 3, [SDTCisInt<0>,
>> > >> > +                                      SDTCisInt<1>,
>> > >> > +                                      SDTCisSameAs<1, 2>,
>> > >> > +                                      SDTCisVT<3, OtherVT>]>;
>> > >> > +def SDT_VFSetCC : SDTypeProfile<1, 3, [SDTCisInt<0>,
>> > >> > +                                       SDTCisFP<1>,
>> > >> > +                                       SDTCisSameAs<1, 2>,
>> > >> > +                                       SDTCisVT<3, OtherVT>]>;
>> > >> >
>> > >> >  def MipsVAllNonZero : SDNode<"MipsISD::VALL_NONZERO", SDT_MipsVecCond>;
>> > >> >  def MipsVAnyNonZero : SDNode<"MipsISD::VANY_NONZERO", SDT_MipsVecCond>;
>> > >> > @@ -23,6 +31,9 @@ def MipsVSplatD : SDNode<"MipsISD::VSPLA
>> > >> >  def MipsVNOR : SDNode<"MipsISD::VNOR", SDTIntBinOp,
>> > >> >                        [SDNPCommutative, SDNPAssociative]>;
>> > >> >
>> > >> > +def vsetcc : SDNode<"ISD::SETCC", SDT_VSetCC>;
>> > >> > +def vfsetcc : SDNode<"ISD::SETCC", SDT_VFSetCC>;
>> > >> > +
>> > >> >  def MipsVExtractSExt : SDNode<"MipsISD::VEXTRACT_SEXT_ELT",
>> > >> >      SDTypeProfile<1, 3, [SDTCisPtrTy<2>]>, []>;
>> > >> >  def MipsVExtractZExt : SDNode<"MipsISD::VEXTRACT_ZEXT_ELT",
>> > >> > @@ -50,6 +61,68 @@ def vinsert_v8i16 : PatFrag<(ops node:$v
>> > >> >  def vinsert_v4i32 : PatFrag<(ops node:$vec, node:$val, node:$idx),
>> > >> >      (v4i32 (vector_insert node:$vec, node:$val, node:$idx))>;
>> > >> >
>> > >> > +class vfsetcc_type<ValueType ResTy, ValueType OpTy, CondCode CC> :
>> > >> > +  PatFrag<(ops node:$lhs, node:$rhs),
>> > >> > +          (ResTy (vfsetcc (OpTy node:$lhs), (OpTy node:$rhs), CC))>;
>> > >> > +
>> > >> > +// ISD::SETFALSE cannot occur
>> > >> > +def vfsetoeq_v4f32 : vfsetcc_type<v4i32, v4f32, SETOEQ>;
>> > >> > +def vfsetoeq_v2f64 : vfsetcc_type<v2i64, v2f64, SETOEQ>;
>> > >> > +def vfsetoge_v4f32 : vfsetcc_type<v4i32, v4f32, SETOGE>;
>> > >> > +def vfsetoge_v2f64 : vfsetcc_type<v2i64, v2f64, SETOGE>;
>> > >> > +def vfsetogt_v4f32 : vfsetcc_type<v4i32, v4f32, SETOGT>;
>> > >> > +def vfsetogt_v2f64 : vfsetcc_type<v2i64, v2f64, SETOGT>;
>> > >> > +def vfsetole_v4f32 : vfsetcc_type<v4i32, v4f32, SETOLE>;
>> > >> > +def vfsetole_v2f64 : vfsetcc_type<v2i64, v2f64, SETOLE>;
>> > >> > +def vfsetolt_v4f32 : vfsetcc_type<v4i32, v4f32, SETOLT>;
>> > >> > +def vfsetolt_v2f64 : vfsetcc_type<v2i64, v2f64, SETOLT>;
>> > >> > +def vfsetone_v4f32 : vfsetcc_type<v4i32, v4f32, SETONE>;
>> > >> > +def vfsetone_v2f64 : vfsetcc_type<v2i64, v2f64, SETONE>;
>> > >> > +def vfsetord_v4f32 : vfsetcc_type<v4i32, v4f32, SETO>;
>> > >> > +def vfsetord_v2f64 : vfsetcc_type<v2i64, v2f64, SETO>;
>> > >> > +def vfsetun_v4f32  : vfsetcc_type<v4i32, v4f32, SETUO>;
>> > >> > +def vfsetun_v2f64  : vfsetcc_type<v2i64, v2f64, SETUO>;
>> > >> > +def vfsetueq_v4f32 : vfsetcc_type<v4i32, v4f32, SETUEQ>;
>> > >> > +def vfsetueq_v2f64 : vfsetcc_type<v2i64, v2f64, SETUEQ>;
>> > >> > +def vfsetuge_v4f32 : vfsetcc_type<v4i32, v4f32, SETUGE>;
>> > >> > +def vfsetuge_v2f64 : vfsetcc_type<v2i64, v2f64, SETUGE>;
>> > >> > +def vfsetugt_v4f32 : vfsetcc_type<v4i32, v4f32, SETUGT>;
>> > >> > +def vfsetugt_v2f64 : vfsetcc_type<v2i64, v2f64, SETUGT>;
>> > >> > +def vfsetule_v4f32 : vfsetcc_type<v4i32, v4f32, SETULE>;
>> > >> > +def vfsetule_v2f64 : vfsetcc_type<v2i64, v2f64, SETULE>;
>> > >> > +def vfsetult_v4f32 : vfsetcc_type<v4i32, v4f32, SETULT>;
>> > >> > +def vfsetult_v2f64 : vfsetcc_type<v2i64, v2f64, SETULT>;
>> > >> > +def vfsetune_v4f32 : vfsetcc_type<v4i32, v4f32, SETUNE>;
>> > >> > +def vfsetune_v2f64 : vfsetcc_type<v2i64, v2f64, SETUNE>;
>> > >> > +// ISD::SETTRUE cannot occur
>> > >> > +// ISD::SETFALSE2 cannot occur
>> > >> > +// ISD::SETTRUE2 cannot occur
>> > >> > +
>> > >> > +class vsetcc_type<ValueType ResTy, CondCode CC> :
>> > >> > +  PatFrag<(ops node:$lhs, node:$rhs),
>> > >> > +          (ResTy (vsetcc node:$lhs, node:$rhs, CC))>;
>> > >> > +
>> > >> > +def vseteq_v16i8  : vsetcc_type<v16i8, SETEQ>;
>> > >> > +def vseteq_v8i16  : vsetcc_type<v8i16, SETEQ>;
>> > >> > +def vseteq_v4i32  : vsetcc_type<v4i32, SETEQ>;
>> > >> > +def vseteq_v2i64  : vsetcc_type<v2i64, SETEQ>;
>> > >> > +def vsetle_v16i8  : vsetcc_type<v16i8, SETLE>;
>> > >> > +def vsetle_v8i16  : vsetcc_type<v8i16, SETLE>;
>> > >> > +def vsetle_v4i32  : vsetcc_type<v4i32, SETLE>;
>> > >> > +def vsetle_v2i64  : vsetcc_type<v2i64, SETLE>;
>> > >> > +def vsetlt_v16i8  : vsetcc_type<v16i8, SETLT>;
>> > >> > +def vsetlt_v8i16  : vsetcc_type<v8i16, SETLT>;
>> > >> > +def vsetlt_v4i32  : vsetcc_type<v4i32, SETLT>;
>> > >> > +def vsetlt_v2i64  : vsetcc_type<v2i64, SETLT>;
>> > >> > +def vsetule_v16i8 : vsetcc_type<v16i8, SETULE>;
>> > >> > +def vsetule_v8i16 : vsetcc_type<v8i16, SETULE>;
>> > >> > +def vsetule_v4i32 : vsetcc_type<v4i32, SETULE>;
>> > >> > +def vsetule_v2i64 : vsetcc_type<v2i64, SETULE>;
>> > >> > +def vsetult_v16i8 : vsetcc_type<v16i8, SETULT>;
>> > >> > +def vsetult_v8i16 : vsetcc_type<v8i16, SETULT>;
>> > >> > +def vsetult_v4i32 : vsetcc_type<v4i32, SETULT>;
>> > >> > +def vsetult_v2i64 : vsetcc_type<v2i64, SETULT>;
>> > >> > +
>> > >> >  def vsplati8  : PatFrag<(ops node:$in), (v16i8 (MipsVSplat (i32 node:$in)))>;
>> > >> >  def vsplati16 : PatFrag<(ops node:$in), (v8i16 (MipsVSplat (i32 node:$in)))>;
>> > >> >  def vsplati32 : PatFrag<(ops node:$in), (v4i32 (MipsVSplat (i32 node:$in)))>;
>> > >> > @@ -909,12 +982,14 @@ class MSA_I5_X_DESC_BASE<string instr_as
>> > >> >  }
>> > >> >
>> > >> >  class MSA_SI5_DESC_BASE<string instr_asm, SDPatternOperator OpNode,
>> > >> > -                       RegisterClass RCWD, RegisterClass RCWS = RCWD,
>> > >> > +                       SDPatternOperator SplatNode, RegisterClass RCWD,
>> > >> > +                       RegisterClass RCWS = RCWD,
>> > >> >                         InstrItinClass itin = NoItinerary> {
>> > >> >    dag OutOperandList = (outs RCWD:$wd);
>> > >> >    dag InOperandList = (ins RCWS:$ws, simm5:$s5);
>> > >> >    string AsmString = !strconcat(instr_asm, "\t$wd, $ws, $s5");
>> > >> > -  list<dag> Pattern = [(set RCWD:$wd, (OpNode RCWS:$ws, immSExt5:$s5))];
>> > >> > +  list<dag> Pattern = [(set RCWD:$wd, (OpNode RCWS:$ws,
>> > >> > +                                              (SplatNode immSExt5:$s5)))];
>> > >> >    InstrItinClass Itinerary = itin;
>> > >> >  }
>> > >> >
>> > >> > @@ -1228,19 +1303,23 @@ class BZ_D_DESC : MSA_CBRANCH_DESC_BASE<
>> > >> >
>> > >> >  class BZ_V_DESC : MSA_CBRANCH_DESC_BASE<"bz.v", MSA128B>;
>> > >> >
>> > >> > -class CEQ_B_DESC : MSA_3R_DESC_BASE<"ceq.b", int_mips_ceq_b, MSA128B>,
>> > >> > +class CEQ_B_DESC : MSA_3R_DESC_BASE<"ceq.b", vseteq_v16i8, MSA128B>,
>> > >> >                     IsCommutable;
>> > >> > -class CEQ_H_DESC : MSA_3R_DESC_BASE<"ceq.h", int_mips_ceq_h, MSA128H>,
>> > >> > +class CEQ_H_DESC : MSA_3R_DESC_BASE<"ceq.h", vseteq_v8i16, MSA128H>,
>> > >> >                     IsCommutable;
>> > >> > -class CEQ_W_DESC : MSA_3R_DESC_BASE<"ceq.w", int_mips_ceq_w, MSA128W>,
>> > >> > +class CEQ_W_DESC : MSA_3R_DESC_BASE<"ceq.w", vseteq_v4i32, MSA128W>,
>> > >> >                     IsCommutable;
>> > >> > -class CEQ_D_DESC : MSA_3R_DESC_BASE<"ceq.d", int_mips_ceq_d, MSA128D>,
>> > >> > +class CEQ_D_DESC : MSA_3R_DESC_BASE<"ceq.d", vseteq_v2i64, MSA128D>,
>> > >> >                     IsCommutable;
>> > >> >
>> > >> > -class CEQI_B_DESC : MSA_SI5_DESC_BASE<"ceqi.b", int_mips_ceqi_b, MSA128B>;
>> > >> > -class CEQI_H_DESC : MSA_SI5_DESC_BASE<"ceqi.h", int_mips_ceqi_h, MSA128H>;
>> > >> > -class CEQI_W_DESC : MSA_SI5_DESC_BASE<"ceqi.w", int_mips_ceqi_w, MSA128W>;
>> > >> > -class CEQI_D_DESC : MSA_SI5_DESC_BASE<"ceqi.d", int_mips_ceqi_d, MSA128D>;
>> > >> > +class CEQI_B_DESC : MSA_SI5_DESC_BASE<"ceqi.b", vseteq_v16i8, vsplati8,
>> > >> > +                                      MSA128B>;
>> > >> > +class CEQI_H_DESC : MSA_SI5_DESC_BASE<"ceqi.h", vseteq_v8i16, vsplati16,
>> > >> > +                                      MSA128H>;
>> > >> > +class CEQI_W_DESC : MSA_SI5_DESC_BASE<"ceqi.w", vseteq_v4i32, vsplati32,
>> > >> > +                                      MSA128W>;
>> > >> > +class CEQI_D_DESC : MSA_SI5_DESC_BASE<"ceqi.d", vseteq_v2i64, vsplati64,
>> > >> > +                                      MSA128D>;
>> > >> >
>> > >> >  class CFCMSA_DESC {
>> > >> >    dag OutOperandList = (outs GPR32:$rd);
>> > >> > @@ -1250,61 +1329,61 @@ class CFCMSA_DESC {
>> > >> >    bit hasSideEffects = 1;
>> > >> >  }
>> > >> >
>> > >> > -class CLE_S_B_DESC : MSA_3R_DESC_BASE<"cle_s.b", int_mips_cle_s_b, MSA128B>;
>> > >> > -class CLE_S_H_DESC : MSA_3R_DESC_BASE<"cle_s.h", int_mips_cle_s_h, MSA128H>;
>> > >> > -class CLE_S_W_DESC : MSA_3R_DESC_BASE<"cle_s.w", int_mips_cle_s_w, MSA128W>;
>> > >> > -class CLE_S_D_DESC : MSA_3R_DESC_BASE<"cle_s.d", int_mips_cle_s_d, MSA128D>;
>> > >> > -
>> > >> > -class CLE_U_B_DESC : MSA_3R_DESC_BASE<"cle_u.b", int_mips_cle_u_b, MSA128B>;
>> > >> > -class CLE_U_H_DESC : MSA_3R_DESC_BASE<"cle_u.h", int_mips_cle_u_h, MSA128H>;
>> > >> > -class CLE_U_W_DESC : MSA_3R_DESC_BASE<"cle_u.w", int_mips_cle_u_w, MSA128W>;
>> > >> > -class CLE_U_D_DESC : MSA_3R_DESC_BASE<"cle_u.d", int_mips_cle_u_d, MSA128D>;
>> > >> > +class CLE_S_B_DESC : MSA_3R_DESC_BASE<"cle_s.b", vsetle_v16i8, MSA128B>;
>> > >> > +class CLE_S_H_DESC : MSA_3R_DESC_BASE<"cle_s.h", vsetle_v8i16, MSA128H>;
>> > >> > +class CLE_S_W_DESC : MSA_3R_DESC_BASE<"cle_s.w", vsetle_v4i32, MSA128W>;
>> > >> > +class CLE_S_D_DESC : MSA_3R_DESC_BASE<"cle_s.d", vsetle_v2i64, MSA128D>;
>> > >> > +
>> > >> > +class CLE_U_B_DESC : MSA_3R_DESC_BASE<"cle_u.b", vsetule_v16i8, MSA128B>;
>> > >> > +class CLE_U_H_DESC : MSA_3R_DESC_BASE<"cle_u.h", vsetule_v8i16, MSA128H>;
>> > >> > +class CLE_U_W_DESC : MSA_3R_DESC_BASE<"cle_u.w", vsetule_v4i32, MSA128W>;
>> > >> > +class CLE_U_D_DESC : MSA_3R_DESC_BASE<"cle_u.d", vsetule_v2i64, MSA128D>;
>> > >> >
>> > >> > -class CLEI_S_B_DESC : MSA_SI5_DESC_BASE<"clei_s.b", int_mips_clei_s_b,
>> > >> > +class CLEI_S_B_DESC : MSA_SI5_DESC_BASE<"clei_s.b", vsetle_v16i8, vsplati8,
>> > >> >                                          MSA128B>;
>> > >> > -class CLEI_S_H_DESC : MSA_SI5_DESC_BASE<"clei_s.h", int_mips_clei_s_h,
>> > >> > +class CLEI_S_H_DESC : MSA_SI5_DESC_BASE<"clei_s.h", vsetle_v8i16, vsplati16,
>> > >> >                                          MSA128H>;
>> > >> > -class CLEI_S_W_DESC : MSA_SI5_DESC_BASE<"clei_s.w", int_mips_clei_s_w,
>> > >> > +class CLEI_S_W_DESC : MSA_SI5_DESC_BASE<"clei_s.w", vsetle_v4i32, vsplati32,
>> > >> >                                          MSA128W>;
>> > >> > -class CLEI_S_D_DESC : MSA_SI5_DESC_BASE<"clei_s.d", int_mips_clei_s_d,
>> > >> > +class CLEI_S_D_DESC : MSA_SI5_DESC_BASE<"clei_s.d", vsetle_v2i64, vsplati64,
>> > >> >                                          MSA128D>;
>> > >> >
>> > >> > -class CLEI_U_B_DESC : MSA_SI5_DESC_BASE<"clei_u.b", int_mips_clei_u_b,
>> > >> > -                                        MSA128B>;
>> > >> > -class CLEI_U_H_DESC : MSA_SI5_DESC_BASE<"clei_u.h", int_mips_clei_u_h,
>> > >> > -                                        MSA128H>;
>> > >> > -class CLEI_U_W_DESC : MSA_SI5_DESC_BASE<"clei_u.w", int_mips_clei_u_w,
>> > >> > -                                        MSA128W>;
>> > >> > -class CLEI_U_D_DESC : MSA_SI5_DESC_BASE<"clei_u.d", int_mips_clei_u_d,
>> > >> > -                                        MSA128D>;
>> > >> > +class CLEI_U_B_DESC : MSA_I5_DESC_BASE<"clei_u.b", vsetule_v16i8, vsplati8,
>> > >> > +                                       MSA128B>;
>> > >> > +class CLEI_U_H_DESC : MSA_I5_DESC_BASE<"clei_u.h", vsetule_v8i16, vsplati16,
>> > >> > +                                       MSA128H>;
>> > >> > +class CLEI_U_W_DESC : MSA_I5_DESC_BASE<"clei_u.w", vsetule_v4i32, vsplati32,
>> > >> > +                                       MSA128W>;
>> > >> > +class CLEI_U_D_DESC : MSA_I5_DESC_BASE<"clei_u.d", vsetule_v2i64, vsplati64,
>> > >> > +                                       MSA128D>;
>> > >> > +
>> > >> > +class CLT_S_B_DESC : MSA_3R_DESC_BASE<"clt_s.b", vsetlt_v16i8, MSA128B>;
>> > >> > +class CLT_S_H_DESC : MSA_3R_DESC_BASE<"clt_s.h", vsetlt_v8i16, MSA128H>;
>> > >> > +class CLT_S_W_DESC : MSA_3R_DESC_BASE<"clt_s.w", vsetlt_v4i32, MSA128W>;
>> > >> > +class CLT_S_D_DESC : MSA_3R_DESC_BASE<"clt_s.d", vsetlt_v2i64, MSA128D>;
>> > >> > +
>> > >> > +class CLT_U_B_DESC : MSA_3R_DESC_BASE<"clt_u.b", vsetult_v16i8, MSA128B>;
>> > >> > +class CLT_U_H_DESC : MSA_3R_DESC_BASE<"clt_u.h", vsetult_v8i16, MSA128H>;
>> > >> > +class CLT_U_W_DESC : MSA_3R_DESC_BASE<"clt_u.w", vsetult_v4i32, MSA128W>;
>> > >> > +class CLT_U_D_DESC : MSA_3R_DESC_BASE<"clt_u.d", vsetult_v2i64, MSA128D>;
>> > >> >
>> > >> > -class CLT_S_B_DESC : MSA_3R_DESC_BASE<"clt_s.b", int_mips_clt_s_b, MSA128B>;
>> > >> > -class CLT_S_H_DESC : MSA_3R_DESC_BASE<"clt_s.h", int_mips_clt_s_h, MSA128H>;
>> > >> > -class CLT_S_W_DESC : MSA_3R_DESC_BASE<"clt_s.w", int_mips_clt_s_w, MSA128W>;
>> > >> > -class CLT_S_D_DESC : MSA_3R_DESC_BASE<"clt_s.d", int_mips_clt_s_d, MSA128D>;
>> > >> > -
>> > >> > -class CLT_U_B_DESC : MSA_3R_DESC_BASE<"clt_u.b", int_mips_clt_u_b, MSA128B>;
>> > >> > -class CLT_U_H_DESC : MSA_3R_DESC_BASE<"clt_u.h", int_mips_clt_u_h, MSA128H>;
>> > >> > -class CLT_U_W_DESC : MSA_3R_DESC_BASE<"clt_u.w", int_mips_clt_u_w, MSA128W>;
>> > >> > -class CLT_U_D_DESC : MSA_3R_DESC_BASE<"clt_u.d", int_mips_clt_u_d, MSA128D>;
>> > >> > -
>> > >> > -class CLTI_S_B_DESC : MSA_SI5_DESC_BASE<"clti_s.b", int_mips_clti_s_b,
>> > >> > +class CLTI_S_B_DESC : MSA_SI5_DESC_BASE<"clti_s.b", vsetlt_v16i8, vsplati8,
>> > >> >                                          MSA128B>;
>> > >> > -class CLTI_S_H_DESC : MSA_SI5_DESC_BASE<"clti_s.h", int_mips_clti_s_h,
>> > >> > +class CLTI_S_H_DESC : MSA_SI5_DESC_BASE<"clti_s.h", vsetlt_v8i16, vsplati16,
>> > >> >                                          MSA128H>;
>> > >> > -class CLTI_S_W_DESC : MSA_SI5_DESC_BASE<"clti_s.w", int_mips_clti_s_w,
>> > >> > +class CLTI_S_W_DESC : MSA_SI5_DESC_BASE<"clti_s.w", vsetlt_v4i32, vsplati32,
>> > >> >                                          MSA128W>;
>> > >> > -class CLTI_S_D_DESC : MSA_SI5_DESC_BASE<"clti_s.d", int_mips_clti_s_d,
>> > >> > +class CLTI_S_D_DESC : MSA_SI5_DESC_BASE<"clti_s.d", vsetlt_v2i64, vsplati64,
>> > >> >                                          MSA128D>;
>> > >> >
>> > >> > -class CLTI_U_B_DESC : MSA_SI5_DESC_BASE<"clti_u.b", int_mips_clti_u_b,
>> > >> > -                                        MSA128B>;
>> > >> > -class CLTI_U_H_DESC : MSA_SI5_DESC_BASE<"clti_u.h", int_mips_clti_u_h,
>> > >> > -                                        MSA128H>;
>> > >> > -class CLTI_U_W_DESC : MSA_SI5_DESC_BASE<"clti_u.w", int_mips_clti_u_w,
>> > >> > -                                        MSA128W>;
>> > >> > -class CLTI_U_D_DESC : MSA_SI5_DESC_BASE<"clti_u.d", int_mips_clti_u_d,
>> > >> > -                                        MSA128D>;
>> > >> > +class CLTI_U_B_DESC : MSA_I5_DESC_BASE<"clti_u.b", vsetult_v16i8, vsplati8,
>> > >> > +                                       MSA128B>;
>> > >> > +class CLTI_U_H_DESC : MSA_I5_DESC_BASE<"clti_u.h", vsetult_v8i16, vsplati16,
>> > >> > +                                       MSA128H>;
>> > >> > +class CLTI_U_W_DESC : MSA_I5_DESC_BASE<"clti_u.w", vsetult_v4i32, vsplati32,
>> > >> > +                                       MSA128W>;
>> > >> > +class CLTI_U_D_DESC : MSA_I5_DESC_BASE<"clti_u.d", vsetult_v2i64, vsplati64,
>> > >> > +                                       MSA128D>;
>> > >> >
>> > >> >  class COPY_S_B_DESC : MSA_COPY_DESC_BASE<"copy_s.b", vextract_sext_i8,  v16i8,
>> > >> >                                           GPR32, MSA128B>;
>> > >> > @@ -1394,9 +1473,9 @@ class FCAF_W_DESC : MSA_3RF_DESC_BASE<"f
>> > >> >  class FCAF_D_DESC : MSA_3RF_DESC_BASE<"fcaf.d", int_mips_fcaf_d, MSA128D>,
>> > >> >                      IsCommutable;
>> > >> >
>> > >> > -class FCEQ_W_DESC : MSA_3RF_DESC_BASE<"fceq.w", int_mips_fceq_w, MSA128W>,
>> > >> > +class FCEQ_W_DESC : MSA_3RF_DESC_BASE<"fceq.w", vfsetoeq_v4f32, MSA128W>,
>> > >> >                      IsCommutable;
>> > >> > -class FCEQ_D_DESC : MSA_3RF_DESC_BASE<"fceq.d", int_mips_fceq_d, MSA128D>,
>> > >> > +class FCEQ_D_DESC : MSA_3RF_DESC_BASE<"fceq.d", vfsetoeq_v2f64, MSA128D>,
>> > >> >                      IsCommutable;
>> > >> >
>> > >> >  class FCLASS_W_DESC : MSA_2RF_DESC_BASE<"fclass.w", int_mips_fclass_w,
>> > >> > @@ -1404,45 +1483,45 @@ class FCLASS_W_DESC : MSA_2RF_DESC_BASE<
>> > >> >  class FCLASS_D_DESC : MSA_2RF_DESC_BASE<"fclass.d", int_mips_fclass_d,
>> > >> >                                          MSA128D>;
>> > >> >
>> > >> > -class FCLE_W_DESC : MSA_3RF_DESC_BASE<"fcle.w", int_mips_fcle_w, MSA128W>;
>> > >> > -class FCLE_D_DESC : MSA_3RF_DESC_BASE<"fcle.d", int_mips_fcle_d, MSA128D>;
>> > >> > +class FCLE_W_DESC : MSA_3RF_DESC_BASE<"fcle.w", vfsetole_v4f32, MSA128W>;
>> > >> > +class FCLE_D_DESC : MSA_3RF_DESC_BASE<"fcle.d", vfsetole_v2f64, MSA128D>;
>> > >> >
>> > >> > -class FCLT_W_DESC : MSA_3RF_DESC_BASE<"fclt.w", int_mips_fclt_w, MSA128W>;
>> > >> > -class FCLT_D_DESC : MSA_3RF_DESC_BASE<"fclt.d", int_mips_fclt_d, MSA128D>;
>> > >> > +class FCLT_W_DESC : MSA_3RF_DESC_BASE<"fclt.w", vfsetolt_v4f32, MSA128W>;
>> > >> > +class FCLT_D_DESC : MSA_3RF_DESC_BASE<"fclt.d", vfsetolt_v2f64, MSA128D>;
>> > >> >
>> > >> > -class FCNE_W_DESC : MSA_3RF_DESC_BASE<"fcne.w", int_mips_fcne_w, MSA128W>,
>> > >> > +class FCNE_W_DESC : MSA_3RF_DESC_BASE<"fcne.w", vfsetone_v4f32, MSA128W>,
>> > >> >                      IsCommutable;
>> > >> > -class FCNE_D_DESC : MSA_3RF_DESC_BASE<"fcne.d", int_mips_fcne_d, MSA128D>,
>> > >> > +class FCNE_D_DESC : MSA_3RF_DESC_BASE<"fcne.d", vfsetone_v2f64, MSA128D>,
>> > >> >                      IsCommutable;
>> > >> >
>> > >> > -class FCOR_W_DESC : MSA_3RF_DESC_BASE<"fcor.w", int_mips_fcor_w, MSA128W>,
>> > >> > +class FCOR_W_DESC : MSA_3RF_DESC_BASE<"fcor.w", vfsetord_v4f32, MSA128W>,
>> > >> >                      IsCommutable;
>> > >> > -class FCOR_D_DESC : MSA_3RF_DESC_BASE<"fcor.d", int_mips_fcor_d, MSA128D>,
>> > >> > +class FCOR_D_DESC : MSA_3RF_DESC_BASE<"fcor.d", vfsetord_v2f64, MSA128D>,
>> > >> >                      IsCommutable;
>> > >> >
>> > >> > -class FCUEQ_W_DESC : MSA_3RF_DESC_BASE<"fcueq.w", int_mips_fcueq_w, MSA128W>,
>> > >> > +class FCUEQ_W_DESC : MSA_3RF_DESC_BASE<"fcueq.w", vfsetueq_v4f32, MSA128W>,
>> > >> >                       IsCommutable;
>> > >> > -class FCUEQ_D_DESC : MSA_3RF_DESC_BASE<"fcueq.d", int_mips_fcueq_d, MSA128D>,
>> > >> > +class FCUEQ_D_DESC : MSA_3RF_DESC_BASE<"fcueq.d", vfsetueq_v2f64, MSA128D>,
>> > >> >                       IsCommutable;
>> > >> >
>> > >> > -class FCULE_W_DESC : MSA_3RF_DESC_BASE<"fcule.w", int_mips_fcule_w, MSA128W>,
>> > >> > +class FCULE_W_DESC : MSA_3RF_DESC_BASE<"fcule.w", vfsetule_v4f32, MSA128W>,
>> > >> >                       IsCommutable;
>> > >> > -class FCULE_D_DESC : MSA_3RF_DESC_BASE<"fcule.d", int_mips_fcule_d, MSA128D>,
>> > >> > +class FCULE_D_DESC : MSA_3RF_DESC_BASE<"fcule.d", vfsetule_v2f64, MSA128D>,
>> > >> >                       IsCommutable;
>> > >> >
>> > >> > -class FCULT_W_DESC : MSA_3RF_DESC_BASE<"fcult.w", int_mips_fcult_w, MSA128W>,
>> > >> > +class FCULT_W_DESC : MSA_3RF_DESC_BASE<"fcult.w", vfsetult_v4f32, MSA128W>,
>> > >> >                       IsCommutable;
>> > >> > -class FCULT_D_DESC : MSA_3RF_DESC_BASE<"fcult.d", int_mips_fcult_d, MSA128D>,
>> > >> > +class FCULT_D_DESC : MSA_3RF_DESC_BASE<"fcult.d", vfsetult_v2f64, MSA128D>,
>> > >> >                       IsCommutable;
>> > >> >
>> > >> > -class FCUN_W_DESC : MSA_3RF_DESC_BASE<"fcun.w", int_mips_fcun_w, MSA128W>,
>> > >> > +class FCUN_W_DESC : MSA_3RF_DESC_BASE<"fcun.w", vfsetun_v4f32, MSA128W>,
>> > >> >                      IsCommutable;
>> > >> > -class FCUN_D_DESC : MSA_3RF_DESC_BASE<"fcun.d", int_mips_fcun_d, MSA128D>,
>> > >> > +class FCUN_D_DESC : MSA_3RF_DESC_BASE<"fcun.d", vfsetun_v2f64, MSA128D>,
>> > >> >                      IsCommutable;
>> > >> >
>> > >> > -class FCUNE_W_DESC : MSA_3RF_DESC_BASE<"fcune.w", int_mips_fcune_w, MSA128W>,
>> > >> > +class FCUNE_W_DESC : MSA_3RF_DESC_BASE<"fcune.w", vfsetune_v4f32, MSA128W>,
>> > >> >                       IsCommutable;
>> > >> > -class FCUNE_D_DESC : MSA_3RF_DESC_BASE<"fcune.d", int_mips_fcune_d, MSA128D>,
>> > >> > +class FCUNE_D_DESC : MSA_3RF_DESC_BASE<"fcune.d", vfsetune_v2f64, MSA128D>,
>> > >> >                       IsCommutable;
>> > >> >
>> > >> >  class FDIV_W_DESC : MSA_3RF_DESC_BASE<"fdiv.w", fdiv, MSA128W>;
>> > >> >
>> > >> > Modified: llvm/trunk/lib/Target/Mips/MipsSEISelLowering.cpp
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsSEISelLowering.cpp?rev=191286&r1=191285&r2=191286&view=diff
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/lib/Target/Mips/MipsSEISelLowering.cpp (original)
>> > >> > +++ llvm/trunk/lib/Target/Mips/MipsSEISelLowering.cpp Tue Sep 24 05:46:19 2013
>> > >> > @@ -180,6 +180,13 @@ addMSAIntType(MVT::SimpleValueType Ty, c
>> > >> >    setOperationAction(ISD::SUB, Ty, Legal);
>> > >> >    setOperationAction(ISD::UDIV, Ty, Legal);
>> > >> >    setOperationAction(ISD::XOR, Ty, Legal);
>> > >> > +
>> > >> > +  setOperationAction(ISD::SETCC, Ty, Legal);
>> > >> > +  setCondCodeAction(ISD::SETNE, Ty, Expand);
>> > >> > +  setCondCodeAction(ISD::SETGE, Ty, Expand);
>> > >> > +  setCondCodeAction(ISD::SETGT, Ty, Expand);
>> > >> > +  setCondCodeAction(ISD::SETUGE, Ty, Expand);
>> > >> > +  setCondCodeAction(ISD::SETUGT, Ty, Expand);
>> > >> >  }
>> > >> >
>> > >> >  // Enable MSA support for the given floating-point type and Register class.
>> > >> > @@ -204,6 +211,14 @@ addMSAFloatType(MVT::SimpleValueType Ty,
>> > >> >      setOperationAction(ISD::FRINT, Ty, Legal);
>> > >> >      setOperationAction(ISD::FSQRT, Ty, Legal);
>> > >> >      setOperationAction(ISD::FSUB,  Ty, Legal);
>> > >> > +
>> > >> > +    setOperationAction(ISD::SETCC, Ty, Legal);
>> > >> > +    setCondCodeAction(ISD::SETOGE, Ty, Expand);
>> > >> > +    setCondCodeAction(ISD::SETOGT, Ty, Expand);
>> > >> > +    setCondCodeAction(ISD::SETUGE, Ty, Expand);
>> > >> > +    setCondCodeAction(ISD::SETUGT, Ty, Expand);
>> > >> > +    setCondCodeAction(ISD::SETGE,  Ty, Expand);
>> > >> > +    setCondCodeAction(ISD::SETGT,  Ty, Expand);
>> > >> >    }
>> > >> >  }
>> > >> >
>> > >> > @@ -1109,6 +1124,66 @@ SDValue MipsSETargetLowering::lowerINTRI
>> > >> >      return lowerMSABranchIntr(Op, DAG, MipsISD::VALL_ZERO);
>> > >> >    case Intrinsic::mips_bz_v:
>> > >> >      return lowerMSABranchIntr(Op, DAG, MipsISD::VANY_ZERO);
>> > >> > +  case Intrinsic::mips_ceq_b:
>> > >> > +  case Intrinsic::mips_ceq_h:
>> > >> > +  case Intrinsic::mips_ceq_w:
>> > >> > +  case Intrinsic::mips_ceq_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETEQ);
>> > >> > +  case Intrinsic::mips_ceqi_b:
>> > >> > +  case Intrinsic::mips_ceqi_h:
>> > >> > +  case Intrinsic::mips_ceqi_w:
>> > >> > +  case Intrinsic::mips_ceqi_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        lowerMSASplatImm(Op, 2, DAG), ISD::SETEQ);
>> > >> > +  case Intrinsic::mips_cle_s_b:
>> > >> > +  case Intrinsic::mips_cle_s_h:
>> > >> > +  case Intrinsic::mips_cle_s_w:
>> > >> > +  case Intrinsic::mips_cle_s_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETLE);
>> > >> > +  case Intrinsic::mips_clei_s_b:
>> > >> > +  case Intrinsic::mips_clei_s_h:
>> > >> > +  case Intrinsic::mips_clei_s_w:
>> > >> > +  case Intrinsic::mips_clei_s_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        lowerMSASplatImm(Op, 2, DAG), ISD::SETLE);
>> > >> > +  case Intrinsic::mips_cle_u_b:
>> > >> > +  case Intrinsic::mips_cle_u_h:
>> > >> > +  case Intrinsic::mips_cle_u_w:
>> > >> > +  case Intrinsic::mips_cle_u_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETULE);
>> > >> > +  case Intrinsic::mips_clei_u_b:
>> > >> > +  case Intrinsic::mips_clei_u_h:
>> > >> > +  case Intrinsic::mips_clei_u_w:
>> > >> > +  case Intrinsic::mips_clei_u_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        lowerMSASplatImm(Op, 2, DAG), ISD::SETULE);
>> > >> > +  case Intrinsic::mips_clt_s_b:
>> > >> > +  case Intrinsic::mips_clt_s_h:
>> > >> > +  case Intrinsic::mips_clt_s_w:
>> > >> > +  case Intrinsic::mips_clt_s_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETLT);
>> > >> > +  case Intrinsic::mips_clti_s_b:
>> > >> > +  case Intrinsic::mips_clti_s_h:
>> > >> > +  case Intrinsic::mips_clti_s_w:
>> > >> > +  case Intrinsic::mips_clti_s_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        lowerMSASplatImm(Op, 2, DAG), ISD::SETLT);
>> > >> > +  case Intrinsic::mips_clt_u_b:
>> > >> > +  case Intrinsic::mips_clt_u_h:
>> > >> > +  case Intrinsic::mips_clt_u_w:
>> > >> > +  case Intrinsic::mips_clt_u_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETULT);
>> > >> > +  case Intrinsic::mips_clti_u_b:
>> > >> > +  case Intrinsic::mips_clti_u_h:
>> > >> > +  case Intrinsic::mips_clti_u_w:
>> > >> > +  case Intrinsic::mips_clti_u_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        lowerMSASplatImm(Op, 2, DAG), ISD::SETULT);
>> > >> >    case Intrinsic::mips_copy_s_b:
>> > >> >    case Intrinsic::mips_copy_s_h:
>> > >> >    case Intrinsic::mips_copy_s_w:
>> > >> > @@ -1130,6 +1205,47 @@ SDValue MipsSETargetLowering::lowerINTRI
>> > >> >    case Intrinsic::mips_fadd_w:
>> > >> >    case Intrinsic::mips_fadd_d:
>> > >> >      return lowerMSABinaryIntr(Op, DAG, ISD::FADD);
>> > >> > +  // Don't lower mips_fcaf_[wd] since LLVM folds SETFALSE condcodes away
>> > >> > +  case Intrinsic::mips_fceq_w:
>> > >> > +  case Intrinsic::mips_fceq_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETOEQ);
>> > >> > +  case Intrinsic::mips_fcle_w:
>> > >> > +  case Intrinsic::mips_fcle_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETOLE);
>> > >> > +  case Intrinsic::mips_fclt_w:
>> > >> > +  case Intrinsic::mips_fclt_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETOLT);
>> > >> > +  case Intrinsic::mips_fcne_w:
>> > >> > +  case Intrinsic::mips_fcne_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETONE);
>> > >> > +  case Intrinsic::mips_fcor_w:
>> > >> > +  case Intrinsic::mips_fcor_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETO);
>> > >> > +  case Intrinsic::mips_fcueq_w:
>> > >> > +  case Intrinsic::mips_fcueq_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETUEQ);
>> > >> > +  case Intrinsic::mips_fcule_w:
>> > >> > +  case Intrinsic::mips_fcule_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETULE);
>> > >> > +  case Intrinsic::mips_fcult_w:
>> > >> > +  case Intrinsic::mips_fcult_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETULT);
>> > >> > +  case Intrinsic::mips_fcun_w:
>> > >> > +  case Intrinsic::mips_fcun_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETUO);
>> > >> > +  case Intrinsic::mips_fcune_w:
>> > >> > +  case Intrinsic::mips_fcune_d:
>> > >> > +    return DAG.getSetCC(SDLoc(Op), Op->getValueType(0), Op->getOperand(1),
>> > >> > +                        Op->getOperand(2), ISD::SETUNE);
>> > >> >    case Intrinsic::mips_fdiv_w:
>> > >> >    case Intrinsic::mips_fdiv_d:
>> > >> >      return lowerMSABinaryIntr(Op, DAG, ISD::FDIV);
>> > >> >
>> > >> > Added: llvm/trunk/test/CodeGen/Mips/msa/compare.ll
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/msa/compare.ll?rev=191286&view=auto
>> > >> >
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/test/CodeGen/Mips/msa/compare.ll (added)
>> > >> > +++ llvm/trunk/test/CodeGen/Mips/msa/compare.ll Tue Sep 24 05:46:19 2013
>> > >> > @@ -0,0 +1,641 @@
>> > >> > +; RUN: llc -march=mips -mattr=+msa < %s | FileCheck %s
>> > >> > +
>> > >> > +define void @ceq_v16i8(<16 x i8>* %c, <16 x i8>* %a, <16 x i8>* %b) nounwind {
>> > >> > +  ; CHECK: ceq_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <16 x i8>* %b
>> > >> > +  ; CHECK-DAG: ld.b [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp eq <16 x i8> %1, %2
>> > >> > +  %4 = sext <16 x i1> %3 to <16 x i8>
>> > >> > +  ; CHECK-DAG: ceq.b [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <16 x i8> %4, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceq_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceq_v8i16(<8 x i16>* %c, <8 x i16>* %a, <8 x i16>* %b) nounwind {
>> > >> > +  ; CHECK: ceq_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <8 x i16>* %b
>> > >> > +  ; CHECK-DAG: ld.h [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp eq <8 x i16> %1, %2
>> > >> > +  %4 = sext <8 x i1> %3 to <8 x i16>
>> > >> > +  ; CHECK-DAG: ceq.h [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <8 x i16> %4, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceq_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceq_v4i32(<4 x i32>* %c, <4 x i32>* %a, <4 x i32>* %b) nounwind {
>> > >> > +  ; CHECK: ceq_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x i32>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp eq <4 x i32> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: ceq.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceq_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceq_v2i64(<2 x i64>* %c, <2 x i64>* %a, <2 x i64>* %b) nounwind {
>> > >> > +  ; CHECK: ceq_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x i64>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp eq <2 x i64> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: ceq.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceq_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_s_v16i8(<16 x i8>* %c, <16 x i8>* %a, <16 x i8>* %b) nounwind {
>> > >> > +  ; CHECK: cle_s_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <16 x i8>* %b
>> > >> > +  ; CHECK-DAG: ld.b [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp sle <16 x i8> %1, %2
>> > >> > +  %4 = sext <16 x i1> %3 to <16 x i8>
>> > >> > +  ; CHECK-DAG: cle_s.b [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <16 x i8> %4, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_s_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_s_v8i16(<8 x i16>* %c, <8 x i16>* %a, <8 x i16>* %b) nounwind {
>> > >> > +  ; CHECK: cle_s_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <8 x i16>* %b
>> > >> > +  ; CHECK-DAG: ld.h [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp sle <8 x i16> %1, %2
>> > >> > +  %4 = sext <8 x i1> %3 to <8 x i16>
>> > >> > +  ; CHECK-DAG: cle_s.h [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <8 x i16> %4, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_s_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_s_v4i32(<4 x i32>* %c, <4 x i32>* %a, <4 x i32>* %b) nounwind {
>> > >> > +  ; CHECK: cle_s_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x i32>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp sle <4 x i32> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: cle_s.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_s_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_s_v2i64(<2 x i64>* %c, <2 x i64>* %a, <2 x i64>* %b) nounwind {
>> > >> > +  ; CHECK: cle_s_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x i64>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp sle <2 x i64> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: cle_s.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_s_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_u_v16i8(<16 x i8>* %c, <16 x i8>* %a, <16 x i8>* %b) nounwind {
>> > >> > +  ; CHECK: cle_u_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <16 x i8>* %b
>> > >> > +  ; CHECK-DAG: ld.b [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ule <16 x i8> %1, %2
>> > >> > +  %4 = sext <16 x i1> %3 to <16 x i8>
>> > >> > +  ; CHECK-DAG: cle_u.b [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <16 x i8> %4, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_u_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_u_v8i16(<8 x i16>* %c, <8 x i16>* %a, <8 x i16>* %b) nounwind {
>> > >> > +  ; CHECK: cle_u_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <8 x i16>* %b
>> > >> > +  ; CHECK-DAG: ld.h [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ule <8 x i16> %1, %2
>> > >> > +  %4 = sext <8 x i1> %3 to <8 x i16>
>> > >> > +  ; CHECK-DAG: cle_u.h [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <8 x i16> %4, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_u_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_u_v4i32(<4 x i32>* %c, <4 x i32>* %a, <4 x i32>* %b) nounwind {
>> > >> > +  ; CHECK: cle_u_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x i32>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ule <4 x i32> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: cle_u.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_u_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @cle_u_v2i64(<2 x i64>* %c, <2 x i64>* %a, <2 x i64>* %b) nounwind {
>> > >> > +  ; CHECK: cle_u_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x i64>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ule <2 x i64> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: cle_u.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size cle_u_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_s_v16i8(<16 x i8>* %c, <16 x i8>* %a, <16 x i8>* %b) nounwind {
>> > >> > +  ; CHECK: clt_s_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <16 x i8>* %b
>> > >> > +  ; CHECK-DAG: ld.b [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp slt <16 x i8> %1, %2
>> > >> > +  %4 = sext <16 x i1> %3 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clt_s.b [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <16 x i8> %4, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_s_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_s_v8i16(<8 x i16>* %c, <8 x i16>* %a, <8 x i16>* %b) nounwind {
>> > >> > +  ; CHECK: clt_s_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <8 x i16>* %b
>> > >> > +  ; CHECK-DAG: ld.h [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp slt <8 x i16> %1, %2
>> > >> > +  %4 = sext <8 x i1> %3 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clt_s.h [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <8 x i16> %4, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_s_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_s_v4i32(<4 x i32>* %c, <4 x i32>* %a, <4 x i32>* %b) nounwind {
>> > >> > +  ; CHECK: clt_s_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x i32>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp slt <4 x i32> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clt_s.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_s_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_s_v2i64(<2 x i64>* %c, <2 x i64>* %a, <2 x i64>* %b) nounwind {
>> > >> > +  ; CHECK: clt_s_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x i64>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp slt <2 x i64> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clt_s.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_s_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_u_v16i8(<16 x i8>* %c, <16 x i8>* %a, <16 x i8>* %b) nounwind {
>> > >> > +  ; CHECK: clt_u_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <16 x i8>* %b
>> > >> > +  ; CHECK-DAG: ld.b [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ult <16 x i8> %1, %2
>> > >> > +  %4 = sext <16 x i1> %3 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clt_u.b [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <16 x i8> %4, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_u_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_u_v8i16(<8 x i16>* %c, <8 x i16>* %a, <8 x i16>* %b) nounwind {
>> > >> > +  ; CHECK: clt_u_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <8 x i16>* %b
>> > >> > +  ; CHECK-DAG: ld.h [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ult <8 x i16> %1, %2
>> > >> > +  %4 = sext <8 x i1> %3 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clt_u.h [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <8 x i16> %4, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_u_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_u_v4i32(<4 x i32>* %c, <4 x i32>* %a, <4 x i32>* %b) nounwind {
>> > >> > +  ; CHECK: clt_u_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x i32>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ult <4 x i32> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clt_u.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_u_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clt_u_v2i64(<2 x i64>* %c, <2 x i64>* %a, <2 x i64>* %b) nounwind {
>> > >> > +  ; CHECK: clt_u_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x i64>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = icmp ult <2 x i64> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clt_u.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clt_u_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceqi_v16i8(<16 x i8>* %c, <16 x i8>* %a) nounwind {
>> > >> > +  ; CHECK: ceqi_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp eq <16 x i8> %1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
>> > >> > +  %3 = sext <16 x i1> %2 to <16 x i8>
>> > >> > +  ; CHECK-DAG: ceqi.b [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <16 x i8> %3, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceqi_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceqi_v8i16(<8 x i16>* %c, <8 x i16>* %a) nounwind {
>> > >> > +  ; CHECK: ceqi_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp eq <8 x i16> %1, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>> > >> > +  %3 = sext <8 x i1> %2 to <8 x i16>
>> > >> > +  ; CHECK-DAG: ceqi.h [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <8 x i16> %3, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceqi_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceqi_v4i32(<4 x i32>* %c, <4 x i32>* %a) nounwind {
>> > >> > +  ; CHECK: ceqi_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp eq <4 x i32> %1, <i32 1, i32 1, i32 1, i32 1>
>> > >> > +  %3 = sext <4 x i1> %2 to <4 x i32>
>> > >> > +  ; CHECK-DAG: ceqi.w [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <4 x i32> %3, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceqi_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ceqi_v2i64(<2 x i64>* %c, <2 x i64>* %a) nounwind {
>> > >> > +  ; CHECK: ceqi_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp eq <2 x i64> %1, <i64 1, i64 1>
>> > >> > +  %3 = sext <2 x i1> %2 to <2 x i64>
>> > >> > +  ; CHECK-DAG: ceqi.d [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <2 x i64> %3, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ceqi_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_s_v16i8(<16 x i8>* %c, <16 x i8>* %a) nounwind {
>> > >> > +  ; CHECK: clei_s_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp sle <16 x i8> %1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
>> > >> > +  %3 = sext <16 x i1> %2 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clei_s.b [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <16 x i8> %3, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_s_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_s_v8i16(<8 x i16>* %c, <8 x i16>* %a) nounwind {
>> > >> > +  ; CHECK: clei_s_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp sle <8 x i16> %1, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>> > >> > +  %3 = sext <8 x i1> %2 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clei_s.h [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <8 x i16> %3, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_s_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_s_v4i32(<4 x i32>* %c, <4 x i32>* %a) nounwind {
>> > >> > +  ; CHECK: clei_s_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp sle <4 x i32> %1, <i32 1, i32 1, i32 1, i32 1>
>> > >> > +  %3 = sext <4 x i1> %2 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clei_s.w [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <4 x i32> %3, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_s_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_s_v2i64(<2 x i64>* %c, <2 x i64>* %a) nounwind {
>> > >> > +  ; CHECK: clei_s_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp sle <2 x i64> %1, <i64 1, i64 1>
>> > >> > +  %3 = sext <2 x i1> %2 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clei_s.d [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <2 x i64> %3, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_s_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_u_v16i8(<16 x i8>* %c, <16 x i8>* %a) nounwind {
>> > >> > +  ; CHECK: clei_u_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ule <16 x i8> %1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
>> > >> > +  %3 = sext <16 x i1> %2 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clei_u.b [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <16 x i8> %3, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_u_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_u_v8i16(<8 x i16>* %c, <8 x i16>* %a) nounwind {
>> > >> > +  ; CHECK: clei_u_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ule <8 x i16> %1, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>> > >> > +  %3 = sext <8 x i1> %2 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clei_u.h [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <8 x i16> %3, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_u_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_u_v4i32(<4 x i32>* %c, <4 x i32>* %a) nounwind {
>> > >> > +  ; CHECK: clei_u_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ule <4 x i32> %1, <i32 1, i32 1, i32 1, i32 1>
>> > >> > +  %3 = sext <4 x i1> %2 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clei_u.w [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <4 x i32> %3, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_u_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clei_u_v2i64(<2 x i64>* %c, <2 x i64>* %a) nounwind {
>> > >> > +  ; CHECK: clei_u_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ule <2 x i64> %1, <i64 1, i64 1>
>> > >> > +  %3 = sext <2 x i1> %2 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clei_u.d [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <2 x i64> %3, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clei_u_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_s_v16i8(<16 x i8>* %c, <16 x i8>* %a) nounwind {
>> > >> > +  ; CHECK: clti_s_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp slt <16 x i8> %1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
>> > >> > +  %3 = sext <16 x i1> %2 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clti_s.b [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <16 x i8> %3, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_s_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_s_v8i16(<8 x i16>* %c, <8 x i16>* %a) nounwind {
>> > >> > +  ; CHECK: clti_s_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp slt <8 x i16> %1, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>> > >> > +  %3 = sext <8 x i1> %2 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clti_s.h [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <8 x i16> %3, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_s_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_s_v4i32(<4 x i32>* %c, <4 x i32>* %a) nounwind {
>> > >> > +  ; CHECK: clti_s_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp slt <4 x i32> %1, <i32 1, i32 1, i32 1, i32 1>
>> > >> > +  %3 = sext <4 x i1> %2 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clti_s.w [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <4 x i32> %3, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_s_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_s_v2i64(<2 x i64>* %c, <2 x i64>* %a) nounwind {
>> > >> > +  ; CHECK: clti_s_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp slt <2 x i64> %1, <i64 1, i64 1>
>> > >> > +  %3 = sext <2 x i1> %2 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clti_s.d [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <2 x i64> %3, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_s_v2i64
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_u_v16i8(<16 x i8>* %c, <16 x i8>* %a) nounwind {
>> > >> > +  ; CHECK: clti_u_v16i8:
>> > >> > +
>> > >> > +  %1 = load <16 x i8>* %a
>> > >> > +  ; CHECK-DAG: ld.b [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ult <16 x i8> %1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
>> > >> > +  %3 = sext <16 x i1> %2 to <16 x i8>
>> > >> > +  ; CHECK-DAG: clti_u.b [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <16 x i8> %3, <16 x i8>* %c
>> > >> > +  ; CHECK-DAG: st.b [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_u_v16i8
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_u_v8i16(<8 x i16>* %c, <8 x i16>* %a) nounwind {
>> > >> > +  ; CHECK: clti_u_v8i16:
>> > >> > +
>> > >> > +  %1 = load <8 x i16>* %a
>> > >> > +  ; CHECK-DAG: ld.h [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ult <8 x i16> %1, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
>> > >> > +  %3 = sext <8 x i1> %2 to <8 x i16>
>> > >> > +  ; CHECK-DAG: clti_u.h [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <8 x i16> %3, <8 x i16>* %c
>> > >> > +  ; CHECK-DAG: st.h [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_u_v8i16
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_u_v4i32(<4 x i32>* %c, <4 x i32>* %a) nounwind {
>> > >> > +  ; CHECK: clti_u_v4i32:
>> > >> > +
>> > >> > +  %1 = load <4 x i32>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ult <4 x i32> %1, <i32 1, i32 1, i32 1, i32 1>
>> > >> > +  %3 = sext <4 x i1> %2 to <4 x i32>
>> > >> > +  ; CHECK-DAG: clti_u.w [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <4 x i32> %3, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_u_v4i32
>> > >> > +}
>> > >> > +
>> > >> > +define void @clti_u_v2i64(<2 x i64>* %c, <2 x i64>* %a) nounwind {
>> > >> > +  ; CHECK: clti_u_v2i64:
>> > >> > +
>> > >> > +  %1 = load <2 x i64>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = icmp ult <2 x i64> %1, <i64 1, i64 1>
>> > >> > +  %3 = sext <2 x i1> %2 to <2 x i64>
>> > >> > +  ; CHECK-DAG: clti_u.d [[R3:\$w[0-9]+]], [[R1]], 1
>> > >> > +  store <2 x i64> %3, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size clti_u_v2i64
>> > >> > +}
>> > >> >
>> > >> > Added: llvm/trunk/test/CodeGen/Mips/msa/compare_float.ll
>> > >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/msa/compare_float.ll?rev=191286&view=auto
>> > >> >
>> > >> > ==============================================================================
>> > >> > --- llvm/trunk/test/CodeGen/Mips/msa/compare_float.ll (added)
>> > >> > +++ llvm/trunk/test/CodeGen/Mips/msa/compare_float.ll Tue Sep 24 05:46:19 2013
>> > >> > @@ -0,0 +1,518 @@
>> > >> > +; RUN: llc -march=mips -mattr=+msa < %s | FileCheck %s
>> > >> > +
>> > >> > +declare <4 x float> @llvm.mips.fmax.w(<4 x float>, <4 x float>) nounwind
>> > >> > +declare <2 x double> @llvm.mips.fmax.d(<2 x double>, <2 x double>) nounwind
>> > >> > +declare <4 x float> @llvm.mips.fmin.w(<4 x float>, <4 x float>) nounwind
>> > >> > +declare <2 x double> @llvm.mips.fmin.d(<2 x double>, <2 x double>) nounwind
>> > >> > +
>> > >> > +define void @false_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: false_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  %3 = fcmp false <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ret void
>> > >> > +
>> > >> > +  ; (setcc $a, $b, SETFALSE) is always folded, so we won't get fcaf:
>> > >> > +  ; CHECK-DAG: ldi.b [[R1:\$w[0-9]+]], 0
>> > >> > +  ; CHECK-DAG: st.b [[R1]], 0($4)
>> > >> > +  ; CHECK: .size false_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @false_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: false_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  %3 = fcmp false <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ret void
>> > >> > +
>> > >> > +  ; FIXME: This code is correct, but poor. Ideally it would be similar to
>> > >> > +  ;        the code in @false_v4f32
>> > >> > +  ; CHECK-DAG: ldi.b [[R1:\$w[0-9]+]], 0
>> > >> > +  ; CHECK-DAG: slli.d [[R3:\$w[0-9]+]], [[R1]], 63
>> > >> > +  ; CHECK-DAG: srai.d [[R4:\$w[0-9]+]], [[R3]], 63
>> > >> > +  ; CHECK-DAG: st.d [[R4]], 0($4)
>> > >> > +  ; CHECK: .size false_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @oeq_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: oeq_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp oeq <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fceq.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size oeq_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @oeq_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: oeq_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp oeq <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fceq.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size oeq_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @oge_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: oge_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp oge <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcle.w [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size oge_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @oge_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: oge_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp oge <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcle.d [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size oge_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ogt_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ogt_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ogt <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fclt.w [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ogt_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ogt_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ogt_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ogt <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fclt.d [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ogt_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ole_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ole_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ole <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcle.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ole_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ole_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ole_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ole <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcle.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ole_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @olt_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: olt_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp olt <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fclt.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size olt_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @olt_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: olt_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp olt <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fclt.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size olt_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @one_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: one_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp one <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcne.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size one_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @one_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: one_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp one <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcne.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size one_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ord_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ord_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ord <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcor.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ord_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ord_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ord_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ord <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcor.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ord_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ueq_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ueq_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ueq <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcueq.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ueq_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ueq_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ueq_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ueq <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcueq.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ueq_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @uge_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: uge_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp uge <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcule.w [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size uge_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @uge_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: uge_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp uge <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcule.d [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size uge_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ugt_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ugt_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ugt <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcult.w [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ugt_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ugt_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ugt_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ugt <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcult.d [[R3:\$w[0-9]+]], [[R2]], [[R1]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ugt_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ule_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ule_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ule <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcule.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ule_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ule_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ule_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ule <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcule.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ule_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @ult_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: ult_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ult <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcult.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ult_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @ult_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: ult_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp ult <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcult.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size ult_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @uno_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: uno_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  ; CHECK-DAG: ld.w [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  ; CHECK-DAG: ld.w [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp uno <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  ; CHECK-DAG: fcun.w [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ; CHECK-DAG: st.w [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size uno_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @uno_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: uno_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  ; CHECK-DAG: ld.d [[R1:\$w[0-9]+]], 0($5)
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  ; CHECK-DAG: ld.d [[R2:\$w[0-9]+]], 0($6)
>> > >> > +  %3 = fcmp uno <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  ; CHECK-DAG: fcun.d [[R3:\$w[0-9]+]], [[R1]], [[R2]]
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ; CHECK-DAG: st.d [[R3]], 0($4)
>> > >> > +
>> > >> > +  ret void
>> > >> > +  ; CHECK: .size uno_v2f64
>> > >> > +}
>> > >> > +
>> > >> > +define void @true_v4f32(<4 x i32>* %c, <4 x float>* %a, <4 x float>* %b) nounwind {
>> > >> > +  ; CHECK: true_v4f32:
>> > >> > +
>> > >> > +  %1 = load <4 x float>* %a
>> > >> > +  %2 = load <4 x float>* %b
>> > >> > +  %3 = fcmp true <4 x float> %1, %2
>> > >> > +  %4 = sext <4 x i1> %3 to <4 x i32>
>> > >> > +  store <4 x i32> %4, <4 x i32>* %c
>> > >> > +  ret void
>> > >> > +
>> > >> > +  ; (setcc $a, $b, SETTRUE) is always folded, so we won't get fcaf:
>> > >> > +  ; CHECK-DAG: ldi.b [[R1:\$w[0-9]+]], -1
>> > >> > +  ; CHECK-DAG: st.b [[R1]], 0($4)
>> > >> > +  ; CHECK: .size true_v4f32
>> > >> > +}
>> > >> > +
>> > >> > +define void @true_v2f64(<2 x i64>* %c, <2 x double>* %a, <2 x double>* %b) nounwind {
>> > >> > +  ; CHECK: true_v2f64:
>> > >> > +
>> > >> > +  %1 = load <2 x double>* %a
>> > >> > +  %2 = load <2 x double>* %b
>> > >> > +  %3 = fcmp true <2 x double> %1, %2
>> > >> > +  %4 = sext <2 x i1> %3 to <2 x i64>
>> > >> > +  store <2 x i64> %4, <2 x i64>* %c
>> > >> > +  ret void
>> > >> > +
>> > >> > +  ; FIXME: This code is correct, but poor. Ideally it would be similar to
>> > >> > +  ;        the code in @true_v4f32
>> > >> > +  ; CHECK-DAG: ldi.d [[R1:\$w[0-9]+]], 1
>> > >> > +  ; CHECK-DAG: slli.d [[R3:\$w[0-9]+]], [[R1]], 63
>> > >> > +  ; CHECK-DAG: srai.d [[R4:\$w[0-9]+]], [[R3]], 63
>> > >> > +  ; CHECK-DAG: st.d [[R4]], 0($4)
>> > >> > +  ; CHECK: .size true_v2f64
>> > >> > +}
>> > >> >
>> > >> >
>> > >> > _______________________________________________
>> > >> > llvm-commits mailing list
>> > >> > llvm-commits at cs.uiuc.edu
>> > >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


