[llvm] r194542 - SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.

Tom Stellard tom at stellard.net
Wed Nov 20 19:35:40 PST 2013


On Tue, Nov 19, 2013 at 06:40:26PM -0800, Juergen Ributzka wrote:
> Hi Tom
> 
> Great, that makes things much easier. I rewrote the patch and it shouldn’t require any changes in the backend.
> Please let me know if this patch works for you.

cc'ing Owen, because I would like to get these patches into the 3.4
branch.

Juergen,

Your patch does fix the crash, but there is still a performance
regression in the test case I provided.  I have attached a patch to fix
this performance regression.

Owen,

There are two patches attached to this email that I would like to merge
into the 3.4 branch.  The first is a patch from Juergen which fixes a
crash in the R600 backend.  The crash is a regression caused by r194542.

Even with the crash fixed there is still a performance regression which
was uncovered by r194542.  This performance issue is caused by the
DAGLegalizer falling back to stack loads and stores when trying to
expand certain kinds of bitcasts.  The second patch improves the
handling of bitcasts in the legalizer and fixes this regression.

Thanks,
Tom


> 
> Thanks
> 
> Cheers,
> Juergen
> 
> 


> 
> On Nov 19, 2013, at 3:31 PM, Tom Stellard <tom at stellard.net> wrote:
> 
> > Hi Juergen,
> > 
> > Sorry the test case is only meant for VLIW chips, it should work with
> > these arguments to llc:
> > 
> > llc -march=r600 -mcpu=redwood vselect.ll
> > 
> > -Tom
> > 
> > 
> > On Tue, Nov 19, 2013 at 03:16:08PM -0800, Juergen Ributzka wrote:
> >> Hi Tom,
> >> 
> >> even after disabling my changes I get the following error from your new test case during instruction selection:
> >> 
> >> LLVM ERROR: Cannot select: 0x7ffc34825a10: i64 = select 0x7ffc34825910, 0x7ffc3480f510, 0x7ffc3480e210 [ORD=26] [ID=49]
> >>  0x7ffc34825910: i1 = setcc 0x7ffc34825810, 0x7ffc34825710, 0x7ffc3480db10 [ORD=25] [ID=45]
> >>    0x7ffc34825810: i32 = extract_vector_elt 0x7ffc34049810, 0x7ffc34049510 [ORD=25] [ID=41]
> >>      0x7ffc34049810: v2i32 = BUILD_VECTOR 0x7ffc34810c10, 0x7ffc3480e610 [ORD=25] [ID=37]
> >>        0x7ffc34810c10: i32 = extract_vector_elt 0x7ffc3480f110, 0x7ffc34049510 [ORD=24] [ID=32]
> >>          0x7ffc3480f110: v4i32,ch = load 0x7ffc33c160d8, 0x7ffc34046910, 0x7ffc34810210<LD16[undef]> [ORD=24] [ID=27]
> >>            0x7ffc34046910: i64 = add 0x7ffc34047310, 0x7ffc3480f410 [ORD=24] [ID=23]
> >>              0x7ffc34047310: i64,ch = CopyFromReg 0x7ffc33c160d8, 0x7ffc34049910 [ORD=24] [ID=21]
> >>                0x7ffc34049910: i64 = Register %vreg0 [ID=1]
> >>              0x7ffc3480f410: i64 = Constant<52> [ID=4]
> >>            0x7ffc34810210: i64 = undef [ID=3]
> >>          0x7ffc34049510: i32 = Constant<0> [ID=5]
> >>        0x7ffc3480e610: i32 = extract_vector_elt 0x7ffc3480f110, 0x7ffc34049a10 [ORD=24] [ID=31]
> >>          0x7ffc3480f110: v4i32,ch = load 0x7ffc33c160d8, 0x7ffc34046910, 0x7ffc34810210<LD16[undef]> [ORD=24] [ID=27]
> >>            0x7ffc34046910: i64 = add 0x7ffc34047310, 0x7ffc3480f410 [ORD=24] [ID=23]
> >>              0x7ffc34047310: i64,ch = CopyFromReg 0x7ffc33c160d8, 0x7ffc34049910 [ORD=24] [ID=21]
> >>                0x7ffc34049910: i64 = Register %vreg0 [ID=1]
> >>              0x7ffc3480f410: i64 = Constant<52> [ID=4]
> >>            0x7ffc34810210: i64 = undef [ID=3]
> >>          0x7ffc34049a10: i32 = Constant<1> [ID=17]
> >>      0x7ffc34049510: i32 = Constant<0> [ID=5]
> >>    0x7ffc34825710: i32 = extract_vector_elt 0x7ffc3480f310, 0x7ffc34049510 [ORD=25] [ID=26]
> >>      0x7ffc3480f310: v2i32 = BUILD_VECTOR 0x7ffc34049510, 0x7ffc34049510 [ORD=25] [ID=22]
> >>        0x7ffc34049510: i32 = Constant<0> [ID=5]
> >>        0x7ffc34049510: i32 = Constant<0> [ID=5]
> >>      0x7ffc34049510: i32 = Constant<0> [ID=5]
> >>  0x7ffc3480f510: i64 = Constant<0> [ID=7]
> >>  0x7ffc3480e210: i64 = Constant<4> [ID=11]
> >> In function: test_select_v4i64
> >> 
> >> Do you get the same result?
> >> 
> >> -Juergen
> >> 
> >> On Nov 18, 2013, at 5:51 PM, Juergen Ributzka <juergen at apple.com> wrote:
> >> 
> >>> Hi Tom,
> >>> 
> >>> sorry for not coming back to you earlier. I was working on a patch for this problem this weekend, but then I stumbled across another issue that was exposed while working on the patch. The two pending patches for SelectionDAG that I posted to the mailing list are in preparation for fixing this issue. I looked at your second patch and using SplitVecRes_SETCC would fix the problem, but it also splits the SETCC result vector again, instead of using the already split vectors provided by GetSplitVector. GetSplitVector is just a cache of already split vectors for that given operation. I would prefer not to duplicate split vector nodes if they already exist. The issue for your backend is that split never occurred and that is why the assertion is failing. I hope to get this fixed for you soon.
> >>> 
> >>> Cheers,
> >>> Juergen
> >>> 
> >>> On Nov 18, 2013, at 4:49 PM, Tom Stellard <tom at stellard.net> wrote:
> >>> 
> >>>> Hi Juergen,
> >>>> 
> >>>> These two patches fix the crash for me.  The first one is an R600 fix
> >>>> and the second is a fix for illegal SELECT with legal SETCC as its
> >>>> condition.  I'm not so sure patch number 2 is the right way to
> >>>> fix the bug, but it was the most simple fix I could come up with.
> >>>> 
> >>>> -Tom
> >>>> 
> >>>> 
> >>>> On Fri, Nov 15, 2013 at 10:45:11AM -0800, Juergen Ributzka wrote:
> >>>>> Hi Tom,
> >>>>> 
> >>>>> I took a quick look yesterday night and I didn?t see a quick fix either. I will look into it today.
> >>>>> 
> >>>>> Cheers,
> >>>>> Juergen
> >>>>> 
> >>>>> 
> >>>>> On Nov 14, 2013, at 8:45 PM, Tom Stellard <tom at stellard.net> wrote:
> >>>>> 
> >>>>>> On Wed, Nov 13, 2013 at 01:57:54AM -0000, Juergen Ributzka wrote:
> >>>>>>> Author: ributzka
> >>>>>>> Date: Tue Nov 12 19:57:54 2013
> >>>>>>> New Revision: 194542
> >>>>>>> 
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=194542&view=rev
> >>>>>>> Log:
> >>>>>>> SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
> >>>>>>> 
> >>>>>>> This patch reapplies r193676 with an additional fix for the Hexagon backend. The
> >>>>>>> SystemZ backend has already been fixed by r194148.
> >>>>>>> 
> >>>>>>> The Type Legalizer recognizes that VSELECT needs to be split, because the type
> >>>>>>> is to wide for the given target. The same does not always apply to SETCC,
> >>>>>>> because less space is required to encode the result of a comparison. As a result
> >>>>>>> VSELECT is split and SETCC is unrolled into scalar comparisons.
> >>>>>>> 
> >>>>>>> This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
> >>>>>>> Combiner. If a matching pattern is found, then the result mask of SETCC is
> >>>>>>> promoted to the expected vector mask type for the given target. Now the type
> >>>>>>> legalizer will split both VSELECT and SETCC.
> >>>>>>> 
> >>>>>>> This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
> >>>>>>> pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
> >>>>>>> 
> >>>>>>> Reviewed by Nadav
> >>>>>>> 
> >>>>>> 
> >>>>>> Hi Juergen,
> >>>>>> 
> >>>>>> This caused a regression on R600.  I have attached a patch to this email
> >>>>>> with a testcase.  See my comments below.
> >>>>>> 
> >>>>>>> Added:
> >>>>>>> llvm/trunk/test/CodeGen/X86/vec_split.ll
> >>>>>>> Modified:
> >>>>>>> llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >>>>>>> llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
> >>>>>>> llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h
> >>>>>>> llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >>>>>>> 
> >>>>>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=194542&r1=194541&r2=194542&view=diff
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> >>>>>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue Nov 12 19:57:54 2013
> >>>>>>> @@ -4364,6 +4364,29 @@ SDValue DAGCombiner::visitVSELECT(SDNode
> >>>>>>>  }
> >>>>>>> }
> >>>>>>> 
> >>>>>>> +  // Treat SETCC as a vector mask and promote the result type based on the
> >>>>>>> +  // targets expected SETCC result type. This will ensure that SETCC and VSELECT
> >>>>>>> +  // are both split by the type legalizer. This is done to prevent the type
> >>>>>>> +  // legalizer from unrolling SETCC into scalar comparions.
> >>>>>>> +  EVT SelectVT = N->getValueType(0);
> >>>>>>> +  EVT MaskVT = getSetCCResultType(SelectVT);
> >>>>>>> +  assert(MaskVT.isVector() && "Expected a vector type.");
> >>>>>>> +  if (N0.getOpcode() == ISD::SETCC && N0.getValueType() != MaskVT) {
> >>>>>>> +    SDLoc MaskDL(N0);
> >>>>>>> +
> >>>>>>> +    // Extend the mask to the desired value type.
> >>>>>>> +    ISD::NodeType ExtendCode =
> >>>>>>> +      TargetLowering::getExtendForContent(TLI.getBooleanContents(true));
> >>>>>>> +    SDValue Mask = DAG.getNode(ExtendCode, MaskDL, MaskVT, N0);
> >>>>>>> +
> >>>>>>> +    AddToWorkList(Mask.getNode());
> >>>>>>> +
> >>>>>>> +    SDValue LHS = N->getOperand(1);
> >>>>>>> +    SDValue RHS = N->getOperand(2);
> >>>>>>> +
> >>>>>>> +    return DAG.getNode(ISD::VSELECT, DL, SelectVT, Mask, LHS, RHS);
> >>>>>>> +  }
> >>>>>>> +
> >>>>>>> return SDValue();
> >>>>>>> }
> >>>>>>> 
> >>>>>>> 
> >>>>>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp?rev=194542&r1=194541&r2=194542&view=diff
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp (original)
> >>>>>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp Tue Nov 12 19:57:54 2013
> >>>>>>> @@ -492,14 +492,19 @@ void DAGTypeLegalizer::SplitRes_SELECT(S
> >>>>>>> SDValue Cond = N->getOperand(0);
> >>>>>>> CL = CH = Cond;
> >>>>>>> if (Cond.getValueType().isVector()) {
> >>>>>>> -    assert(Cond.getValueType().getVectorElementType() == MVT::i1 &&
> >>>>>>> -           "Condition legalized before result?");
> >>>>>>> -    unsigned NumElements = Cond.getValueType().getVectorNumElements();
> >>>>>>> -    EVT VCondTy = EVT::getVectorVT(*DAG.getContext(), MVT::i1, NumElements / 2);
> >>>>>>> -    CL = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VCondTy, Cond,
> >>>>>>> -                     DAG.getConstant(0, TLI.getVectorIdxTy()));
> >>>>>>> -    CH = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VCondTy, Cond,
> >>>>>>> -                     DAG.getConstant(NumElements / 2, TLI.getVectorIdxTy()));
> >>>>>>> +    if (Cond.getOpcode() == ISD::SETCC) {
> >>>>>>> +      assert(Cond.getValueType() == getSetCCResultType(N->getValueType(0)) &&
> >>>>>>> +             "Condition has not been prepared for split!");
> >>>>>>> +      GetSplitVector(Cond, CL, CH);
> >>>>>> 
> >>>>>> Even with the adjustment to getSetCCResultType in the attached patch,
> >>>>>> when GetSplitVector() is called here, I am getting an assertion failure:
> >>>>>> 
> >>>>>> llc: LegalizeTypes.cpp:828: void llvm::DAGTypeLegalizer::GetSplitVector(llvm::SDValue, llvm::SDValue &, llvm::SDValue &): Assertion `Entry.first.getNode() && "Operand isn't split"' failed.
> >>>>>> 
> >>>>>> From what I can tell the problem here is that 'Cond' has a value type
> >>>>>> of v4i32 which is legal on R600, so the operation was not split, and
> >>>>>> therefore there is no entry in the SplitVectors map for it (This is what
> >>>>>> is causing the assertion failure).
> >>>>>> 
> >>>>>> The SELECT in this example has a type of v4i64 which is not a legal type
> >>>>>> on R600, so the problem occurs when a SELECT node has an illegal type,
> >>>>>> but Operand 0 is a SETCC node with a legal type.  I'm not sure the best
> >>>>>> way to fix this issue.  Do you have any suggestions?
> >>>>>> 
> >>>>>> Thanks,
> >>>>>> Tom
> >>>>>> 
> >>>>>> 
> >>>>>>> +    } else {
> >>>>>>> +      EVT ETy = Cond.getValueType().getVectorElementType();
> >>>>>>> +      unsigned NumElements = Cond.getValueType().getVectorNumElements();
> >>>>>>> +      EVT VCondTy = EVT::getVectorVT(*DAG.getContext(), ETy, NumElements / 2);
> >>>>>>> +      CL = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VCondTy, Cond,
> >>>>>>> +                       DAG.getConstant(0, TLI.getVectorIdxTy()));
> >>>>>>> +      CH = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VCondTy, Cond,
> >>>>>>> +                       DAG.getConstant(NumElements / 2, TLI.getVectorIdxTy()));
> >>>>>>> +    }
> >>>>>>> }
> >>>>>>> 
> >>>>>>> Lo = DAG.getNode(N->getOpcode(), dl, LL.getValueType(), CL, LL, RL);
> >>>>>>> 
> >>>>>>> Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h?rev=194542&r1=194541&r2=194542&view=diff
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h (original)
> >>>>>>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h Tue Nov 12 19:57:54 2013
> >>>>>>> @@ -141,8 +141,11 @@ namespace llvm {
> >>>>>>> 
> >>>>>>>  SDValue  LowerVASTART(SDValue Op, SelectionDAG &DAG) const;
> >>>>>>>  SDValue  LowerConstantPool(SDValue Op, SelectionDAG &DAG) const;
> >>>>>>> -    virtual EVT getSetCCResultType(LLVMContext &, EVT) const {
> >>>>>>> -      return MVT::i1;
> >>>>>>> +    virtual EVT getSetCCResultType(LLVMContext &C, EVT VT) const {
> >>>>>>> +      if (!VT.isVector())
> >>>>>>> +        return MVT::i1;
> >>>>>>> +      else
> >>>>>>> +        return EVT::getVectorVT(C, MVT::i1, VT.getVectorNumElements());
> >>>>>>>  }
> >>>>>>> 
> >>>>>>>  virtual bool getPostIndexedAddressParts(SDNode *N, SDNode *Op,
> >>>>>>> 
> >>>>>>> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=194542&r1=194541&r2=194542&view=diff
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> >>>>>>> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Nov 12 19:57:54 2013
> >>>>>>> @@ -1547,7 +1547,16 @@ void X86TargetLowering::resetOperationAc
> >>>>>>> }
> >>>>>>> 
> >>>>>>> EVT X86TargetLowering::getSetCCResultType(LLVMContext &, EVT VT) const {
> >>>>>>> -  if (!VT.isVector()) return MVT::i8;
> >>>>>>> +  if (!VT.isVector())
> >>>>>>> +    return MVT::i8;
> >>>>>>> +
> >>>>>>> +  const TargetMachine &TM = getTargetMachine();
> >>>>>>> +  if (!TM.Options.UseSoftFloat && Subtarget->hasAVX512())
> >>>>>>> +    switch(VT.getVectorNumElements()) {
> >>>>>>> +    case  8: return MVT::v8i1;
> >>>>>>> +    case 16: return MVT::v16i1;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> return VT.changeVectorElementTypeToInteger();
> >>>>>>> }
> >>>>>>> 
> >>>>>>> 
> >>>>>>> Added: llvm/trunk/test/CodeGen/X86/vec_split.ll
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_split.ll?rev=194542&view=auto
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/test/CodeGen/X86/vec_split.ll (added)
> >>>>>>> +++ llvm/trunk/test/CodeGen/X86/vec_split.ll Tue Nov 12 19:57:54 2013
> >>>>>>> @@ -0,0 +1,42 @@
> >>>>>>> +; RUN: llc -march=x86-64 -mcpu=corei7 < %s | FileCheck %s -check-prefix=SSE4
> >>>>>>> +; RUN: llc -march=x86-64 -mcpu=corei7-avx < %s | FileCheck %s -check-prefix=AVX1
> >>>>>>> +; RUN: llc -march=x86-64 -mcpu=core-avx2 < %s | FileCheck %s -check-prefix=AVX2
> >>>>>>> +
> >>>>>>> +define <16 x i16> @split16(<16 x i16> %a, <16 x i16> %b, <16 x i8> %__mask) {
> >>>>>>> +; SSE4-LABEL: split16:
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: ret
> >>>>>>> +; AVX1-LABEL: split16:
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: ret
> >>>>>>> +; AVX2-LABEL: split16:
> >>>>>>> +; AVX2: vpminuw
> >>>>>>> +; AVX2: ret
> >>>>>>> +  %1 = icmp ult <16 x i16> %a, %b
> >>>>>>> +  %2 = select <16 x i1> %1, <16 x i16> %a, <16 x i16> %b
> >>>>>>> +  ret <16 x i16> %2
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +define <32 x i16> @split32(<32 x i16> %a, <32 x i16> %b, <32 x i8> %__mask) {
> >>>>>>> +; SSE4-LABEL: split32:
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: pminuw
> >>>>>>> +; SSE4: ret
> >>>>>>> +; AVX1-LABEL: split32:
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: vpminuw
> >>>>>>> +; AVX1: ret
> >>>>>>> +; AVX2-LABEL: split32:
> >>>>>>> +; AVX2: vpminuw
> >>>>>>> +; AVX2: vpminuw
> >>>>>>> +; AVX2: ret
> >>>>>>> +  %1 = icmp ult <32 x i16> %a, %b
> >>>>>>> +  %2 = select <32 x i1> %1, <32 x i16> %a, <32 x i16> %b
> >>>>>>> +  ret <32 x i16> %2
> >>>>>>> +}
> >>>>>>> 
> >>>>>>> 
> >>>>>>> _______________________________________________
> >>>>>>> llvm-commits mailing list
> >>>>>>> llvm-commits at cs.uiuc.edu
> >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>>>>> <0001-R600-Fix-getSetCCResultType-to-handle-new-VSELECT-DA.patch>
> >>>>> 
> >>>> <0001-R600-Fix-getSetCCResultType-to-handle-new-VSELECT-DA.patch><0002-SelectionDAG-Fix-assertion-failure-when-splitting-ve.patch>
> >>> 
> >>> 
> >>> _______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> 
> 

-------------- next part --------------
>From c2754e0cd5f018e83ddce0947705c17460c10167 Mon Sep 17 00:00:00 2001
From: Juergen Ributzka <juergen at apple.com>
Date: Tue, 19 Nov 2013 18:25:20 -0800
Subject: [PATCH 1/2] Split SETCC if VSELECT requires splitting too.

This patch is a rewrite of the original patch commited in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now peform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.
---
 include/llvm/CodeGen/SelectionDAG.h               |  3 +-
 lib/CodeGen/SelectionDAG/DAGCombiner.cpp          | 56 ++++++++++++++++-------
 lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp |  8 ++--
 3 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/include/llvm/CodeGen/SelectionDAG.h b/include/llvm/CodeGen/SelectionDAG.h
index 4b3f904..82becca 100644
--- a/include/llvm/CodeGen/SelectionDAG.h
+++ b/include/llvm/CodeGen/SelectionDAG.h
@@ -1149,7 +1149,8 @@ public:
 
   /// SplitVectorOperand - Split the node's operand with EXTRACT_SUBVECTOR and
   /// return the low/high part.
-  std::pair<SDValue, SDValue> SplitVectorOperand(SDNode *N, unsigned OpNo) {
+  std::pair<SDValue, SDValue> SplitVectorOperand(const SDNode *N, unsigned OpNo)
+  {
     return SplitVector(N->getOperand(OpNo), SDLoc(N));
   }
 
diff --git a/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 15b4ddc..20b0981 100644
--- a/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -4327,6 +4327,23 @@ SDValue DAGCombiner::visitSELECT(SDNode *N) {
   return SDValue();
 }
 
+static
+std::pair<SDValue, SDValue> SplitVSETCC(const SDNode *N, SelectionDAG &DAG) {
+  SDLoc DL(N);
+  EVT LoVT, HiVT;
+  llvm::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
+
+  // Split the inputs.
+  SDValue Lo, Hi, LL, LH, RL, RH;
+  llvm::tie(LL, LH) = DAG.SplitVectorOperand(N, 0);
+  llvm::tie(RL, RH) = DAG.SplitVectorOperand(N, 1);
+
+  Lo = DAG.getNode(N->getOpcode(), DL, LoVT, LL, RL, N->getOperand(2));
+  Hi = DAG.getNode(N->getOpcode(), DL, HiVT, LH, RH, N->getOperand(2));
+
+  return std::make_pair(Lo, Hi);
+}
+
 SDValue DAGCombiner::visitVSELECT(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
@@ -4364,27 +4381,32 @@ SDValue DAGCombiner::visitVSELECT(SDNode *N) {
     }
   }
 
-  // Treat SETCC as a vector mask and promote the result type based on the
-  // targets expected SETCC result type. This will ensure that SETCC and VSELECT
-  // are both split by the type legalizer. This is done to prevent the type
-  // legalizer from unrolling SETCC into scalar comparions.
-  EVT SelectVT = N->getValueType(0);
-  EVT MaskVT = getSetCCResultType(SelectVT);
-  assert(MaskVT.isVector() && "Expected a vector type.");
-  if (N0.getOpcode() == ISD::SETCC && N0.getValueType() != MaskVT) {
-    SDLoc MaskDL(N0);
+  // If the VSELECT result requires splitting and the mask is provided by a
+  // SETCC, then split both nodes and its operands before legalization. This
+  // prevents the type legalizer from unrolling SETCC into scalar comparisons
+  // and enables future optimizations (e.g. min/max pattern matching on X86).
+  if (N0.getOpcode() == ISD::SETCC) {
+    EVT VT = N->getValueType(0);
+
+    // Check if any splitting is required.
+    if (TLI.getTypeAction(*DAG.getContext(), VT) !=
+        TargetLowering::TypeSplitVector)
+      return SDValue();
 
-    // Extend the mask to the desired value type.
-    ISD::NodeType ExtendCode =
-      TargetLowering::getExtendForContent(TLI.getBooleanContents(true));
-    SDValue Mask = DAG.getNode(ExtendCode, MaskDL, MaskVT, N0);
+    SDValue Lo, Hi, CCLo, CCHi, LL, LH, RL, RH;
+    llvm::tie(CCLo, CCHi) = SplitVSETCC(N0.getNode(), DAG);
+    llvm::tie(LL, LH) = DAG.SplitVectorOperand(N, 1);
+    llvm::tie(RL, RH) = DAG.SplitVectorOperand(N, 2);
 
-    AddToWorkList(Mask.getNode());
+    Lo = DAG.getNode(N->getOpcode(), DL, LL.getValueType(), CCLo, LL, RL);
+    Hi = DAG.getNode(N->getOpcode(), DL, LH.getValueType(), CCHi, LH, RH);
 
-    SDValue LHS = N->getOperand(1);
-    SDValue RHS = N->getOperand(2);
+    // Add the new VSELECT nodes to the work list in case they need to be split
+    // again.
+    AddToWorkList(Lo.getNode());
+    AddToWorkList(Hi.getNode());
 
-    return DAG.getNode(ISD::VSELECT, DL, SelectVT, Mask, LHS, RHS);
+    return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, Lo, Hi);
   }
 
   return SDValue();
diff --git a/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp b/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
index 9e97c49..b3174fd 100644
--- a/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
+++ b/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
@@ -488,11 +488,11 @@ void DAGTypeLegalizer::SplitRes_SELECT(SDNode *N, SDValue &Lo,
   SDValue Cond = N->getOperand(0);
   CL = CH = Cond;
   if (Cond.getValueType().isVector()) {
-    if (Cond.getOpcode() == ISD::SETCC) {
-      assert(Cond.getValueType() == getSetCCResultType(N->getValueType(0)) &&
-             "Condition has not been prepared for split!");
+    // Check if there are already splitted versions of the vector available and
+    // use those instead of splitting the mask operand again.
+    if (getTypeAction(Cond.getValueType()) == TargetLowering::TypeSplitVector)
       GetSplitVector(Cond, CL, CH);
-    } else
+    else
       llvm::tie(CL, CH) = DAG.SplitVector(Cond, dl);
   }
 
-- 
1.8.1.4

-------------- next part --------------
>From 6214abfc4b653076dcd13e70b9f9049bcbfa5d62 Mon Sep 17 00:00:00 2001
From: Tom Stellard <thomas.stellard at amd.com>
Date: Wed, 20 Nov 2013 17:08:17 -0800
Subject: [PATCH 2/2] SelectionDAG: Optimize expansion of vec_type = BITCAST
 scalar_type

The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.
---
 lib/CodeGen/SelectionDAG/LegalizeTypes.h          |  6 +++
 lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp | 45 ++++++++++++++++++-----
 test/CodeGen/R600/vselect64.ll                    | 15 ++++++++
 3 files changed, 56 insertions(+), 10 deletions(-)
 create mode 100644 test/CodeGen/R600/vselect64.ll

diff --git a/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index 7a03f85..13bb08f 100644
--- a/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -731,6 +731,12 @@ private:
       GetExpandedFloat(Op, Lo, Hi);
   }
 
+
+  /// This function will split the integer \p Op into \p NumElements
+  /// operations of type \p EltVT and store them in \p Ops.
+  void IntegerToVector(SDValue Op, unsigned NumElements,
+                       SmallVectorImpl<SDValue> &Ops, EVT EltVT);
+
   // Generic Result Expansion.
   void ExpandRes_MERGE_VALUES      (SDNode *N, unsigned ResNo,
                                     SDValue &Lo, SDValue &Hi);
diff --git a/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp b/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
index b3174fd..c749fde 100644
--- a/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
+++ b/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
@@ -306,6 +306,25 @@ void DAGTypeLegalizer::ExpandRes_VAARG(SDNode *N, SDValue &Lo, SDValue &Hi) {
 // Generic Operand Expansion.
 //===--------------------------------------------------------------------===//
 
+void DAGTypeLegalizer::IntegerToVector(SDValue Op, unsigned NumElements,
+                                       SmallVectorImpl<SDValue> &Ops,
+                                       EVT EltVT) {
+  assert(Op.getValueType().isInteger());
+  SDLoc DL(Op);
+  SDValue Parts[2];
+
+  if (NumElements > 1) {
+    NumElements >>= 1;
+    SplitInteger(Op, Parts[0], Parts[1]);
+      if (TLI.isBigEndian())
+        std::swap(Parts[0], Parts[1]);
+    IntegerToVector(Parts[0], NumElements, Ops, EltVT);
+    IntegerToVector(Parts[1], NumElements, Ops, EltVT);
+  } else {
+    Ops.push_back(DAG.getNode(ISD::BITCAST, DL, EltVT, Op));
+  }
+}
+
 SDValue DAGTypeLegalizer::ExpandOp_BITCAST(SDNode *N) {
   SDLoc dl(N);
   if (N->getValueType(0).isVector()) {
@@ -314,21 +333,27 @@ SDValue DAGTypeLegalizer::ExpandOp_BITCAST(SDNode *N) {
     // instead, but only if the new vector type is legal (otherwise there
     // is no point, and it might create expansion loops).  For example, on
     // x86 this turns v1i64 = BITCAST i64 into v1i64 = BITCAST v2i32.
+    //
+    // FIXME: I'm not sure why we are first trying to split the input into
+    // a 2 element vector, so I'm leaving it here to maintain the current
+    // behavior.
+    unsigned NumElts = 2;
     EVT OVT = N->getOperand(0).getValueType();
     EVT NVT = EVT::getVectorVT(*DAG.getContext(),
                                TLI.getTypeToTransformTo(*DAG.getContext(), OVT),
-                               2);
-
-    if (isTypeLegal(NVT)) {
-      SDValue Parts[2];
-      GetExpandedOp(N->getOperand(0), Parts[0], Parts[1]);
+                               NumElts);
+    if (!isTypeLegal(NVT)) {
+      // If we can't find a legal type by splitting the integer in half,
+      // then we can use the node's value type.
+      NumElts = N->getValueType(0).getVectorNumElements();
+      NVT = N->getValueType(0);
+    }
 
-      if (TLI.isBigEndian())
-        std::swap(Parts[0], Parts[1]);
+    SmallVector<SDValue, 8> Ops;
+    IntegerToVector(N->getOperand(0), NumElts, Ops, NVT.getVectorElementType());
 
-      SDValue Vec = DAG.getNode(ISD::BUILD_VECTOR, dl, NVT, Parts, 2);
-      return DAG.getNode(ISD::BITCAST, dl, N->getValueType(0), Vec);
-    }
+    SDValue Vec = DAG.getNode(ISD::BUILD_VECTOR, dl, NVT, &Ops[0], NumElts);
+    return DAG.getNode(ISD::BITCAST, dl, N->getValueType(0), Vec);
   }
 
   // Otherwise, store to a temporary and load out again as the new type.
diff --git a/test/CodeGen/R600/vselect64.ll b/test/CodeGen/R600/vselect64.ll
new file mode 100644
index 0000000..604695b
--- /dev/null
+++ b/test/CodeGen/R600/vselect64.ll
@@ -0,0 +1,15 @@
+; RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck  %s
+; XXX: Merge this test into vselect.ll once SI supports 64-bit select.
+
+; CHECK-LABEL: @test_select_v4i64
+; Make sure the vectors aren't being stored on the stack.  We know they are
+; being stored on the stack if the shaders uses at leat 10 registers.
+; CHECK-NOT: {{\**}} MOV T{{[0-9][0-9]}}.X
+define void @test_select_v4i64(<4 x i64> addrspace(1)* %out, <4 x i32> %c) {
+entry:
+       %cmp = icmp ne  <4 x i32> %c, <i32 0, i32 0, i32 0, i32 0>
+       %result = select <4 x i1> %cmp, <4 x i64> <i64 0, i64 1, i64 2, i64 3>, <4 x i64> <i64 4, i64 5, i64 6, i64 7>
+       store <4 x i64> %result, <4 x i64> addrspace(1)* %out
+       ret void
+}
+
-- 
1.8.1.4



More information about the llvm-commits mailing list