[llvm] r211478 - R600: Use LowerSDIVREM for i64 node replace

Jan Vesely jan.vesely at rutgers.edu
Tue Jun 24 05:35:53 PDT 2014


On Tue, 2014-06-24 at 00:46 -0700, Matt Arsenault wrote:
> On Jun 22, 2014, at 2:43 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> 
> > Author: jvesely
> > Date: Sun Jun 22 16:43:01 2014
> > New Revision: 211478
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=211478&view=rev
> > Log:
> > R600: Use LowerSDIVREM for i64 node replace
> > 
> > v2: move div/rem node replacement to R600ISelLowering
> >    make lowerSDIVREM protected
> > 
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > 
> > Modified:
> >    llvm/trunk/lib/Target/R600/AMDGPUISelLowering.cpp
> >    llvm/trunk/lib/Target/R600/AMDGPUISelLowering.h
> >    llvm/trunk/lib/Target/R600/R600ISelLowering.cpp
> > 
> > Modified: llvm/trunk/lib/Target/R600/AMDGPUISelLowering.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/AMDGPUISelLowering.cpp?rev=211478&r1=211477&r2=211478&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/R600/AMDGPUISelLowering.cpp (original)
> > +++ llvm/trunk/lib/Target/R600/AMDGPUISelLowering.cpp Sun Jun 22 16:43:01 2014
> > @@ -530,96 +530,6 @@ void AMDGPUTargetLowering::ReplaceNodeRe
> >     // ReplaceNodeResults to sext_in_reg to an illegal type, so we'll just do
> >     // nothing here and let the illegal result integer be handled normally.
> >     return;
> > -  case ISD::UDIV: {
> > -    SDValue Op = SDValue(N, 0);
> > -    SDLoc DL(Op);
> > -    EVT VT = Op.getValueType();
> > -    SDValue UDIVREM = DAG.getNode(ISD::UDIVREM, DL, DAG.getVTList(VT, VT),
> > -      N->getOperand(0), N->getOperand(1));
> > -    Results.push_back(UDIVREM);
> > -    break;
> > -  }
> > -  case ISD::UREM: {
> > -    SDValue Op = SDValue(N, 0);
> > -    SDLoc DL(Op);
> > -    EVT VT = Op.getValueType();
> > -    SDValue UDIVREM = DAG.getNode(ISD::UDIVREM, DL, DAG.getVTList(VT, VT),
> > -      N->getOperand(0), N->getOperand(1));
> > -    Results.push_back(UDIVREM.getValue(1));
> > -    break;
> > -  }
> > -  case ISD::UDIVREM: {
> > -    SDValue Op = SDValue(N, 0);
> > -    SDLoc DL(Op);
> > -    EVT VT = Op.getValueType();
> > -    EVT HalfVT = VT.getHalfSizedIntegerVT(*DAG.getContext());
> > -
> > -    SDValue one = DAG.getConstant(1, HalfVT);
> > -    SDValue zero = DAG.getConstant(0, HalfVT);
> > -
> > -    //HiLo split
> > -    SDValue LHS = N->getOperand(0);
> > -    SDValue LHS_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, LHS, zero);
> > -    SDValue LHS_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, LHS, one);
> > -
> > -    SDValue RHS = N->getOperand(1);
> > -    SDValue RHS_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, RHS, zero);
> > -    SDValue RHS_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, RHS, one);
> > -
> > -    // Get Speculative values
> > -    SDValue DIV_Part = DAG.getNode(ISD::UDIV, DL, HalfVT, LHS_Hi, RHS_Lo);
> > -    SDValue REM_Part = DAG.getNode(ISD::UREM, DL, HalfVT, LHS_Hi, RHS_Lo);
> > -
> > -    SDValue REM_Hi = zero;
> > -    SDValue REM_Lo = DAG.getSelectCC(DL, RHS_Hi, zero, REM_Part, LHS_Hi, ISD::SETEQ);
> > -
> > -    SDValue DIV_Hi = DAG.getSelectCC(DL, RHS_Hi, zero, DIV_Part, zero, ISD::SETEQ);
> > -    SDValue DIV_Lo = zero;
> > -
> > -    const unsigned halfBitWidth = HalfVT.getSizeInBits();
> > -
> > -    for (unsigned i = 0; i < halfBitWidth; ++i) {
> > -      SDValue POS = DAG.getConstant(halfBitWidth - i - 1, HalfVT);
> > -      // Get Value of high bit
> > -      SDValue HBit;
> > -      if (halfBitWidth == 32 && Subtarget->hasBFE()) {
> > -        HBit = DAG.getNode(AMDGPUISD::BFE_U32, DL, HalfVT, LHS_Lo, POS, one);
> > -      } else {
> > -        HBit = DAG.getNode(ISD::SRL, DL, HalfVT, LHS_Lo, POS);
> > -        HBit = DAG.getNode(ISD::AND, DL, HalfVT, HBit, one);
> > -      }
> > -
> > -      SDValue Carry = DAG.getNode(ISD::SRL, DL, HalfVT, REM_Lo,
> > -        DAG.getConstant(halfBitWidth - 1, HalfVT));
> > -      REM_Hi = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Hi, one);
> > -      REM_Hi = DAG.getNode(ISD::OR, DL, HalfVT, REM_Hi, Carry);
> > -
> > -      REM_Lo = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Lo, one);
> > -      REM_Lo = DAG.getNode(ISD::OR, DL, HalfVT, REM_Lo, HBit);
> > -
> > -
> > -      SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > -
> > -      SDValue BIT = DAG.getConstant(1 << (halfBitWidth - i - 1), HalfVT);
> > -      SDValue realBIT = DAG.getSelectCC(DL, REM, RHS, BIT, zero, ISD::SETGE);
> > -
> > -      DIV_Lo = DAG.getNode(ISD::OR, DL, HalfVT, DIV_Lo, realBIT);
> > -
> > -      // Update REM
> > -
> > -      SDValue REM_sub = DAG.getNode(ISD::SUB, DL, VT, REM, RHS);
> > -
> > -      REM = DAG.getSelectCC(DL, REM, RHS, REM_sub, REM, ISD::SETGE);
> > -      REM_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, zero);
> > -      REM_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, one);
> > -    }
> > -
> > -    SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > -    SDValue DIV = DAG.getNode(ISD::BUILD_PAIR, DL, VT, DIV_Lo, DIV_Hi);
> > -    Results.push_back(DIV);
> > -    Results.push_back(REM);
> > -    break;
> > -  }
> >   default:
> >     return;
> >   }
> > 
> > Modified: llvm/trunk/lib/Target/R600/AMDGPUISelLowering.h
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/AMDGPUISelLowering.h?rev=211478&r1=211477&r2=211478&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/R600/AMDGPUISelLowering.h (original)
> > +++ llvm/trunk/lib/Target/R600/AMDGPUISelLowering.h Sun Jun 22 16:43:01 2014
> > @@ -50,7 +50,6 @@ private:
> >   SDValue LowerSREM(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerSREM32(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerSREM64(SDValue Op, SelectionDAG &DAG) const;
> > -  SDValue LowerSDIVREM(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerUDIVREM(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerFCEIL(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerFTRUNC(SDValue Op, SelectionDAG &DAG) const;
> > @@ -83,6 +82,7 @@ protected:
> >   SDValue SplitVectorStore(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerLOAD(SDValue Op, SelectionDAG &DAG) const;
> >   SDValue LowerSTORE(SDValue Op, SelectionDAG &DAG) const;
> > +  SDValue LowerSDIVREM(SDValue Op, SelectionDAG &DAG) const;
> >   bool isHWTrueValue(SDValue Op) const;
> >   bool isHWFalseValue(SDValue Op) const;
> > 
> > 
> > Modified: llvm/trunk/lib/Target/R600/R600ISelLowering.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/R600/R600ISelLowering.cpp?rev=211478&r1=211477&r2=211478&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/R600/R600ISelLowering.cpp (original)
> > +++ llvm/trunk/lib/Target/R600/R600ISelLowering.cpp Sun Jun 22 16:43:01 2014
> > @@ -156,6 +156,8 @@ R600TargetLowering::R600TargetLowering(T
> >   // during Type Legalization
> >   setOperationAction(ISD::UDIV, MVT::i64, Custom);
> >   setOperationAction(ISD::UREM, MVT::i64, Custom);
> > +  setOperationAction(ISD::SDIV, MVT::i64, Custom);
> > +  setOperationAction(ISD::SREM, MVT::i64, Custom);
> > 
> >   // We don't have 64-bit shifts. Thus we need either SHX i64 or SHX_PARTS i32
> >   //  to be Legal/Custom in order to avoid library calls.
> > @@ -826,11 +828,127 @@ void R600TargetLowering::ReplaceNodeResu
> >     DAG.ReplaceAllUsesOfValueWith(SDValue(N,1), SDValue(Node, 1));
> >     return;
> >   }
> > -  case ISD::STORE:
> > +  case ISD::STORE: {
> >     SDNode *Node = LowerSTORE(SDValue(N, 0), DAG).getNode();
> >     Results.push_back(SDValue(Node, 0));
> >     return;
> >   }
> > +  case ISD::UDIV: {
> > +    SDValue Op = SDValue(N, 0);
> > +    SDLoc DL(Op);
> > +    EVT VT = Op.getValueType();
> > +    SDValue UDIVREM = DAG.getNode(ISD::UDIVREM, DL, DAG.getVTList(VT, VT),
> > +      N->getOperand(0), N->getOperand(1));
> > +    Results.push_back(UDIVREM);
> > +    break;
> > +  }
> > +  case ISD::UREM: {
> > +    SDValue Op = SDValue(N, 0);
> > +    SDLoc DL(Op);
> > +    EVT VT = Op.getValueType();
> > +    SDValue UDIVREM = DAG.getNode(ISD::UDIVREM, DL, DAG.getVTList(VT, VT),
> > +      N->getOperand(0), N->getOperand(1));
> > +    Results.push_back(UDIVREM.getValue(1));
> > +    break;
> > +  }
> > +  case ISD::SDIV: {
> > +    SDValue Op = SDValue(N, 0);
> > +    SDLoc DL(Op);
> > +    EVT VT = Op.getValueType();
> > +    SDValue SDIVREM = DAG.getNode(ISD::SDIVREM, DL, DAG.getVTList(VT, VT),
> > +      N->getOperand(0), N->getOperand(1));
> > +    Results.push_back(SDIVREM);
> > +    break;
> > +  }
> > +  case ISD::SREM: {
> > +    SDValue Op = SDValue(N, 0);
> > +    SDLoc DL(Op);
> > +    EVT VT = Op.getValueType();
> > +    SDValue SDIVREM = DAG.getNode(ISD::SDIVREM, DL, DAG.getVTList(VT, VT),
> > +      N->getOperand(0), N->getOperand(1));
> > +    Results.push_back(SDIVREM.getValue(1));
> > +    break;
> > +  }
> > +  case ISD::SDIVREM: {
> > +    SDValue Op = SDValue(N, 1);
> > +    SDValue RES = LowerSDIVREM(Op, DAG);
> > +    Results.push_back(RES);
> > +    Results.push_back(RES.getValue(1));
> > +    break;
> > +  }
> > +  case ISD::UDIVREM: {
> > +    SDValue Op = SDValue(N, 0);
> > +    SDLoc DL(Op);
> > +    EVT VT = Op.getValueType();
> > +    EVT HalfVT = VT.getHalfSizedIntegerVT(*DAG.getContext());
> > +
> > +    SDValue one = DAG.getConstant(1, HalfVT);
> > +    SDValue zero = DAG.getConstant(0, HalfVT);
> > +
> > +    //HiLo split
> > +    SDValue LHS = N->getOperand(0);
> > +    SDValue LHS_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, LHS, zero);
> > +    SDValue LHS_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, LHS, one);
> > +
> > +    SDValue RHS = N->getOperand(1);
> > +    SDValue RHS_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, RHS, zero);
> > +    SDValue RHS_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, RHS, one);
> > +
> > +    // Get Speculative values
> > +    SDValue DIV_Part = DAG.getNode(ISD::UDIV, DL, HalfVT, LHS_Hi, RHS_Lo);
> > +    SDValue REM_Part = DAG.getNode(ISD::UREM, DL, HalfVT, LHS_Hi, RHS_Lo);
> > +
> > +    SDValue REM_Hi = zero;
> > +    SDValue REM_Lo = DAG.getSelectCC(DL, RHS_Hi, zero, REM_Part, LHS_Hi, ISD::SETEQ);
> > +
> > +    SDValue DIV_Hi = DAG.getSelectCC(DL, RHS_Hi, zero, DIV_Part, zero, ISD::SETEQ);
> > +    SDValue DIV_Lo = zero;
> > +
> > +    const unsigned halfBitWidth = HalfVT.getSizeInBits();
> > +
> > +    for (unsigned i = 0; i < halfBitWidth; ++i) {
> > +      SDValue POS = DAG.getConstant(halfBitWidth - i - 1, HalfVT);
> > +      // Get Value of high bit
> > +      SDValue HBit;
> > +      if (halfBitWidth == 32 && Subtarget->hasBFE()) {
> > +        HBit = DAG.getNode(AMDGPUISD::BFE_U32, DL, HalfVT, LHS_Lo, POS, one);
> > +      } else {
> > +        HBit = DAG.getNode(ISD::SRL, DL, HalfVT, LHS_Lo, POS);
> > +        HBit = DAG.getNode(ISD::AND, DL, HalfVT, HBit, one);
> > +      }
> > +
> > +      SDValue Carry = DAG.getNode(ISD::SRL, DL, HalfVT, REM_Lo,
> > +        DAG.getConstant(halfBitWidth - 1, HalfVT));
> > +      REM_Hi = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Hi, one);
> > +      REM_Hi = DAG.getNode(ISD::OR, DL, HalfVT, REM_Hi, Carry);
> > +
> > +      REM_Lo = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Lo, one);
> > +      REM_Lo = DAG.getNode(ISD::OR, DL, HalfVT, REM_Lo, HBit);
> > +
> > +
> > +      SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > +
> > +      SDValue BIT = DAG.getConstant(1 << (halfBitWidth - i - 1), HalfVT);
> > +      SDValue realBIT = DAG.getSelectCC(DL, REM, RHS, BIT, zero, ISD::SETGE);
> > +
> > +      DIV_Lo = DAG.getNode(ISD::OR, DL, HalfVT, DIV_Lo, realBIT);
> > +
> > +      // Update REM
> > +
> > +      SDValue REM_sub = DAG.getNode(ISD::SUB, DL, VT, REM, RHS);
> > +
> > +      REM = DAG.getSelectCC(DL, REM, RHS, REM_sub, REM, ISD::SETGE);
> > +      REM_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, zero);
> > +      REM_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, one);
> > +    }
> > +
> > +    SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > +    SDValue DIV = DAG.getNode(ISD::BUILD_PAIR, DL, VT, DIV_Lo, DIV_Hi);
> > +    Results.push_back(DIV);
> > +    Results.push_back(REM);
> > +    break;
> > +  }
> > +  }
> > }
> > 
> > SDValue R600TargetLowering::vectorToVerticalVector(SelectionDAG &DAG,
> 
> 
> This seems to not be working for i64 sdiv on SI. The umulo tries to
> expand to an i128 mul libcall. I’m not really sure why this is working
> on R600

That's because i64 UDIVREM does not work on SI. The algorithm in
LowerUDIVREM() needs a few more opcodes (such as multiplication) to be
supported before it works for i64 on SI. The udivrem64 test is kept
separate from the i32 version and is disabled for SI for the same reason.

R600 uses a different algorithm for i64 UDIVREM (the one in
ReplaceNodeResults, moved in this patch) because it can't do 64-bit
reciprocals.
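
To illustrate the R600 path, here is a scalar C++ model of the
shift-and-subtract loop in the quoted UDIVREM case. It is only a sketch:
the name udivrem64_model is made up for this example, it is not LLVM code,
and it assumes a nonzero divisor.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical scalar model of the i64 UDIVREM expansion quoted above.
// Returns {quotient, remainder}. Not LLVM code.
std::pair<uint64_t, uint64_t> udivrem64_model(uint64_t lhs, uint64_t rhs) {
  assert(rhs != 0 && "divisor must be nonzero");

  // Hi/Lo split, as done with EXTRACT_ELEMENT in the DAG code.
  const uint32_t lhs_lo = static_cast<uint32_t>(lhs);
  const uint32_t lhs_hi = static_cast<uint32_t>(lhs >> 32);
  const uint32_t rhs_lo = static_cast<uint32_t>(rhs);
  const uint32_t rhs_hi = static_cast<uint32_t>(rhs >> 32);

  // "Speculative" values: when the divisor fits in 32 bits, the high
  // quotient half is a plain 32-bit division of the high dividend half.
  // Otherwise the high quotient half is 0 and the running remainder
  // starts out as the high dividend half.
  uint32_t div_hi = 0;
  uint64_t rem = lhs_hi;
  if (rhs_hi == 0) {
    div_hi = lhs_hi / rhs_lo;
    rem = lhs_hi % rhs_lo;
  }

  // Shift the low dividend bits in one at a time, most significant first,
  // and subtract the divisor whenever the running remainder reaches it.
  uint32_t div_lo = 0;
  for (int i = 31; i >= 0; --i) {
    rem = (rem << 1) | ((lhs_lo >> i) & 1u);
    if (rem >= rhs) { // unsigned comparison
      rem -= rhs;
      div_lo |= (1u << i);
    }
  }

  const uint64_t quot = (static_cast<uint64_t>(div_hi) << 32) | div_lo;
  return {quot, rem};
}
```

The sketch compares unsigned values, while the quoted code passes
ISD::SETGE, which is a signed comparison.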


I have tried to fix this by custom lowering MULHU (see the attached
patch), but udivrem64 on SI then fails on a 64-bit AND, which is weird
since the i64 AND tests pass.

llc -march=r600 -mcpu=SI < test/CodeGen/R600/udivrem64.ll
Cannot select: 0x1ac4d60: i64 = and 0x1acfcd8, 0x1acfff0
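
For readers unfamiliar with MULHU: it yields the high half of an unsigned
multiply. The sketch below is a generic scalar illustration of building the
i64 case from 32-bit pieces. It is not taken from the attached patch, and
the name mulhu64_model is hypothetical.

```cpp
#include <cstdint>

// Hypothetical scalar model of i64 MULHU: the upper 64 bits of the
// 128-bit product a * b, built from four 32x32 -> 64 partial products.
uint64_t mulhu64_model(uint64_t a, uint64_t b) {
  const uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
  const uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;

  const uint64_t ll = a_lo * b_lo; // contributes to bits 0..63
  const uint64_t lh = a_lo * b_hi; // contributes to bits 32..95
  const uint64_t hl = a_hi * b_lo; // contributes to bits 32..95
  const uint64_t hh = a_hi * b_hi; // contributes to bits 64..127

  // Sum the column covering bits 32..63; its upper half is the carry
  // into bit 64. Three 32-bit terms cannot overflow 64 bits.
  const uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);

  return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```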

If UDIVREM is fixed, all the related operations
(SDIV/SREM/SDIVREM/UDIV/UREM) will work for i64.
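
The reason the signed operations follow from UDIVREM is the usual sign
fix-up around an unsigned divide. The sketch below shows that reduction in
scalar C++. It is an illustration of the general technique, not a copy of
LowerSDIVREM, and sdivrem64_model is a made-up name. Native / and % on
unsigned values stand in for a working UDIVREM.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical scalar model: signed div/rem expressed via unsigned div/rem.
// Returns {quotient, remainder}, truncating toward zero. The divisor must
// be nonzero.
std::pair<int64_t, int64_t> sdivrem64_model(int64_t lhs, int64_t rhs) {
  // All-ones mask for negative inputs, zero otherwise.
  const uint64_t lsign = lhs < 0 ? ~0ull : 0ull;
  const uint64_t rsign = rhs < 0 ? ~0ull : 0ull;
  const uint64_t qsign = lsign ^ rsign; // quotient is negative if signs differ

  // Branch-free absolute value: (x + s) ^ s.
  const uint64_t ul = (static_cast<uint64_t>(lhs) + lsign) ^ lsign;
  const uint64_t ur = (static_cast<uint64_t>(rhs) + rsign) ^ rsign;

  // Stand-in for UDIVREM.
  const uint64_t uq = ul / ur;
  const uint64_t ur_rem = ul % ur;

  // Conditional negate: (x ^ s) - s. The remainder takes the dividend's sign.
  const uint64_t q = (uq ^ qsign) - qsign;
  const uint64_t r = (ur_rem ^ lsign) - lsign;
  return {static_cast<int64_t>(q), static_cast<int64_t>(r)};
}
```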

regards,
jan

-- 
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-SI-Implement-custom-MULHU.patch
Type: text/x-patch
Size: 3654 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140624/b633d99d/attachment.bin>

