[PATCH 1/4] R600: Simplify LowerUDIVREM

Thu Oct 16 14:23:39 PDT 2014

On Thu, 2014-10-16 at 15:00 -0400, Tom Stellard wrote:
> On Thu, Oct 16, 2014 at 02:41:42PM -0400, Jan Vesely wrote:
> > optimizations can handle removing the Hi part operations.
> > The generated code is identical
> 
> Hi Jan,
> 
> This is somewhat unrelated to this patch, but I think there are some
> bugs in the lowering for 64-bit division.
> 
> See patch #5:
> R600: Factor i64 UDIVREM lowering into its own fuction
> 
> From this email:
> http://marc.info/?l=llvm-commits&m=141150582511354&w=2
> 
> One of the failing cases is:
> 
> 0x1 / 0xffffffffffffffff = 0xffffffff (should be 0x0).

The problem was using signed comparisons.
see the attached patch.

I thought SI was going to use fake urecip int (like cayman, but for
64bits too).
I'll rebase this series on top of the referenced patch (or I can change
the 2nd patch to move the function to AMDGPUISelLowering)

jan

> 
> The other patches in that series add a pass to use the
> LLVM IR based division lowering, which works fine on
> at least on SI.
> 
> It would be great if we could fix the SDAG lowering version
> since that would potentially produce much better code, otherwise
> I can continue with the LLVM IR base lowering approach.
> 
> If you do try to fix this, I think it would be good to apply the
> patch referenced above first to make it easier to share code between
> SI and R600.
> 
> -Tom
> 
> > 
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> >  lib/Target/R600/R600ISelLowering.cpp | 30 +++++++++++-------------------
> >  1 file changed, 11 insertions(+), 19 deletions(-)
> > 
> > diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
> > index 3b4aed0..90a1f2f 100644
> > --- a/lib/Target/R600/R600ISelLowering.cpp
> > +++ b/lib/Target/R600/R600ISelLowering.cpp
> > @@ -893,8 +893,8 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> >      SDValue DIV_Part = DAG.getNode(ISD::UDIV, DL, HalfVT, LHS_Hi, RHS_Lo);
> >      SDValue REM_Part = DAG.getNode(ISD::UREM, DL, HalfVT, LHS_Hi, RHS_Lo);
> >  
> > -    SDValue REM_Hi = zero;
> >      SDValue REM_Lo = DAG.getSelectCC(DL, RHS_Hi, zero, REM_Part, LHS_Hi, ISD::SETEQ);
> > +    SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, zero);
> >  
> >      SDValue DIV_Hi = DAG.getSelectCC(DL, RHS_Hi, zero, DIV_Part, zero, ISD::SETEQ);
> >      SDValue DIV_Lo = zero;
> > @@ -902,8 +902,10 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> >      const unsigned halfBitWidth = HalfVT.getSizeInBits();
> >  
> >      for (unsigned i = 0; i < halfBitWidth; ++i) {
> > -      SDValue POS = DAG.getConstant(halfBitWidth - i - 1, HalfVT);
> > -      // Get Value of high bit
> > +      const unsigned bitPos = halfBitWidth - i - 1;
> > +      SDValue POS = DAG.getConstant(bitPos, HalfVT);
> > +      // Get value of high bit
> > +      // TODO: Remove the BFE part when the optimization is fixed
> >        SDValue HBit;
> >        if (halfBitWidth == 32 && Subtarget->hasBFE()) {
> >          HBit = DAG.getNode(AMDGPUISD::BFE_U32, DL, HalfVT, LHS_Lo, POS, one);
> > @@ -911,33 +913,23 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> >          HBit = DAG.getNode(ISD::SRL, DL, HalfVT, LHS_Lo, POS);
> >          HBit = DAG.getNode(ISD::AND, DL, HalfVT, HBit, one);
> >        }
> > +      HBit = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, HBit);
> >  
> > -      SDValue Carry = DAG.getNode(ISD::SRL, DL, HalfVT, REM_Lo,
> > -        DAG.getConstant(halfBitWidth - 1, HalfVT));
> > -      REM_Hi = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Hi, one);
> > -      REM_Hi = DAG.getNode(ISD::OR, DL, HalfVT, REM_Hi, Carry);
> > +      // Shift
> > +      REM = DAG.getNode(ISD::SHL, DL, VT, REM, DAG.getConstant(1, VT));
> > +      // Add LHS high bit
> > +      REM = DAG.getNode(ISD::OR, DL, VT, REM, HBit);
> >  
> > -      REM_Lo = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Lo, one);
> > -      REM_Lo = DAG.getNode(ISD::OR, DL, HalfVT, REM_Lo, HBit);
> > -
> > -
> > -      SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > -
> > -      SDValue BIT = DAG.getConstant(1 << (halfBitWidth - i - 1), HalfVT);
> > +      SDValue BIT = DAG.getConstant(1 << bitPos, HalfVT);
> >        SDValue realBIT = DAG.getSelectCC(DL, REM, RHS, BIT, zero, ISD::SETGE);
> >  
> >        DIV_Lo = DAG.getNode(ISD::OR, DL, HalfVT, DIV_Lo, realBIT);
> >  
> >        // Update REM
> > -
> >        SDValue REM_sub = DAG.getNode(ISD::SUB, DL, VT, REM, RHS);
> > -
> >        REM = DAG.getSelectCC(DL, REM, RHS, REM_sub, REM, ISD::SETGE);
> > -      REM_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, zero);
> > -      REM_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, one);
> >      }
> >  
> > -    SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> >      SDValue DIV = DAG.getNode(ISD::BUILD_PAIR, DL, VT, DIV_Lo, DIV_Hi);
> >      Results.push_back(DIV);
> >      Results.push_back(REM);
> > -- 
> > 1.9.3
> > 

-- 
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix.patch
Type: text/x-patch
Size: 960 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141016/6e5ec22e/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141016/6e5ec22e/attachment.sig>