[PATCH 1/4] R600: Simplify LowerUDIVREM
Jan Vesely
jan.vesely at rutgers.edu
Thu Oct 16 14:23:39 PDT 2014
On Thu, 2014-10-16 at 15:00 -0400, Tom Stellard wrote:
> On Thu, Oct 16, 2014 at 02:41:42PM -0400, Jan Vesely wrote:
> > optimizations can handle removing the Hi part operations.
> > The generated code is identical
>
> Hi Jan,
>
> This is somewhat unrelated to this patch, but I think there are some
> bugs in the lowering for 64-bit division.
>
> See patch #5:
> R600: Factor i64 UDIVREM lowering into its own fuction
>
> From this email:
> http://marc.info/?l=llvm-commits&m=141150582511354&w=2
>
> One of the failing cases is:
>
> 0x1 / 0xffffffffffffffff = 0xffffffff (should be 0x0).
The problem was using signed comparisons.
see the attached patch.
I thought SI was going to use fake urecip int (like cayman, but for
64bits too).
I'll rebase this series on top of the referenced patch (or I can change
the 2nd patch to move the function to AMDGPUISelLowering)
jan
>
> The other patches in that series add a pass to use the
> LLVM IR based division lowering, which works fine on
> at least on SI.
>
> It would be great if we could fix the SDAG lowering version
> since that would potentially produce much better code, otherwise
> I can continue with the LLVM IR base lowering approach.
>
> If you do try to fix this, I think it would be good to apply the
> patch referenced above first to make it easier to share code between
> SI and R600.
>
> -Tom
>
> >
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> > lib/Target/R600/R600ISelLowering.cpp | 30 +++++++++++-------------------
> > 1 file changed, 11 insertions(+), 19 deletions(-)
> >
> > diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
> > index 3b4aed0..90a1f2f 100644
> > --- a/lib/Target/R600/R600ISelLowering.cpp
> > +++ b/lib/Target/R600/R600ISelLowering.cpp
> > @@ -893,8 +893,8 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> > SDValue DIV_Part = DAG.getNode(ISD::UDIV, DL, HalfVT, LHS_Hi, RHS_Lo);
> > SDValue REM_Part = DAG.getNode(ISD::UREM, DL, HalfVT, LHS_Hi, RHS_Lo);
> >
> > - SDValue REM_Hi = zero;
> > SDValue REM_Lo = DAG.getSelectCC(DL, RHS_Hi, zero, REM_Part, LHS_Hi, ISD::SETEQ);
> > + SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, zero);
> >
> > SDValue DIV_Hi = DAG.getSelectCC(DL, RHS_Hi, zero, DIV_Part, zero, ISD::SETEQ);
> > SDValue DIV_Lo = zero;
> > @@ -902,8 +902,10 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> > const unsigned halfBitWidth = HalfVT.getSizeInBits();
> >
> > for (unsigned i = 0; i < halfBitWidth; ++i) {
> > - SDValue POS = DAG.getConstant(halfBitWidth - i - 1, HalfVT);
> > - // Get Value of high bit
> > + const unsigned bitPos = halfBitWidth - i - 1;
> > + SDValue POS = DAG.getConstant(bitPos, HalfVT);
> > + // Get value of high bit
> > + // TODO: Remove the BFE part when the optimization is fixed
> > SDValue HBit;
> > if (halfBitWidth == 32 && Subtarget->hasBFE()) {
> > HBit = DAG.getNode(AMDGPUISD::BFE_U32, DL, HalfVT, LHS_Lo, POS, one);
> > @@ -911,33 +913,23 @@ void R600TargetLowering::ReplaceNodeResults(SDNode *N,
> > HBit = DAG.getNode(ISD::SRL, DL, HalfVT, LHS_Lo, POS);
> > HBit = DAG.getNode(ISD::AND, DL, HalfVT, HBit, one);
> > }
> > + HBit = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, HBit);
> >
> > - SDValue Carry = DAG.getNode(ISD::SRL, DL, HalfVT, REM_Lo,
> > - DAG.getConstant(halfBitWidth - 1, HalfVT));
> > - REM_Hi = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Hi, one);
> > - REM_Hi = DAG.getNode(ISD::OR, DL, HalfVT, REM_Hi, Carry);
> > + // Shift
> > + REM = DAG.getNode(ISD::SHL, DL, VT, REM, DAG.getConstant(1, VT));
> > + // Add LHS high bit
> > + REM = DAG.getNode(ISD::OR, DL, VT, REM, HBit);
> >
> > - REM_Lo = DAG.getNode(ISD::SHL, DL, HalfVT, REM_Lo, one);
> > - REM_Lo = DAG.getNode(ISD::OR, DL, HalfVT, REM_Lo, HBit);
> > -
> > -
> > - SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > -
> > - SDValue BIT = DAG.getConstant(1 << (halfBitWidth - i - 1), HalfVT);
> > + SDValue BIT = DAG.getConstant(1 << bitPos, HalfVT);
> > SDValue realBIT = DAG.getSelectCC(DL, REM, RHS, BIT, zero, ISD::SETGE);
> >
> > DIV_Lo = DAG.getNode(ISD::OR, DL, HalfVT, DIV_Lo, realBIT);
> >
> > // Update REM
> > -
> > SDValue REM_sub = DAG.getNode(ISD::SUB, DL, VT, REM, RHS);
> > -
> > REM = DAG.getSelectCC(DL, REM, RHS, REM_sub, REM, ISD::SETGE);
> > - REM_Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, zero);
> > - REM_Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, HalfVT, REM, one);
> > }
> >
> > - SDValue REM = DAG.getNode(ISD::BUILD_PAIR, DL, VT, REM_Lo, REM_Hi);
> > SDValue DIV = DAG.getNode(ISD::BUILD_PAIR, DL, VT, DIV_Lo, DIV_Hi);
> > Results.push_back(DIV);
> > Results.push_back(REM);
> > --
> > 1.9.3
> >
--
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix.patch
Type: text/x-patch
Size: 960 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141016/6e5ec22e/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141016/6e5ec22e/attachment.sig>
More information about the llvm-commits
mailing list