Fix LowerSDIV24
Matt Arsenault
arsenm2 at gmail.com
Wed Jul 9 23:15:16 PDT 2014
On Jun 26, 2014, at 3:43 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> On Wed, 2014-06-25 at 21:48 -0700, Matt Arsenault wrote:
>> Hi,
>>
>> These reenable LowerSDIV24 to optimize division when it’s know
>> converting to float won’t lose precision + piglit test cases that
>> should break for values where this isn’t OK
>
> Have you considered applying this to SDVIREM?
> getting Rem part is easy, we only need MULADD (before or after
> conversion to INT).
>
I think that would be pointless. Expanded SREM with a custom SDIV will expand to be the same X - Y * (X / Y), and then the division will be handled this way. I’ll add a test that shows this ends up getting used with srem
> I tried moving the call to LowerSDIVREM (and add mull + add to get rem)
> to see how it compares against expanding SREM to use SDIV.
>
> the code for 24 bit ops was the same for SDIVREM and expanded SREM
> sdiv32 code was also the same.
> SDIVREM approach did better in rem32 op:
> -5 inst on SI, -4 inst (2 IGs) on EG
>
>
>
> I could not get any of my clients to inline the patches so here are few
> snippets. Otherwise the patches look good to me.
>
> SDValue AMDGPUTargetLowering::LowerSDIV(SDValue Op, SelectionDAG &DAG)
> const {
> EVT OVT = Op.getValueType().getScalarType();
>
> - if (OVT == MVT::i64)
> - return LowerSDIV64(Op, DAG);
> + if (OVT == MVT::i32) {
> + if (DAG.ComputeNumSignBits(Op.getOperand(0)) > 8 &&
> + DAG.ComputeNumSignBits(Op.getOperand(1)) > 8) {
> + // TODO: We technically could do this for i64, but shouldn't that
> just be
> + // handled by something generally reducing 64-bit division on
> 32-bit
> + // values to 32-bit?
>
> I think it's LowerSDIV that should perform that kind of reduction, and
> we'd see i64 type here. We can also call this from R600
> ReplaceNodeResults so the optimizer does not have to go through the
> iterative algorithm.
>
> + return LowerSDIV24(Op, DAG);
> + }
>
> - if (OVT.getScalarType() == MVT::i32)
> return LowerSDIV32(Op, DAG);
> -
> - if (OVT == MVT::i16 || OVT == MVT::i8) {
> - // FIXME: We should be checking for the masked bits. This isn't
> reached
> - // because i8 and i16 are not legal types.
> - return LowerSDIV24(Op, DAG);
> }
>
> - return SDValue(Op.getNode(), 0);
> + assert(OVT == MVT::i64);
> + return LowerSDIV64(Op, DAG);
>
> I think instead of calling LowerSDIV32/64 you can just use the SDIV code
> from R600 ReplaceNodeResults here, and add a call to this function
> there. Then we can remove LowerSDIV32/64 entirely
>
> }
>
> jan
>
> --
> Jan Vesely <jan.vesely at rutgers.edu>
More information about the llvm-commits
mailing list