Fix LowerSDIV24

Wed Jul 9 23:15:16 PDT 2014

On Jun 26, 2014, at 3:43 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:

> On Wed, 2014-06-25 at 21:48 -0700, Matt Arsenault wrote:
>> Hi,
>> 
>> These reenable LowerSDIV24 to optimize division when it’s know
>> converting to float won’t lose precision + piglit test cases that
>> should break for values where this isn’t OK
> 
> Have you considered applying this to SDVIREM?
> getting Rem part is easy, we only need MULADD (before or after
> conversion to INT).
> 

I think that would be pointless. Expanded SREM with a custom SDIV will expand to be the same X - Y * (X / Y), and then the division will be handled this way. I’ll add a test that shows this ends up getting used with srem

> I tried moving the call to LowerSDIVREM (and add mull + add to get rem)
> to see how it compares against expanding SREM to use SDIV.
> 
> the code for 24 bit ops was the same for SDIVREM and expanded SREM
> sdiv32 code was also the same.
> SDIVREM approach did better in rem32 op:
> -5 inst on SI, -4 inst (2 IGs) on EG
> 
> 
> 
> I could not get any of my clients to inline the patches so here are few
> snippets. Otherwise the patches look good to me.
> 
> SDValue AMDGPUTargetLowering::LowerSDIV(SDValue Op, SelectionDAG &DAG)
> const {
>   EVT OVT = Op.getValueType().getScalarType();
> 
> -  if (OVT == MVT::i64)
> -    return LowerSDIV64(Op, DAG);
> +  if (OVT == MVT::i32) {
> +    if (DAG.ComputeNumSignBits(Op.getOperand(0)) > 8 &&
> +        DAG.ComputeNumSignBits(Op.getOperand(1)) > 8) {
> +      // TODO: We technically could do this for i64, but shouldn't that
> just be
> +      // handled by something generally reducing 64-bit division on
> 32-bit
> +      // values to 32-bit?
> 
> I think it's LowerSDIV that should perform that kind of reduction, and
> we'd see i64 type here. We can also call this from R600
> ReplaceNodeResults so the optimizer does not have to go through the
> iterative algorithm.
> 
> +      return LowerSDIV24(Op, DAG);
> +    }
> 
> -  if (OVT.getScalarType() == MVT::i32)
>     return LowerSDIV32(Op, DAG);
> -
> -  if (OVT == MVT::i16 || OVT == MVT::i8) {
> -    // FIXME: We should be checking for the masked bits. This isn't
> reached
> -    // because i8 and i16 are not legal types.
> -    return LowerSDIV24(Op, DAG);
>   }
> 
> -  return SDValue(Op.getNode(), 0);
> +  assert(OVT == MVT::i64);
> +  return LowerSDIV64(Op, DAG);
> 
> I think instead of calling LowerSDIV32/64 you can just use the SDIV code
> from R600 ReplaceNodeResults here, and add a call to this function
> there. Then we can remove LowerSDIV32/64 entirely
> 
> }
> 
> jan
> 
> -- 
> Jan Vesely <jan.vesely at rutgers.edu>