Fix LowerSDIV24
Jan Vesely
jan.vesely at rutgers.edu
Thu Jun 26 15:43:38 PDT 2014
On Wed, 2014-06-25 at 21:48 -0700, Matt Arsenault wrote:
> Hi,
>
> These reenable LowerSDIV24 to optimize division when it’s know
> converting to float won’t lose precision + piglit test cases that
> should break for values where this isn’t OK
Have you considered applying this to SDVIREM?
getting Rem part is easy, we only need MULADD (before or after
conversion to INT).
I tried moving the call to LowerSDIVREM (and add mull + add to get rem)
to see how it compares against expanding SREM to use SDIV.
the code for 24 bit ops was the same for SDIVREM and expanded SREM
sdiv32 code was also the same.
SDIVREM approach did better in rem32 op:
-5 inst on SI, -4 inst (2 IGs) on EG
I could not get any of my clients to inline the patches so here are few
snippets. Otherwise the patches look good to me.
SDValue AMDGPUTargetLowering::LowerSDIV(SDValue Op, SelectionDAG &DAG)
const {
EVT OVT = Op.getValueType().getScalarType();
- if (OVT == MVT::i64)
- return LowerSDIV64(Op, DAG);
+ if (OVT == MVT::i32) {
+ if (DAG.ComputeNumSignBits(Op.getOperand(0)) > 8 &&
+ DAG.ComputeNumSignBits(Op.getOperand(1)) > 8) {
+ // TODO: We technically could do this for i64, but shouldn't that
just be
+ // handled by something generally reducing 64-bit division on
32-bit
+ // values to 32-bit?
I think it's LowerSDIV that should perform that kind of reduction, and
we'd see i64 type here. We can also call this from R600
ReplaceNodeResults so the optimizer does not have to go through the
iterative algorithm.
+ return LowerSDIV24(Op, DAG);
+ }
- if (OVT.getScalarType() == MVT::i32)
return LowerSDIV32(Op, DAG);
-
- if (OVT == MVT::i16 || OVT == MVT::i8) {
- // FIXME: We should be checking for the masked bits. This isn't
reached
- // because i8 and i16 are not legal types.
- return LowerSDIV24(Op, DAG);
}
- return SDValue(Op.getNode(), 0);
+ assert(OVT == MVT::i64);
+ return LowerSDIV64(Op, DAG);
I think instead of calling LowerSDIV32/64 you can just use the SDIV code
from R600 ReplaceNodeResults here, and add a call to this function
there. Then we can remove LowerSDIV32/64 entirely
}
jan
--
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140626/2f2ba38e/attachment.sig>
More information about the llvm-commits
mailing list