[llvm-dev] [RFC] Improving integer divide optimization (related to D12082)

Wed Aug 19 15:48:40 PDT 2015

> On Aug 19, 2015, at 1:45 PM, Steve King via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> In the targets I know, shifts are
> cheaper than divides in both speed and size.

From what I remember, udiv by power of 2 already gets turned into a shift in instcombine; the tricky case is sdiv by power of 2, which takes significantly more than one instruction. The generic implementation is this:

    // Splat the sign bit into the register
    SDValue SGN =
        DAG.getNode(ISD::SRA, DL, VT, N0,
                    DAG.getConstant(VT.getScalarSizeInBits() - 1, DL,
                                    getShiftAmountTy(N0.getValueType())));
    AddToWorklist(SGN.getNode());

    // Add (N0 < 0) ? abs2 - 1 : 0;
    SDValue SRL =
        DAG.getNode(ISD::SRL, DL, VT, SGN,
                    DAG.getConstant(VT.getScalarSizeInBits() - lg2, DL,
                                    getShiftAmountTy(SGN.getValueType())));
    SDValue ADD = DAG.getNode(ISD::ADD, DL, VT, N0, SRL);
    AddToWorklist(SRL.getNode());
    AddToWorklist(ADD.getNode());    // Divide by pow2
    SDValue SRA = DAG.getNode(ISD::SRA, DL, VT, ADD,
                  DAG.getConstant(lg2, DL,
                                  getShiftAmountTy(ADD.getValueType())));

And there’s already a target-specific hook to add a custom implementation (e.g. on PowerPC):

    // Target-specific implementation of sdiv x, pow2.
    if (SDValue Res = BuildSDIVPow2(N))
      return Res;

-— escha
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150819/8e447e83/attachment.html>