<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 5/9/2017 12:19 PM, Tim Shen wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAG4ZjNkUHWCTFHo22G6eRjhJPbnfmLyzGkwV3wkRfd7OrF3mdg@mail.gmail.com">

      <div dir="ltr">Sorry for missing the email. :/<br>

        <br>

        <div class="gmail_quote">

          <div dir="ltr">On Wed, May 3, 2017 at 11:43 AM Friedman, Eli

            via llvm-commits <<a

              href="mailto:llvm-commits@lists.llvm.org"

              moz-do-not-send="true">llvm-commits@lists.llvm.org</a>>

            wrote:<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">On

            5/2/2017 5:07 PM, Tim Shen via llvm-commits wrote:<br>

            > Author: timshen<br>

            > Date: Tue May  2 19:07:02 2017<br>

            > New Revision: 301990<br>

            ><br>

            > URL: <a

              href="http://llvm.org/viewvc/llvm-project?rev=301990&view=rev"

              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-project?rev=301990&view=rev</a><br>

            > Log:<br>

            > [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a)

            * 8)) back to a single instruction<br>

            ><br>

            > Summary:<br>

            > This is the corresponding llvm change to D28037 to

            ensure no performance<br>

            > regression.<br>

            ><br>

            > Reviewers: bogner, kbarton, hfinkel, iteratee, echristo<br>

            ><br>

            > Subscribers: nemanjai, llvm-commits<br>

            ><br>

            > Differential Revision: <a

              href="https://reviews.llvm.org/D28329" rel="noreferrer"

              target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D28329</a><br>

            ><br>

            > Modified:<br>

            >      llvm/trunk/include/llvm/Target/TargetLowering.h<br>

            >     

            llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>

            >      llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h<br>

            >      llvm/trunk/test/CodeGen/PowerPC/shift_mask.ll<br>

            ><br>

            > Modified:

            llvm/trunk/include/llvm/Target/TargetLowering.h<br>

            > URL: <a

href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff"

              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff</a><br>

            >

==============================================================================<br>

            > --- llvm/trunk/include/llvm/Target/TargetLowering.h

            (original)<br>

            > +++ llvm/trunk/include/llvm/Target/TargetLowering.h Tue

            May  2 19:07:02 2017<br>

            > @@ -2061,6 +2061,14 @@ public:<br>

            >       return false;<br>

            >     }<br>

            ><br>

            > +  // Return true if the instruction that performs a

            << b actually performs<br>

            > +  // a << (b % (sizeof(a) * 8)).<br>

            > +  virtual bool supportsModuloShift(ISD::NodeType Inst,

            EVT ReturnType) const {<br>

            > +    assert((Inst == ISD::SHL || Inst == ISD::SRA ||

            Inst == ISD::SRL) &&<br>

            > +           "Expect a shift instruction");<br>

            > +    return false;<br>

            > +  }<br>

            > +<br>

            >   

 //===--------------------------------------------------------------------===//<br>

            >     // Runtime Library hooks<br>

            >     //<br>

            ><br>

            > Modified:

            llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>

            > URL: <a

href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff"

              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff</a><br>

            >

==============================================================================<br>

            > --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

            (original)<br>

            > +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

            Tue May  2 19:07:02 2017<br>

            > @@ -5294,6 +5294,17 @@ SDValue

            DAGCombiner::visitSHL(SDNode *N)<br>

            >       }<br>

            >     }<br>

            ><br>

            > +  // If the target supports masking y in (shl, y),<br>

            > +  // fold (shl x, (and y, ((1 << numbits(x)) -

            1))) -> (shl x, y)<br>

            > +  if (TLI.isOperationLegal(ISD::SHL, VT) &&<br>

            > +      TLI.supportsModuloShift(ISD::SHL, VT) &&

            N1->getOpcode() == ISD::AND) {<br>

            > +    if (ConstantSDNode *Mask =

            isConstOrConstSplat(N1->getOperand(1))) {<br>

            > +      if (Mask->getZExtValue() == OpSizeInBits - 1)

            {<br>

            > +        return DAG.getNode(ISD::SHL, SDLoc(N), VT, N0,

            N1->getOperand(0));<br>

            > +      }<br>

            > +    }<br>

            > +  }<br>

            > +<br>

            >     ConstantSDNode *N1C = isConstOrConstSplat(N1);<br>

            ><br>

            >     // fold (shl c1, c2) -> c1<<c2<br>

            > @@ -5492,6 +5503,17 @@ SDValue

            DAGCombiner::visitSRA(SDNode *N)<br>

            >     EVT VT = N0.getValueType();<br>

            >     unsigned OpSizeInBits = VT.getScalarSizeInBits();<br>

            ><br>

            > +  // If the target supports masking y in (sra, y),<br>

            > +  // fold (sra x, (and y, ((1 << numbits(x)) -

            1))) -> (sra x, y)<br>

            > +  if (TLI.isOperationLegal(ISD::SRA, VT) &&<br>

            > +      TLI.supportsModuloShift(ISD::SRA, VT) &&

            N1->getOpcode() == ISD::AND) {<br>

            > +    if (ConstantSDNode *Mask =

            isConstOrConstSplat(N1->getOperand(1))) {<br>

            > +      if (Mask->getZExtValue() == OpSizeInBits - 1)

            {<br>

            > +        return DAG.getNode(ISD::SRA, SDLoc(N), VT, N0,

            N1->getOperand(0));<br>

            > +      }<br>

            > +    }<br>

            > +  }<br>

            > +<br>

            >     // Arithmetic shifting an all-sign-bit value is a

            no-op.<br>

            >     if (DAG.ComputeNumSignBits(N0) == OpSizeInBits)<br>

            >       return N0;<br>

            > @@ -5650,6 +5672,17 @@ SDValue

            DAGCombiner::visitSRL(SDNode *N)<br>

            >     EVT VT = N0.getValueType();<br>

            >     unsigned OpSizeInBits = VT.getScalarSizeInBits();<br>

            ><br>

            > +  // If the target supports masking y in (srl, y),<br>

            > +  // fold (srl x, (and y, ((1 << numbits(x)) -

            1))) -> (srl x, y)<br>

            > +  if (TLI.isOperationLegal(ISD::SRL, VT) &&<br>

            > +      TLI.supportsModuloShift(ISD::SRL, VT) &&

            N1->getOpcode() == ISD::AND) {<br>

            > +    if (ConstantSDNode *Mask =

            isConstOrConstSplat(N1->getOperand(1))) {<br>

            > +      if (Mask->getZExtValue() == OpSizeInBits - 1)

            {<br>

            > +        return DAG.getNode(ISD::SRL, SDLoc(N), VT, N0,

            N1->getOperand(0));<br>

            > +      }<br>

            > +    }<br>

            > +  }<br>

            <br>

            I see three problems here:<br>

            <br>

            1. It needs to be documented.<br>

          </blockquote>

          <div><br>

          </div>

          <div>I wrote comments on the transformation and the TLI hook.

            What kind of documentation do you expect?</div>

        </div>

      </div>

    </blockquote>

    <br>

    A comment in ISDOpcodes.h.<br>

    <br>

    <blockquote type="cite"

cite="mid:CAG4ZjNkUHWCTFHo22G6eRjhJPbnfmLyzGkwV3wkRfd7OrF3mdg@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            2. This isn't consistent with other transforms done by

            DAGCombine, e.g.<br>

            constant folding won't honor this.<br>

          </blockquote>

          <div><br>

          </div>

          <div>How so? Although it runs before constant folding

            (e.g. (shl c1, c2) -> c1<<c2), constant folding

            will pick it up during the next iteration, won't it?</div>

        </div>

      </div>

    </blockquote>

    <br>

    Suppose you have "SHL(1, AND(y, 31))".  We fold it to "SHL(1, y)"

    because supportsModuloShift is true.  Then we figure out that y is

    32.  Then we constant-fold "SHL(1, 32)" to zero (because that's the

    default behavior for APInt).<br>

    <br>

    And that's only the one I thought of off the top of my head; there

    are probably other transforms which assume the shift amount is in

    range.<br>

    <br>

    <blockquote type="cite"

cite="mid:CAG4ZjNkUHWCTFHo22G6eRjhJPbnfmLyzGkwV3wkRfd7OrF3mdg@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            3. I'm not sure we want to do this; are we actually getting

            any benefit<br>

            here over letting targets pattern-match the pattern?<br>

          </blockquote>

          <div><br>

          </div>

          <div>At least for both PowerPC and ARM, they have the modulo

            behavior:</div>

          <div><br>

          </div>

          <div>For PowerPC see the OpenPower ISA 3.0.</div>

          <div><br>

          </div>

          <div>For ARM, I found this:</div>

          <div><a

href="https://developer.arm.com/docs/den0024/latest/6-the-a64-instruction-set/62-data-processing-instructions/623-shift-operations"

              moz-do-not-send="true">https://developer.arm.com/docs/den0024/latest/6-the-a64-instruction-set/62-data-processing-instructions/623-shift-operations</a></div>

          <div><br>

          </div>

          <div>Ideally someone can turn it on for ARM, but until now I

            didn't know ARM's behavior.</div>

          <div> </div>

        </div>

      </div>

    </blockquote>

    <br>

    That wasn't what I was trying to say.  I mean, what benefit does

    this have that you don't get if you just write a pattern like "def :

    Pat<(shl x:v4i32, (and y:v4i32, (splat 31)), (VSLW x, y)>;".<br>

    <br>

    I'm also concerned that most people will assume ISD::SHL has the

    same semantics as LLVM IR shl for out-of-range shifts, so future

    changes are likely to break this.<br>

    <br>

    -Eli<br>

    <pre class="moz-signature" cols="72">-- 

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>

  </body>

</html>