[llvm] r301990 - [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction

Tue May 9 12:44:57 PDT 2017

On 5/9/2017 12:19 PM, Tim Shen wrote:
> Sorry for missing the email. :/
>
> On Wed, May 3, 2017 at 11:43 AM Friedman, Eli via llvm-commits 
> <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>
>     On 5/2/2017 5:07 PM, Tim Shen via llvm-commits wrote:
>     > Author: timshen
>     > Date: Tue May  2 19:07:02 2017
>     > New Revision: 301990
>     >
>     > URL: http://llvm.org/viewvc/llvm-project?rev=301990&view=rev
>     > Log:
>     > [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a
>     single instruction
>     >
>     > Summary:
>     > This is the corresponding llvm change to D28037 to ensure no
>     performance
>     > regression.
>     >
>     > Reviewers: bogner, kbarton, hfinkel, iteratee, echristo
>     >
>     > Subscribers: nemanjai, llvm-commits
>     >
>     > Differential Revision: https://reviews.llvm.org/D28329
>     >
>     > Modified:
>     >      llvm/trunk/include/llvm/Target/TargetLowering.h
>     > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>     >      llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>     >      llvm/trunk/test/CodeGen/PowerPC/shift_mask.ll
>     >
>     > Modified: llvm/trunk/include/llvm/Target/TargetLowering.h
>     > URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff
>     >
>     ==============================================================================
>     > --- llvm/trunk/include/llvm/Target/TargetLowering.h (original)
>     > +++ llvm/trunk/include/llvm/Target/TargetLowering.h Tue May  2
>     19:07:02 2017
>     > @@ -2061,6 +2061,14 @@ public:
>     >       return false;
>     >     }
>     >
>     > +  // Return true if the instruction that performs a << b
>     actually performs
>     > +  // a << (b % (sizeof(a) * 8)).
>     > +  virtual bool supportsModuloShift(ISD::NodeType Inst, EVT
>     ReturnType) const {
>     > +    assert((Inst == ISD::SHL || Inst == ISD::SRA || Inst ==
>     ISD::SRL) &&
>     > +           "Expect a shift instruction");
>     > +    return false;
>     > +  }
>     > +
>     >
>      //===--------------------------------------------------------------------===//
>     >     // Runtime Library hooks
>     >     //
>     >
>     > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>     > URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff
>     >
>     ==============================================================================
>     > --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
>     > +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue May 
>     2 19:07:02 2017
>     > @@ -5294,6 +5294,17 @@ SDValue DAGCombiner::visitSHL(SDNode *N)
>     >       }
>     >     }
>     >
>     > +  // If the target supports masking y in (shl, y),
>     > +  // fold (shl x, (and y, ((1 << numbits(x)) - 1))) -> (shl x, y)
>     > +  if (TLI.isOperationLegal(ISD::SHL, VT) &&
>     > +      TLI.supportsModuloShift(ISD::SHL, VT) && N1->getOpcode()
>     == ISD::AND) {
>     > +    if (ConstantSDNode *Mask =
>     isConstOrConstSplat(N1->getOperand(1))) {
>     > +      if (Mask->getZExtValue() == OpSizeInBits - 1) {
>     > +        return DAG.getNode(ISD::SHL, SDLoc(N), VT, N0,
>     N1->getOperand(0));
>     > +      }
>     > +    }
>     > +  }
>     > +
>     >     ConstantSDNode *N1C = isConstOrConstSplat(N1);
>     >
>     >     // fold (shl c1, c2) -> c1<<c2
>     > @@ -5492,6 +5503,17 @@ SDValue DAGCombiner::visitSRA(SDNode *N)
>     >     EVT VT = N0.getValueType();
>     >     unsigned OpSizeInBits = VT.getScalarSizeInBits();
>     >
>     > +  // If the target supports masking y in (sra, y),
>     > +  // fold (sra x, (and y, ((1 << numbits(x)) - 1))) -> (sra x, y)
>     > +  if (TLI.isOperationLegal(ISD::SRA, VT) &&
>     > +      TLI.supportsModuloShift(ISD::SRA, VT) && N1->getOpcode()
>     == ISD::AND) {
>     > +    if (ConstantSDNode *Mask =
>     isConstOrConstSplat(N1->getOperand(1))) {
>     > +      if (Mask->getZExtValue() == OpSizeInBits - 1) {
>     > +        return DAG.getNode(ISD::SRA, SDLoc(N), VT, N0,
>     N1->getOperand(0));
>     > +      }
>     > +    }
>     > +  }
>     > +
>     >     // Arithmetic shifting an all-sign-bit value is a no-op.
>     >     if (DAG.ComputeNumSignBits(N0) == OpSizeInBits)
>     >       return N0;
>     > @@ -5650,6 +5672,17 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
>     >     EVT VT = N0.getValueType();
>     >     unsigned OpSizeInBits = VT.getScalarSizeInBits();
>     >
>     > +  // If the target supports masking y in (srl, y),
>     > +  // fold (srl x, (and y, ((1 << numbits(x)) - 1))) -> (srl x, y)
>     > +  if (TLI.isOperationLegal(ISD::SRL, VT) &&
>     > +      TLI.supportsModuloShift(ISD::SRL, VT) && N1->getOpcode()
>     == ISD::AND) {
>     > +    if (ConstantSDNode *Mask =
>     isConstOrConstSplat(N1->getOperand(1))) {
>     > +      if (Mask->getZExtValue() == OpSizeInBits - 1) {
>     > +        return DAG.getNode(ISD::SRL, SDLoc(N), VT, N0,
>     N1->getOperand(0));
>     > +      }
>     > +    }
>     > +  }
>
>     I see three problems here:
>
>     1. It needs to be documented.
>
>
> I wrote comments on the transformation and the TLI hook. What kind of 
> documentation do you expect?

A comment in ISDOpcodes.h.

>     2. This isn't consistent with other transforms done by DAGCombine,
>     e.g.
>     constant folding won't honor this.
>
>
> How so? Although it runs before constant folding (e.g. (shl c1, c2) -> 
> c1<<c2), constant folding will pick it up during the next iteration, 
> won't it?

Suppose you have "SHL(1, AND(y, 31))".  We fold it to "SHL(1, y)" 
because supportsModuloShift is true.  Then we figure out that y is 32.  
Then we constant-fold "SHL(1, 32)" to zero (because that's the default 
behavior for APInt).

And that's only the one I thought of off the top of my head; there are 
probably other transforms which assume the shift amount is in range.

>     3. I'm not sure we want to do this; are we actually getting any
>     benefit
>     here over letting targets pattern-match the pattern?
>
>
> At least for both PowerPC and ARM, they have the modulo behavior:
>
> For PowerPC see the OpenPower ISA 3.0.
>
> For ARM, I found this:
> https://developer.arm.com/docs/den0024/latest/6-the-a64-instruction-set/62-data-processing-instructions/623-shift-operations
>
> Ideally someone can turn it on for ARM, but until now I didn't know 
> ARM's behavior.

That wasn't what I was trying to say.  I mean, what benefit does this 
have that you don't get if you just write a pattern like "def : Pat<(shl 
x:v4i32, (and y:v4i32, (splat 31)), (VSLW x, y)>;".

I'm also concerned that most people will assume ISD::SHL has the same 
semantics as LLVM IR shl for out-of-range shifts, so future changes are 
likely to break this.

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170509/d41459d0/attachment.html>