[llvm] r301990 - [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction
Friedman, Eli via llvm-commits
llvm-commits at lists.llvm.org
Tue May 9 12:44:57 PDT 2017
On 5/9/2017 12:19 PM, Tim Shen wrote:
> Sorry for missing the email. :/
>
> On Wed, May 3, 2017 at 11:43 AM Friedman, Eli via llvm-commits
> <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>
> On 5/2/2017 5:07 PM, Tim Shen via llvm-commits wrote:
> > Author: timshen
> > Date: Tue May 2 19:07:02 2017
> > New Revision: 301990
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=301990&view=rev
> > Log:
> > [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a
> single instruction
> >
> > Summary:
> > This is the corresponding llvm change to D28037 to ensure no
> performance
> > regression.
> >
> > Reviewers: bogner, kbarton, hfinkel, iteratee, echristo
> >
> > Subscribers: nemanjai, llvm-commits
> >
> > Differential Revision: https://reviews.llvm.org/D28329
> >
> > Modified:
> > llvm/trunk/include/llvm/Target/TargetLowering.h
> > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> > llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
> > llvm/trunk/test/CodeGen/PowerPC/shift_mask.ll
> >
> > Modified: llvm/trunk/include/llvm/Target/TargetLowering.h
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/include/llvm/Target/TargetLowering.h (original)
> > +++ llvm/trunk/include/llvm/Target/TargetLowering.h Tue May 2
> 19:07:02 2017
> > @@ -2061,6 +2061,14 @@ public:
> > return false;
> > }
> >
> > + // Return true if the instruction that performs a << b
> actually performs
> > + // a << (b % (sizeof(a) * 8)).
> > + virtual bool supportsModuloShift(ISD::NodeType Inst, EVT
> ReturnType) const {
> > + assert((Inst == ISD::SHL || Inst == ISD::SRA || Inst ==
> ISD::SRL) &&
> > + "Expect a shift instruction");
> > + return false;
> > + }
> > +
> >
> //===--------------------------------------------------------------------===//
> > // Runtime Library hooks
> > //
> >
> > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> > +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue May
> 2 19:07:02 2017
> > @@ -5294,6 +5294,17 @@ SDValue DAGCombiner::visitSHL(SDNode *N)
> > }
> > }
> >
> > + // If the target supports masking y in (shl, y),
> > + // fold (shl x, (and y, ((1 << numbits(x)) - 1))) -> (shl x, y)
> > + if (TLI.isOperationLegal(ISD::SHL, VT) &&
> > + TLI.supportsModuloShift(ISD::SHL, VT) && N1->getOpcode()
> == ISD::AND) {
> > + if (ConstantSDNode *Mask =
> isConstOrConstSplat(N1->getOperand(1))) {
> > + if (Mask->getZExtValue() == OpSizeInBits - 1) {
> > + return DAG.getNode(ISD::SHL, SDLoc(N), VT, N0,
> N1->getOperand(0));
> > + }
> > + }
> > + }
> > +
> > ConstantSDNode *N1C = isConstOrConstSplat(N1);
> >
> > // fold (shl c1, c2) -> c1<<c2
> > @@ -5492,6 +5503,17 @@ SDValue DAGCombiner::visitSRA(SDNode *N)
> > EVT VT = N0.getValueType();
> > unsigned OpSizeInBits = VT.getScalarSizeInBits();
> >
> > + // If the target supports masking y in (sra, y),
> > + // fold (sra x, (and y, ((1 << numbits(x)) - 1))) -> (sra x, y)
> > + if (TLI.isOperationLegal(ISD::SRA, VT) &&
> > + TLI.supportsModuloShift(ISD::SRA, VT) && N1->getOpcode()
> == ISD::AND) {
> > + if (ConstantSDNode *Mask =
> isConstOrConstSplat(N1->getOperand(1))) {
> > + if (Mask->getZExtValue() == OpSizeInBits - 1) {
> > + return DAG.getNode(ISD::SRA, SDLoc(N), VT, N0,
> N1->getOperand(0));
> > + }
> > + }
> > + }
> > +
> > // Arithmetic shifting an all-sign-bit value is a no-op.
> > if (DAG.ComputeNumSignBits(N0) == OpSizeInBits)
> > return N0;
> > @@ -5650,6 +5672,17 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
> > EVT VT = N0.getValueType();
> > unsigned OpSizeInBits = VT.getScalarSizeInBits();
> >
> > + // If the target supports masking y in (srl, y),
> > + // fold (srl x, (and y, ((1 << numbits(x)) - 1))) -> (srl x, y)
> > + if (TLI.isOperationLegal(ISD::SRL, VT) &&
> > + TLI.supportsModuloShift(ISD::SRL, VT) && N1->getOpcode()
> == ISD::AND) {
> > + if (ConstantSDNode *Mask =
> isConstOrConstSplat(N1->getOperand(1))) {
> > + if (Mask->getZExtValue() == OpSizeInBits - 1) {
> > + return DAG.getNode(ISD::SRL, SDLoc(N), VT, N0,
> N1->getOperand(0));
> > + }
> > + }
> > + }
>
> I see three problems here:
>
> 1. It needs to be documented.
>
>
> I wrote comments on the transformation and the TLI hook. What kind of
> documentation do you expect?
A comment in ISDOpcodes.h.
> 2. This isn't consistent with other transforms done by DAGCombine,
> e.g.
> constant folding won't honor this.
>
>
> How so? Although it runs before constant folding (e.g. (shl c1, c2) ->
> c1<<c2), constant folding will pick it up during the next iteration,
> won't it?
Suppose you have "SHL(1, AND(y, 31))". We fold it to "SHL(1, y)"
because supportsModuloShift is true. Then we figure out that y is 32.
Then we constant-fold "SHL(1, 32)" to zero (because that's the default
behavior for APInt).
And that's only the one I thought of off the top of my head; there are
probably other transforms which assume the shift amount is in range.
> 3. I'm not sure we want to do this; are we actually getting any
> benefit
> here over letting targets pattern-match the pattern?
>
>
> At least for both PowerPC and ARM, they have the modulo behavior:
>
> For PowerPC see the OpenPower ISA 3.0.
>
> For ARM, I found this:
> https://developer.arm.com/docs/den0024/latest/6-the-a64-instruction-set/62-data-processing-instructions/623-shift-operations
>
> Ideally someone can turn it on for ARM, but until now I didn't know
> ARM's behavior.
That wasn't what I was trying to say. I mean, what benefit does this
have that you don't get if you just write a pattern like "def : Pat<(shl
x:v4i32, (and y:v4i32, (splat 31)), (VSLW x, y)>;".
I'm also concerned that most people will assume ISD::SHL has the same
semantics as LLVM IR shl for out-of-range shifts, so future changes are
likely to break this.
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170509/d41459d0/attachment.html>
More information about the llvm-commits
mailing list