<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 5/10/2017 11:13 AM, Tim Shen wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAG4ZjN=BUFJyWfKwtH8bYbymzrcjoxU9LaycUM5Z0EaANVpckw@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">On Tue, May 9, 2017 at 3:43 PM Friedman, Eli
<<a href="mailto:efriedma@codeaurora.org"
moz-do-not-send="true">efriedma@codeaurora.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_-6804385229949104446moz-cite-prefix">On
5/9/2017 2:09 PM, Tim Shen wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">On Tue, May 9, 2017 at 12:45 PM
Friedman, Eli <<a
href="mailto:efriedma@codeaurora.org"
target="_blank" moz-do-not-send="true">efriedma@codeaurora.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div
class="m_-6804385229949104446m_-8130644380894799036moz-cite-prefix">On
5/9/2017 12:19 PM, Tim Shen wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Sorry for missing the email. :/<br>
<br>
<div class="gmail_quote">
<div dir="ltr">On Wed, May 3, 2017 at
11:43 AM Friedman, Eli via llvm-commits
<<a
href="mailto:llvm-commits@lists.llvm.org"
target="_blank" moz-do-not-send="true">llvm-commits@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">On 5/2/2017
5:07 PM, Tim Shen via llvm-commits
wrote:<br>
> Author: timshen<br>
> Date: Tue May 2 19:07:02 2017<br>
> New Revision: 301990<br>
><br>
> URL: <a
href="http://llvm.org/viewvc/llvm-project?rev=301990&view=rev"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://llvm.org/viewvc/llvm-project?rev=301990&view=rev</a><br>
> Log:<br>
> [PowerPC, DAGCombiner] Fold a
<< (b % (sizeof(a) * 8)) back to a
single instruction<br>
><br>
> Summary:<br>
> This is the corresponding llvm
change to D28037 to ensure no
performance<br>
> regression.<br>
><br>
> Reviewers: bogner, kbarton,
hfinkel, iteratee, echristo<br>
><br>
> Subscribers: nemanjai, llvm-commits<br>
><br>
> Differential Revision: <a
href="https://reviews.llvm.org/D28329"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://reviews.llvm.org/D28329</a><br>
><br>
> Modified:<br>
>
llvm/trunk/include/llvm/Target/TargetLowering.h<br>
>
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
>
llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h<br>
>
llvm/trunk/test/CodeGen/PowerPC/shift_mask.ll<br>
><br>
> Modified:
llvm/trunk/include/llvm/Target/TargetLowering.h<br>
> URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=301990&r1=301989&r2=301990&view=diff</a><br>
>
==============================================================================<br>
> ---
llvm/trunk/include/llvm/Target/TargetLowering.h
(original)<br>
> +++
llvm/trunk/include/llvm/Target/TargetLowering.h
Tue May 2 19:07:02 2017<br>
> @@ -2061,6 +2061,14 @@ public:<br>
> return false;<br>
> }<br>
><br>
> + // Return true if the
instruction that performs a << b
actually performs<br>
> + // a << (b % (sizeof(a) *
8)).<br>
> + virtual bool
supportsModuloShift(ISD::NodeType Inst,
EVT ReturnType) const {<br>
> + assert((Inst == ISD::SHL ||
Inst == ISD::SRA || Inst == ISD::SRL)
&&<br>
> + "Expect a shift
instruction");<br>
> + return false;<br>
> + }<br>
> +<br>
>
//===--------------------------------------------------------------------===//<br>
> // Runtime Library hooks<br>
> //<br>
><br>
> Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
> URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=301990&r1=301989&r2=301990&view=diff</a><br>
>
==============================================================================<br>
> ---
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
(original)<br>
> +++
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Tue May 2 19:07:02 2017<br>
> @@ -5294,6 +5294,17 @@ SDValue
DAGCombiner::visitSHL(SDNode *N)<br>
> }<br>
> }<br>
><br>
> + // If the target supports
masking y in (shl, y),<br>
> + // fold (shl x, (and y, ((1
<< numbits(x)) - 1))) -> (shl
x, y)<br>
> + if
(TLI.isOperationLegal(ISD::SHL, VT)
&&<br>
> +
TLI.supportsModuloShift(ISD::SHL, VT)
&& N1->getOpcode() ==
ISD::AND) {<br>
> + if (ConstantSDNode *Mask =
isConstOrConstSplat(N1->getOperand(1)))
{<br>
> + if (Mask->getZExtValue()
== OpSizeInBits - 1) {<br>
> + return
DAG.getNode(ISD::SHL, SDLoc(N), VT, N0,
N1->getOperand(0));<br>
> + }<br>
> + }<br>
> + }<br>
> +<br>
> ConstantSDNode *N1C =
isConstOrConstSplat(N1);<br>
><br>
> // fold (shl c1, c2) ->
c1<<c2<br>
> @@ -5492,6 +5503,17 @@ SDValue
DAGCombiner::visitSRA(SDNode *N)<br>
> EVT VT = N0.getValueType();<br>
> unsigned OpSizeInBits =
VT.getScalarSizeInBits();<br>
><br>
> + // If the target supports
masking y in (sra, y),<br>
> + // fold (sra x, (and y, ((1
<< numbits(x)) - 1))) -> (sra
x, y)<br>
> + if
(TLI.isOperationLegal(ISD::SRA, VT)
&&<br>
> +
TLI.supportsModuloShift(ISD::SRA, VT)
&& N1->getOpcode() ==
ISD::AND) {<br>
> + if (ConstantSDNode *Mask =
isConstOrConstSplat(N1->getOperand(1)))
{<br>
> + if (Mask->getZExtValue()
== OpSizeInBits - 1) {<br>
> + return
DAG.getNode(ISD::SRA, SDLoc(N), VT, N0,
N1->getOperand(0));<br>
> + }<br>
> + }<br>
> + }<br>
> +<br>
> // Arithmetic shifting an
all-sign-bit value is a no-op.<br>
> if (DAG.ComputeNumSignBits(N0)
== OpSizeInBits)<br>
> return N0;<br>
> @@ -5650,6 +5672,17 @@ SDValue
DAGCombiner::visitSRL(SDNode *N)<br>
> EVT VT = N0.getValueType();<br>
> unsigned OpSizeInBits =
VT.getScalarSizeInBits();<br>
><br>
> + // If the target supports
masking y in (srl, y),<br>
> + // fold (srl x, (and y, ((1
<< numbits(x)) - 1))) -> (srl
x, y)<br>
> + if
(TLI.isOperationLegal(ISD::SRL, VT)
&&<br>
> +
TLI.supportsModuloShift(ISD::SRL, VT)
&& N1->getOpcode() ==
ISD::AND) {<br>
> + if (ConstantSDNode *Mask =
isConstOrConstSplat(N1->getOperand(1)))
{<br>
> + if (Mask->getZExtValue()
== OpSizeInBits - 1) {<br>
> + return
DAG.getNode(ISD::SRL, SDLoc(N), VT, N0,
N1->getOperand(0));<br>
> + }<br>
> + }<br>
> + }<br>
<br>
I see three problems here:<br>
<br>
1. It needs to be documented.<br>
</blockquote>
<div><br>
</div>
<div>I wrote comments on the
transformation and the TLI hook. What
kind of documentation do you expect?</div>
</div>
</div>
</blockquote>
<br>
</div>
<div text="#000000" bgcolor="#FFFFFF"> A comment
in ISDOpcodes.h.</div>
<div text="#000000" bgcolor="#FFFFFF"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex"> 2. This
isn't consistent with other transforms
done by DAGCombine, e.g.<br>
constant folding won't honor this.<br>
</blockquote>
<div><br>
</div>
<div>How so? Although it runs before
constant folding (e.g. (shl c1, c2)
-> c1<<c2), constant folding
will pick it up during the next
iteration, won't it?</div>
</div>
</div>
</blockquote>
<br>
</div>
<div text="#000000" bgcolor="#FFFFFF"> Suppose you
have "SHL(1, AND(y, 31))". We fold it to
"SHL(1, y)" because supportsModuloShift is
true. Then we figure out that y is 32. Then we
constant-fold "SHL(1, 32)" to zero (because
that's the default behavior for APInt).<br>
<br>
And that's only the one I thought of off the top
of my head; there are probably other transforms
which assume the shift amount is in range.</div>
</blockquote>
<div><br>
</div>
<div>Good point. I think we should write patterns in
.td files as you suggested.</div>
<div><br>
</div>
<div>The only problem I got is that, I don't know
how to write a working pattern for the following
DAG:</div>
<div><br>
</div>
<div>; %rem = and <16 x i8> %b, <i8 7, i8
7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7,
i8 7, i8 7, i8 7, i8 7, i8 7, i8 7></div>
<div>; %shl = shl <16 x i8> %a, %rem</div>
<div>
<div>Initial selection DAG: BB#0 'test010:'</div>
<div>SelectionDAG has 12 nodes:</div>
<div> t0: ch = EntryToken</div>
<div> t2: v16i8,ch = CopyFromReg t0,
Register:v16i8 %vreg0</div>
<div> t4: v16i8,ch = CopyFromReg t0,
Register:v16i8 %vreg1</div>
<div> t6: v16i8 = BUILD_VECTOR
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7>,
Constant:i8<7>, Constant:i8<7></div>
<div> t7: v16i8 = and t4, t6</div>
<div> t8: v16i8 = shl t2, t7</div>
<div> t10: ch,glue = CopyToReg t0, Register:v16i8
%V2, t8</div>
<div> t11: ch = PPCISD::RET_FLAG t10,
Register:v16i8 %V2, t10:1</div>
</div>
<div><br>
</div>
<div>I tried the patterns:</div>
<div>
<div>
<div>def : Pat<(v16i8 (shl v16i8:$vA,</div>
<div> (and v16i8:$vB,</div>
<div> (v16i8 (build_vector
(i8 7), (i8 7), (i8 7), (i8 7),</div>
<div>
(i8 7), (i8 7), (i8 7), (i8 7),</div>
<div>
(i8 7), (i8 7), (i8 7), (i8 7),</div>
<div>
(i8 7), (i8 7), (i8 7), (i8 7)))))),</div>
<div> (v16i8 (VSLB $vA, $vB))>;</div>
</div>
</div>
<div><br>
</div>
<div>and</div>
<div>
<div>def : Pat<(v16i8 (shl v16i8:$vA, (and
v16i8:$vB, (v16i8 7)))),</div>
<div> (v16i8 (VSLB $vA, $vB))>;</div>
</div>
<div><br>
</div>
<div>but they don't match the code above.</div>
<div><br>
</div>
<div>Do you see anywhere wrong in my pattern?</div>
</div>
</div>
</blockquote>
<br>
</div>
<div text="#000000" bgcolor="#FFFFFF"> Are you sure you're
trying to match the right pattern? Try looking at the
"Optimized legalized selection DAG". (It looks like
legalization inserts some bitcasts.)</div>
</blockquote>
<div><br>
</div>
<div>I prefer not to pattern match legalized DAG, since as you
noticed, there are bitcasts. Ideally we only want to match
the high level representation.</div>
<div><br>
</div>
<div>That said, I tried matching legalized v4i32 and v8i16,
they all work. However, for v2i64 in particular, PPC
legalization inserts a load from constant pool, which is
very unpleasant to match:</div>
<div><br>
</div>
<div>
<div>Legalized selection DAG: BB#0 'test010:'</div>
<div>SelectionDAG has 19 nodes:</div>
<div> t0: ch = EntryToken</div>
<div> t2: v2i64,ch = CopyFromReg t0, Register:v2i64
%vreg0</div>
<div> t4: v2i64,ch = CopyFromReg t0,
Register:v2i64 %vreg1</div>
<div> t12: v4i32 = bitcast t4</div>
<div> t23: i64,ch =
PPCISD::TOC_ENTRY<LD8[GOT]>
TargetConstantPool:i64<<2 x i64> <i64 63, i64
63>> 0, Register:i64 %X2</div>
<div> t19: v2f64,ch =
load<LD16[ConstantPool]> t0, t23, undef:i64</div>
<div> t20: v2i64 = bitcast t19</div>
<div> t13: v4i32 = bitcast t20</div>
<div> t14: v4i32 = and t12, t13</div>
<div> t15: v2i64 = bitcast t14</div>
<div> t8: v2i64 = shl t2, t15</div>
<div> t10: ch,glue = CopyToReg t0, Register:v2i64 %V2, t8</div>
<div> t11: ch = PPCISD::RET_FLAG t10, Register:v2i64 %V2,
t10:1</div>
</div>
<div><br>
</div>
<div>Is it possible to write a pattern for a pre-legalization
DAG? If not, I think we can still do it in the combiner, but
instead of combining `(shl a, (and b, 31))` to `(shl a, b)`,
we can combine it to `(PPCISD::VEC_SHL a, b)`. What do you
think?</div>
<div> </div>
</div>
</div>
</blockquote>
<br>
Patterns only apply after legalization. You can write a
target-specific DAGCombine, like you've outlined.<br>
<br>
-Eli<br>
<pre class="moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>