<div dir="ltr"><div><div><div>Correct me if I'm wrong, is the code trying to expand geometric series into polynomial expression? (that doesn't sound like a very good thing to do, it would spend so much time computing coefficients)<br></div><div><br>Wouldn't this be better to transform as below:<br></div><div>mult *= n++ // 30 times<br></div><div><br></div><div>; into<br></div>mult = mult (n)...(n+29)<br></div> = mult 1...(n+29)/((1)...(n-1)) <br> = mult * geo_sum(n+29) / geo_sum(n-1) ; is this more expensive than leaving as geo series?<br><br></div><div>Regards,<br></div><div>Kevin<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 3 August 2016 at 11:46, Huang, Li1 via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="#0563C1" vlink="#954F72" lang="EN-US">
<div>
<p class="MsoNormal">Hi,<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I’m working on a slow-compile problem caused by SCEV (PR28830), and I need your suggestions on how to fix it. The loop below causes ScalarEvolution::getMulExpr to hang.
<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">int get(unsigned n)<u></u><u></u></p>
<p class="MsoNormal">{<u></u><u></u></p>
<p class="MsoNormal"> unsigned i, j, mult = 1;<u></u><u></u></p>
<p class="MsoNormal"> for (i = 0; i < 1; i++) {<u></u><u></u></p>
<p class="MsoNormal"> for (j = 0; j < 30; j++) {<u></u><u></u></p>
<p class="MsoNormal"> mult *= n++;<u></u><u></u></p>
<p class="MsoNormal"> }<u></u><u></u></p>
<p class="MsoNormal"> }<u></u><u></u></p>
<p class="MsoNormal"> return mult;<u></u><u></u></p>
<p class="MsoNormal">}<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">the inner loop is completed unrolled (by 30) to a long chain of “add” and “mult” instructions, then indvars follows loop-unroll and triggers SCEV creations for this long instruction chain. When it comes to getMulExpr, it tries to fold multiplications
of AddRecs of the same induction variable recursively, this could be very slow if the expression tree is deep.
<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Code responsible for this problem starts here (in ScalarEvolution.cpp):<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"> // Okay, if there weren't any loop invariants to be folded, check to see if<u></u><u></u></p>
<p class="MsoNormal"> // there are multiple AddRec's with the same loop induction variable being<u></u><u></u></p>
<p class="MsoNormal"> // multiplied together. If so, we can fold them.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"> // {A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L><u></u><u></u></p>
<p class="MsoNormal"> // = {x=1 in [ sum y=x..2x [ sum z=max(y-x, y-n)..min(x,n) [<u></u><u></u></p>
<p class="MsoNormal"> // choose(x, 2x)*choose(2x-y, x-z)*A_{y-z}*B_z<u></u><u></u></p>
<p class="MsoNormal"> // ]]],+,...up to x=2n}.<u></u><u></u></p>
<p class="MsoNormal"> // Note that the arguments to choose() are always integers with values<u></u><u></u></p>
<p class="MsoNormal"> // known at compile time, never SCEV objects.<u></u><u></u></p>
<p class="MsoNormal"> //<u></u><u></u></p>
<p class="MsoNormal"> // The implementation avoids pointless extra computations when the two<u></u><u></u></p>
<p class="MsoNormal"> // addrec's are of different length (mathematically, it's equivalent to<u></u><u></u></p>
<p class="MsoNormal"> // an infinite stream of zeros on the right).<u></u><u></u></p>
<p class="MsoNormal"> bool OpsModified = false;<u></u><u></u></p>
<p class="MsoNormal"> for (unsigned OtherIdx = Idx+1;<u></u><u></u></p>
<p class="MsoNormal"> OtherIdx != Ops.size() && isa<SCEVAddRecExpr>(Ops[OtherIdx]);<u></u><u></u></p>
<p class="MsoNormal">...<u></u><u></u></p>
<p class="MsoNormal">...<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">One way I thought about to fix this is to have a threshold for the number of terms in each AddRec when folding them. For example, we only fold AddRec1 * AddRec2 when (#terms in AddRec1 + #terms in AddRec2) < threshold. I tried setting
the threshold to 8, and it stops early for this case. Another way is to limit the depth of this folding, but this might require change of API.
<u></u><u></u></p>
</div>
</div>
<br>_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
<br></blockquote></div><br></div>