<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 13, 2016 at 1:57 PM, Charith Mendis via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Oh I see, the original loop may end normally, but by unrolling it may induce an infinite loop.<div><div class="gmail-h5"><br></div></div></div></blockquote><div><br>No. The problem is that the step of the original loop is not 1. <br><br> for(unsigned i=0; i<count; i+=4){<br> a[i] = b[i]*c[i-1];<br> }<br><br>Let's assume unsigned is a 32-bit integer. Then the maximum unsigned value is 2^32 - 1. Let count = 2^32 - 1. As the loop iterates, at some point we will have i = 2^32 - 4. What is the value of i in the next iteration? 2^32 cannot be represented in a 32-bit integer, so i wraps around and in the next iteration you will have i = 0. So your original loop may be infinite. You can confirm this by compiling and running the following program, which never terminates and prints the message every time i wraps back to 0:<br><br>#include <climits><br>#include <iostream><br><br>using namespace std;<br><br>int main() {<br><br> for (unsigned i = 0; i < UINT_MAX; i += 4) {<br> if (i == 0)<br> cout << "i == 0!" 
<< endl;<br> }<br><br> return 0;<br>}<br><br><br><br><br><br><br><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div class="gmail-h5"><br>On Thursday, October 13, 2016, Alexandre Isoard <<a href="mailto:alexandre.isoard@gmail.com" target="_blank">alexandre.isoard@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4, I think.<br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev <span dir="ltr"><<a>llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">So, I tried unrolling the following simple loop.<div><br></div><div><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo;color:rgb(187,44,162)"><span>int</span><span style="color:rgb(0,0,0)"> unroll(</span><span>unsigned</span><span style="color:rgb(0,0,0)"> * a, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> * b, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> *c, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> count){</span></p>
<p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> </span><span style="color:rgb(187,44,162)">for</span><span>(</span><span style="color:rgb(187,44,162)">unsigned</span><span> i=</span><span style="color:rgb(39,42,216)">0</span><span>; i<count; i++</span><span>){</span></p>
<p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> a[i] = b[i]*c[i-</span><span style="color:rgb(39,42,216)">1</span><span>];</span></p>
<p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> }</span></p>
<p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo;color:rgb(187,44,162)"><span style="color:rgb(0,0,0)"> </span><span>return</span><span style="color:rgb(0,0,0)"> </span><span style="color:rgb(39,42,216)">0</span><span style="color:rgb(0,0,0)">;</span></p>
<p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>}</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span><br></span></p><p style="margin:0px;line-height:normal">Then, the unroller is able to unroll it by 2 even though it doesn't know about the range of count. SCEV of backedge taken count is (-1 + %count)</p><p style="margin:0px;line-height:normal"><br></p><p style="margin:0px;line-height:normal">But, when I change the increment to 4, as in </p><p style="margin:0px;line-height:normal"><br></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo;color:rgb(187,44,162)"><span>int</span><span style="color:rgb(0,0,0)"> unroll(</span><span>unsigned</span><span style="color:rgb(0,0,0)"> * a, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> * b, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> *c, </span><span>unsigned</span><span style="color:rgb(0,0,0)"> count<wbr>){</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> </span><span style="color:rgb(187,44,162)">for</span><span>(</span><span style="color:rgb(187,44,162)">unsigned</span><span> i=</span><span style="color:rgb(39,42,216)">0</span><span>; i<count; <span style="background-color:rgb(255,0,0)">i+=4</span></span><span>){</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> a[i] = b[i]*c[i-</span><span style="color:rgb(39,42,216)">1</span><span>];</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span> }</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo;color:rgb(187,44,162)"><span style="color:rgb(0,0,0)"> </span><span>return</span><span style="color:rgb(0,0,0)"> </span><span style="color:rgb(39,42,216)">0</span><span style="color:rgb(0,0,0)">;</span></p><p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>}</span></p><p 
style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span><br></span></p><p style="margin:0px;line-height:normal"> The unroller cannot compute the backedge taken count. Therefore, it seems like the problem is not with the range of "count", can't the unroller compute it as (- 1 + %count / 4)? </p></div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Oct 12, 2016 at 11:28 PM, Charith Mendis <span dir="ltr"><<a>char.mendis1989@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks for the explanation. But I am a little confused with the following fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to say;<div><br></div><div>for(unsigned i = 0; i < vectorizable_elements ; i += 2){</div><div> //main loop</div><div>}</div><div><br></div><div>for(unsigned i=0 ; i < vectorizable_elements % 2; i++){</div><div> //fix up</div><div>}</div><div><br></div><div>Why does it have to reason about the range of vectorizable_elements? Even if vectorizable_elements == SIZE_MAX the above decomposition would work?</div></div><div class="gmail_extra"><div><div><br><div class="gmail_quote">On Wed, Oct 12, 2016 at 8:25 PM, Friedman, Eli <span dir="ltr"><<a>efriedma@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"><span>
<div>On 10/12/2016 4:35 PM, Charith Mendis
via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi all,
<div><br>
</div>
<div>Attached herewith is a simple vectorized function with
loops performing a simple shuffle.</div>
<div><br>
</div>
        <div>I want all loops (inner and outer) to be unrolled by 2, so I
          used -unroll-count=2.</div>
        <div>The inner loops (with k as the induction variable and
          constant trip counts) unroll fully, but the outer loop (with
          j) fails to unroll. </div>
<div><br>
</div>
<div>The llvm code is also attached with inner loops fully
unrolled.</div>
<div><br>
</div>
        <div>To inspect further, I added the following to
          PassManagerBuilder.cpp to run some canonicalization passes
          and then redo unrolling. I have enabled partial unrolling, set
          a huge threshold, and allowed expensive loop trip counts.
          It still didn't unroll by 2.</div>
<div><br>
</div>
<div>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createLoopUnrollPass());</span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createCFGSimplificationPass()); </span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createLoopSimplifyPass()); </span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createLoopRotatePass(SizeLevel == </span><span style="color:rgb(39,42,216)">2</span><span> ? </span><span style="color:rgb(39,42,216)">0</span><span> : -</span><span style="color:rgb(39,42,216)">1</span><span>));</span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createLCSSAPass());</span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createIndVarSimplifyPass()); </span><span style="color:rgb(0,132,0)">// Canonicalize indvars</span></p>
          <p style="margin:0px;font-size:11px;line-height:normal;font-family:menlo"><span>MPM.add(createLoopUnrollPass());</span></p>
</div>
<div><br>
</div>
<div><br>
</div>
        <div>Digging deeper, I found that it fails in the
          UnrollRuntimeLoopRemainder function, where it is unable to
          compute the backedge-taken count.</div>
<div><br>
</div>
        <div>Can anybody explain what is needed to get the outer loop
          unrolled by 2? It would be a great help.</div>
</div>
</blockquote>
<br></span>
    Well, I can at least explain what is happening... runtime unrolling
    needs to be able to symbolically compute the trip count to avoid
    inserting a branch after every iteration.  SCEV isn't able to prove
    that your loop isn't an infinite loop (consider the case of
    vectorizable_elements==SIZE_MAX), so it can't compute the
    trip count.  Therefore, we don't unroll.<br>
<br>
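    To make the wrap-around case concrete, here is a minimal standalone sketch (plain C++, not LLVM or SCEV code). It assumes a 32-bit induction variable that starts at 0, and the helper name simpleTripCount is made up for illustration. It returns the closed-form trip count ceil(count / step) only when i provably does not wrap, and gives up otherwise, which is loosely analogous to the reasoning needed before a trip count can be trusted.<br>

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): trip count of
//   for (uint32_t i = 0; i < count; i += step) { ... }
// Returns std::nullopt when the closed form does not apply, i.e. when
// i would wrap past UINT32_MAX before the exit test `i < count` fails.
std::optional<std::uint64_t> simpleTripCount(std::uint32_t count,
                                             std::uint32_t step) {
  if (count == 0)
    return std::uint64_t{0};  // the body never runs
  if (step == 0)
    return std::nullopt;      // i never advances
  // ceil(count / step), computed in 64 bits so the sum cannot overflow.
  std::uint64_t trips = (static_cast<std::uint64_t>(count) + step - 1) / step;
  // trips * step is the first value of i that would fail `i < count`.
  // If it does not fit in 32 bits, i wraps instead of exiting the loop.
  if (trips * step > UINT32_MAX)
    return std::nullopt;
  return trips;
}
```

    For example, simpleTripCount(10, 4) yields 3, while simpleTripCount(UINT32_MAX, 4) yields no value because i would wrap from 2^32 - 4 back to 0. With a power-of-two step such as 4, that wrap lands exactly on 0 again, so the loop repeats forever.<br>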
    There are a few different angles you could use to attack this: you
    could teach the unroller to unroll loops with an uncomputable trip
    count, or you could make the trip count of your loop computable
    somehow.  Changing the unroller is probably straightforward (see the
    recently committed r284044).  Making the trip count computable is
    more complicated... it's probably possible to teach SCEV to reason
    about the overflow in the pointer computation, or maybe you could
    version the loop.<span><font color="#888888"><br>
<br>
-Eli<br>
<pre cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</font></span></div>
</blockquote></div><br><br clear="all"><div><br></div></div></div><span>-- <br><div><div dir="ltr"><div>Kind regards,<br>Charith Mendis<br><br>Graduate Student,<div>CSAIL,<br><div>Massachusetts Institute of Technology</div></div></div></div></div>
</span></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div><div dir="ltr"><div>Kind regards,<br>Charith Mendis<br><br>Graduate Student,<div>CSAIL,<br><div>Massachusetts Institute of Technology</div></div></div></div></div>
</div>
</div></div><br>______________________________<wbr>_________________<br>
LLVM Developers mailing list<br>
<a>llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div><div dir="ltr"><b>Alexandre Isoard</b><br></div></div>
</div></div>
</blockquote>
</div></div></div>
<br></blockquote></div><br></div></div>