<div dir="ltr">Hello Mahesha,<div><br></div><div>If I had to guess, I would say the third option is the best one. As I see it, the main goal of loop unrolling is not to generate better code directly, but to expose the code to other kinds of optimizations within the compiler*, at the cost of increased code size. With that in mind, the first two options increase the size of the program, but any new simplifications or optimizations apply only to a part of it that runs once. The code inside the loop is, potentially, the part that executes most often, yet the only difference between the original code and those two versions is found outside the loop.</div>
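<div><br></div><div>To make the third option concrete, here is a minimal sketch in C. It assumes a hypothetical do_foo that only records its argument, and the names run_rolled and run_unrolled_by_3 are made up for illustration. Note that with 10 iterations and an unroll factor of 3, the third loop exactly as written in the question would also call do_foo(10) and do_foo(11), so the sketch adds a remainder loop for the leftover iterations.</div>

```c
/* Hypothetical stand-in for do_foo from the question: it records the
   arguments it receives so the two loop forms can be compared. */
enum { MAX_CALLS = 16 };
static int calls[MAX_CALLS];
static int ncalls = 0;

static void do_foo(int i) {
    if (ncalls < MAX_CALLS)
        calls[ncalls] = i;
    ncalls++;
}

/* Original (rolled) loop. */
static void run_rolled(int n) {
    for (int i = 0; i < n; i++)
        do_foo(i);
}

/* Third possibility with unroll factor 3, plus a remainder loop
   for the n % 3 leftover iterations. */
static void run_unrolled_by_3(int n) {
    int i = 0;
    /* Main loop: runs only while a full group of 3 iterations remains. */
    for (; n - i >= 3; i += 3) {
        do_foo(i);
        do_foo(i + 1);
        do_foo(i + 2);
    }
    /* Remainder loop: at most 2 iterations. */
    for (; i < n; i++)
        do_foo(i);
}
```

<div>Under these assumptions, run_unrolled_by_3(10) makes the same ten calls, in the same order, as run_rolled(10). The three calls in the main loop body are where the compiler may find new opportunities, since that is the code that runs most often.</div>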
<div><br></div><div>* There is, in fact, another interesting use for unrolling: if the upper limit of the loop is known at compile time and is a very small value, it can be worthwhile to replace the whole loop with all the necessary calls to do_foo. This way, you remove all the loop-related code from your program (conditions, variables, operations on the induction variable, etc.) and still gain new opportunities to optimize the code.</div>
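<div><br></div><div>As a rough illustration of that full-unrolling case, the sketch below assumes a trip count of 4 known at compile time and a hypothetical do_foo that adds i * i to a running total. Both the body of do_foo and the function names are assumptions chosen for the example, not anything from the question.</div>

```c
/* Hypothetical do_foo: accumulates i * i into a running total. */
static int total = 0;

static void do_foo(int i) {
    total += i * i;
}

/* Rolled form: the trip count 4 is a compile-time constant. */
static void run_rolled(void) {
    for (int i = 0; i < 4; i++)
        do_foo(i);
}

/* Fully unrolled by hand: no induction variable, no comparison,
   no branch. Every argument is now a constant. */
static void run_fully_unrolled(void) {
    do_foo(0);
    do_foo(1);
    do_foo(2);
    do_foo(3);
}
```

<div>Both functions add 0 + 1 + 4 + 9 = 14 to total. Once the arguments are constants, a compiler that also inlines do_foo could in principle fold the four calls into a single addition, which is the kind of follow-up optimization that unrolling exposes.</div>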
<div><br></div><div><br></div><div>Hope I could help,</div></div><div class="gmail_extra"><br clear="all"><div><br>--<br>Cristianno Martins<br>PhD Student of Computer Science<br>University of Campinas<br><a href="mailto:cmartins@ic.unicamp.br" target="_blank">cmartins@ic.unicamp.br</a><br>
</div>
<br><br><div class="gmail_quote">On Tue, Jul 15, 2014 at 8:34 AM, Mahesha S <span dir="ltr"><<a href="mailto:mahesha.llvm@gmail.com" target="_blank">mahesha.llvm@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div>Hi,</div><div><br></div><div>PS: This is a generic question about partial loop unrolling, and nothing specific to LLVM.</div><div><br></div><div>As far as partial loop unrolling is concerned, I can see the following three possibilities. Assume that the unroll factor is 3.<br>
</div><div><br></div><div>Original loop:</div><div> for (i = 0; i < 10; i++)</div><div> {</div><div> do_foo(i);</div><div> } </div><div><br></div><div>1. First possibility</div><div> i = 0;</div>
<div> do_foo(i++);</div><div> do_foo(i++);</div><div> do_foo(i++);</div><div> for (; i < 10; i++)<br></div><div> {</div><div> do_foo(i);</div><div> } </div><div><br></div><div>2. Second possibility</div>
<div><div> for (i = 0; i < 7; i++)</div><div> {</div><div> do_foo(i);</div><div> } </div></div><div> do_foo(i++);</div><div> do_foo(i++);</div><div> do_foo(i++);</div><div><br></div>
<div>3. Third possibility</div><div><div> for (i = 0; i < 10;)</div><div> {</div><div> do_foo(i++);</div><div> do_foo(i++);</div><div> do_foo(i++);</div><div> } </div></div>
<div><br></div><div>My questions are:</div><div><br></div><div>a. If partial loop unrolling yields any performance improvement, do all three possibilities give roughly the same improvement?</div>
<div><br></div><div>b. If the answer to question 'a' is 'no', which of these three possibilities is ideal for a generic partial unrolling implementation? In general, which one is implemented in production compilers? And in particular, which one is implemented in the LLVM compiler?</div>
<span class="HOEnZb"><font color="#888888">
<div><br></div><div><br></div>-- <br><div>mahesha</div>
</font></span></div>
<br>_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
<br></blockquote></div><br></div>