[LLVMdev] Partial loop unrolling

Tue Jul 15 05:08:07 PDT 2014

Hello Mahesha,

Well, if I had to guess, I would say the third option is the best one. You
see, loop unrolling's main goal (as I see it) is not to generate better code
by itself, but to expose the code to other kinds of optimizations within the
compiler*, at the cost of increasing the size of the code. With that in
mind, the first two options do increase the size of the program, but any new
simplification or optimization they enable applies only to the copies
outside the loop, which run just once. The code inside the loop is,
potentially, the part that executes most often, and in those two options the
loop body is identical to the original one.
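
To make the third option concrete, here is a minimal sketch in C. One caveat
about the third possibility as quoted below: 10 is not a multiple of 3, so a
loop that keeps running while i < 10 would also call do_foo(10) and
do_foo(11) on its last trip. The usual way around this is a remainder loop
for the leftover iterations. The do_foo here is a hypothetical stand-in that
only records its argument, and the names run_unrolled_by_3, calls and ncalls
are my own assumptions for illustration, not anything taken from LLVM.

```c
#include <assert.h>

/* Hypothetical stand-in for the do_foo in the question: it records its
   argument so the sequence of calls can be inspected afterwards. */
#define MAX_CALLS 64
static int calls[MAX_CALLS];
static int ncalls = 0;

static void do_foo(int i) {
    assert(ncalls < MAX_CALLS);  /* guard the log buffer */
    calls[ncalls++] = i;
}

/* Same behaviour as: for (i = 0; i < n; i++) do_foo(i);
   but with the body unrolled by a factor of 3. */
static void run_unrolled_by_3(int n) {
    int i = 0;

    /* Main loop: entered only while at least three iterations remain. */
    for (; n - i >= 3; i += 3) {
        do_foo(i);
        do_foo(i + 1);
        do_foo(i + 2);
    }

    /* Remainder loop: runs the leftover n % 3 iterations (at most two). */
    for (; i < n; i++)
        do_foo(i);
}
```

Calling run_unrolled_by_3(10) runs the main loop three times (i = 0, 3, 6)
and the remainder loop once (i = 9), so do_foo sees 0 through 9 exactly as
in the original loop. The three calls in the main body sit next to each
other in the hot path, which is where later optimizations can pay off.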

* There is, in fact, another interesting use for unrolling: if the upper
limit of the loop is known at compile time and it is a very small value, it
could be worthwhile to replace the whole loop with all the necessary calls
to do_foo. This way, you can remove all the loop-related code from your
program (conditions, variables, operations on the induction variable, etc.)
and still have new opportunities to optimize the code.
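
As a small sketch of that full-unrolling case, assume a loop with a trip
count of 4 known at compile time. Again, do_foo is a hypothetical stand-in
that just records its argument, and run_fully_unrolled is a name I made up
for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for do_foo: records each argument it receives. */
#define MAX_CALLS 16
static int calls[MAX_CALLS];
static int ncalls = 0;

static void do_foo(int i) {
    assert(ncalls < MAX_CALLS);  /* guard the log buffer */
    calls[ncalls++] = i;
}

/* Fully unrolled form of: for (i = 0; i < 4; i++) do_foo(i);
   There is no induction variable, no comparison and no branch left. */
static void run_fully_unrolled(void) {
    do_foo(0);
    do_foo(1);
    do_foo(2);
    do_foo(3);
}
```

Each call now receives a constant argument, so if do_foo gets inlined the
compiler may be able to fold computations that depend on i.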

Hope I could help,

--
Cristianno Martins
PhD Student of Computer Science
University of Campinas
cmartins at ic.unicamp.br
 <cristiannomartins at hotmail.com>

On Tue, Jul 15, 2014 at 8:34 AM, Mahesha S <mahesha.llvm at gmail.com> wrote:

> Hi,
>
> PS: It is a generic question related to partial loop unrolling, and
> nothing specific to LLVM.
>
> As far as partial loop unrolling is concerned, I can see the following three
> different possibilities. Assume that the unroll factor is 3.
>
> Original loop:
>       for (i = 0; i < 10; i++)
>       {
>            do_foo(i);
>       }
>
> 1. First possibility
>       i = 0;
>       do_foo(i++);
>       do_foo(i++);
>       do_foo(i++);
>       for (; i < 10; i++)
>       {
>            do_foo(i);
>       }
>
> 2. Second possibility
>       for (i = 0; i < 7; i++)
>       {
>            do_foo(i);
>       }
>       do_foo(i++);
>       do_foo(i++);
>       do_foo(i++);
>
> 3. Third possibility
>       for (i = 0; i < 10;)
>       {
>            do_foo(i++);
>            do_foo(i++);
>            do_foo(i++);
>       }
>
> My questions are:
>
> a. If partial loop unrolling gives any performance improvement, do all
> three possibilities give roughly the same improvement?
>
> b. If the answer to question 'a' is 'no', then which of these three
> possibilities is ideal for a generic partial unrolling implementation? In
> general, which one is implemented in production compilers? And in
> particular, which one is implemented in the LLVM compiler?
>
>
> --
> mahesha