[LLVMdev] Question about optimizing mem in loop

Ryan Taylor ryta1203 at gmail.com
Thu Sep 13 17:00:05 PDT 2012


To clarify, I understand "what" is going on, but I'd like to know why. Why
is it seemingly pre-computing the GEPs and then creating the control flow,
rather than computing them per iteration? That seems like a lot of code bloat
for the exact same core operations.
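
As a rough illustration of the two shapes being contrasted, here is a minimal
C sketch. It is not the code from the original source file: mix,
rounds_indexed, rounds_hoisted and check_shapes_agree are hypothetical names,
and mix is only a stand-in for whatever round_i really does. Shape A computes
each address from the index inside the loop; shape B computes all four
addresses up front and then branches on the iteration number, which is
roughly what the CFG appears to show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-iteration work; NOT the real round_i. */
static uint32_t mix(uint32_t x, uint32_t k) { return (x ^ k) + 1u; }

/* Shape A: each address is computed from the index inside the loop. */
void rounds_indexed(uint32_t *out, const uint32_t *in, const uint32_t *kp)
{
    for (int iter = 0; iter < 4; ++iter)
        out[iter] = mix(in[iter], kp[iter]);
}

/* Shape B: all four output addresses are computed before the loop, and the
 * body picks one of them by branching on the iteration number. */
void rounds_hoisted(uint32_t *out, const uint32_t *in, const uint32_t *kp)
{
    uint32_t *p0 = &out[0], *p1 = &out[1], *p2 = &out[2], *p3 = &out[3];
    for (int iter = 0; iter < 4; ++iter) {
        uint32_t v = mix(in[iter], kp[iter]);
        switch (iter) {
        case 0:  *p0 = v; break;
        case 1:  *p1 = v; break;
        case 2:  *p2 = v; break;
        default: *p3 = v; break;
        }
    }
}

/* Both shapes must produce identical results for the same inputs. */
void check_shapes_agree(void)
{
    const uint32_t in[4] = {1, 2, 3, 4}, kp[4] = {10, 20, 30, 40};
    uint32_t a[4] = {0}, b[4] = {0};
    rounds_indexed(a, in, kp);
    rounds_hoisted(b, in, kp);
    for (int i = 0; i < 4; ++i)
        assert(a[i] == b[i]);
}
```

Both functions do the same four stores; the second only adds a branch per
iteration, which is the bloat the question is about.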

On Thu, Sep 13, 2012 at 4:55 PM, Ryan Taylor <ryta1203 at gmail.com> wrote:

> Is there a strong reason why this simple code:
>
>         for (rnd = 0; rnd < Nrnd - 1; ++rnd)
>         {
>             // round(inv_rnd, b1, b0, kp);
>
>             for (iter = 0; iter < 4; ++iter) {
>                 round_i(inv_rnd, b1, b0, kp, iter);
>             }
>             l_copy(b0, b1); kp -= nc;
>         }
>
> produces the complicated control flow logic in the attached CFG?
>
> If I unroll the loop, I no longer get the crazy control flow logic. It
> seems that instead of calculating the GEPs one at a time inside the loop,
> it's pulling all 4 out of the inner loop into the outer loop head and then
> branching in the inner loop depending on which iteration it is in. I
> can't really think of a good reason to do it this way. I'm sure there is
> one, so I was hoping someone might explain why this is occurring (instead of
> simply looping over the round and calculating each GEP in each iteration
> depending on the index).
>
> Secondly, what's the best way to convince the compiler not to generate this
> code/logic bloat? It's unclear to me how it's saving any cycles.
>
> If you look at the O2 CFG (the O3 one is the same), it's creating a switch
> whose every branch ends up at the same BB, whose pred goes to itself and
> the BB to which the switch points, and the switch BB only contains itself.
> The other CFG was produced with just the default clang opts (i.e.
> clang source_file), and all it's doing is lowering the switch (the same
> problem still exists).
>
> This seems confusing to me.
>
> P.S. round has no control flow logic in it, and when the loop is unrolled
> there is no control flow at all (uncond branching, etc.). The logic gets
> even more convoluted when simplifycfg and other opts are applied.
>
>
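
For reference, the manual unrolling mentioned above could look something like
the sketch below. This is an assumption about the shape of the workaround,
not the original source: round_i_stub, round_unrolled, round_looped and
check_unrolled_matches_loop are hypothetical names, and the stub body is
invented only so the example compiles. The point is that every call sees a
literal constant for iter, so there is nothing left to branch on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for round_i; the real function is not shown in the
 * post, so this body is made up for illustration only. */
static void round_i_stub(uint32_t *b1, const uint32_t *b0,
                         const uint32_t *kp, int iter)
{
    b1[iter] = b0[iter] ^ kp[iter];
}

/* Inner loop written out by hand: iter is a literal constant at each call. */
void round_unrolled(uint32_t *b1, const uint32_t *b0, const uint32_t *kp)
{
    round_i_stub(b1, b0, kp, 0);
    round_i_stub(b1, b0, kp, 1);
    round_i_stub(b1, b0, kp, 2);
    round_i_stub(b1, b0, kp, 3);
}

/* Loop form of the same work, kept for comparison. */
void round_looped(uint32_t *b1, const uint32_t *b0, const uint32_t *kp)
{
    for (int iter = 0; iter < 4; ++iter)
        round_i_stub(b1, b0, kp, iter);
}

/* The unrolled and looped forms must store the same values. */
void check_unrolled_matches_loop(void)
{
    const uint32_t b0[4] = {1, 2, 3, 4}, kp[4] = {5, 6, 7, 8};
    uint32_t u[4] = {0}, l[4] = {0};
    round_unrolled(u, b0, kp);
    round_looped(l, b0, kp);
    for (int i = 0; i < 4; ++i)
        assert(u[i] == l[i]);
}
```

Clang releases newer than this thread also accept
"#pragma clang loop unroll(full)" placed directly before a loop, which
requests the same transformation without duplicating the call by hand.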