[PATCH] D75013: [LoopTerminology] Rotated Loops

Wed Mar 4 12:57:09 PST 2020

baziotis marked an inline comment as done.
baziotis added a comment.

> IIUC, current LoopRotate only clones the loop header (i.e. the loop exit condition must be its terminator)

It might be good to mention that.

================
Comment at: llvm/docs/LoopTerminology.rst:195-196
+
+This transformation canonicalizes the loop latch to have
+a single successor, which implies that the loop latch
+is also an exiting block. It is done by the `loop-rotate`
----------------
Meinersbur wrote:
> baziotis wrote:
> > Meinersbur wrote:
> > > baziotis wrote:
> > > > Meinersbur wrote:
> > > > > baziotis wrote:
> > > > > > Meinersbur wrote:
> > > > > > > [serious] By definition, a latch has a backedge to the header. If the latch had just a single successor, there could not be another edge to outside the loop.
> > > > > > > 
> > > > > > > Using C code to describe the effect is higher level than the IR-level it is actually performed on. What a latch is is not obvious in the C code. LoopRotate also copies the header block, which might be interesting to mention.
> > > > > > > [serious] By definition, a latch has a backedge to the header. If the latch had just a single successor, there could not be another edge to outside the loop.
> > > > > > 
> > > > > > Yes, given that the latch contains the condition (as in a do-while loop), that was not so smart on my part. :P
> > > > > > But honestly, the 2 videos confused me in this part:
> > > > > > - On the one video it says that the latch has a single successor. I assume was a typo.
> > > > > > - On the other, it says that it has a single predecessor. Which, in the example given, I don't think is true. Besides that,
> > > > > > I don't see why having a single predecessor is important. What do I miss?
> > > > > > 
> > > > > > > Using C code to describe the effect is higher level than the IR-level it is actually performed on. What a latch is is not obvious in the C code. LoopRotate also copies the header block, which might be interesting to mention.
> > > > > > 
> > > > > > Yes, I'll change it to IR. How about if I put view-cfg-produced image?
> > > > > Could you point me to where in the videos this statement is made? It is totally possible that we make mistakes as well.
> > > > > 
> > > > > I also would not see why the latch would need to have a single predecessor.
> > > > > 
> > > > > > How about if I put view-cfg-produced image?
> > > > > 
> > > > > In image illustration would be nice.
> > > > > Could you point me to where in the videos this statement is made? It is totally possible that we make mistakes as well.
> > > > > I also would not see why the latch would need to have a single predecessor.
> > > > Of course, no bad intention. I'm not the best to identify the mistake anyway. :)
> > > > So, in the video [[ https://youtu.be/3pRhvQi7Z10?t=287 | Writing Loop Optimizations ]], at around 4:47, there's a slide mentioning the single predecessor.
> > > > Also, in the video [[ https://youtu.be/-JQr9aNagQo?t=330 | Loop Fusion, Loop Distribution and their Place in the Loop Optimization Pipeline ]], at around 5:31, there's a slide mentioning the single successor.
> > > > 
> > > > > In image illustration would be nice.
> > > > Great.
> > > The first presentation talks about a single predecessor and the second about single successor. I think it might refer to [[ https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp#L525 | this ]]:
> > > ```
> > >       // Preserve canonical loop form, which means that 'Exit' should have only
> > >       // one predecessor. Note that Exit could be an exit block for multiple
> > >       // nested loops, causing both of the edges to now be critical and need to
> > >       // be split.
> > > ```
> > > However, AFAIK it is not directly a requirement for either loop simplified or rotated form, at least it is not checked in `isLoopSimplfyForm` or `isRotatedForm`. That is, I think it is legal (but somewhat strange) in both canonical forms for have a switch in the latch with three cases, one going to the header and two going to (the same) exit block. [[ https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/LoopSimplify.cpp#L604 | Looks like LoopSimplify even reverses this change ]].
> > > 
> > > Maybe @kbarton can elaborate what was meant.
> > Thanks for the references! The loop rotation excerpt was not very clear to me. I understood the LoopSimplify one, but seeing the bigger picture I don't understand the reasoning. I'll check tomorrow. Hopefully @kbarton can help.
> > Btw, reading through these, it seems we should add to the docs what is a deoptimizing edge and what is a critical edge.
> > It then might be clearer to read the commits: https://github.com/llvm/llvm-project/commit/2f6987ba61cc31c16c64f511e5cbc76b52dc67b3
> > and
> > https://github.com/llvm/llvm-project/commit/c8ca49659ac479b5be0349528d42179f40fc553b#diff-7430b177373b789cad7153ebb942f708R11
> > I understood the LoopSimplify one, but seeing the bigger picture I don't understand the reasoning.
> 
> ```
> for (int i = 0; i < n; ++i) {
>   auto v = *p;
>   use(v);
> }
> ```
> The value of `p` is loop-invariant, but the following transformation is illegal (to avoid repeatedly loading the same value from memory):
> ```
> auto v = *p;
> for (int i = 0; i < n; ++i) {
>   use(v);
> }
> ```
> because if `n==0`, the address in `p` is dereferenced while it was not in the original loop. `p` might be null and the transform code would segfault while the original code did not. Solution is to insert a check whether the loop body is executed at least once:
> ```
> if (0 < n) { // loop guard
>   auto v = *p;
>   for (int i = 0; i < n; ++i) {
>     use(v);
>   }
> }
> ```
> Now the loop condition before entering the first iteration is redundant, we already checked it in the guard. We can move it to after the first iteration:
> ```
> if (0 < n) { // loop guard
>   auto v = *p;
>   for (int i = 0; ; ++i) {
>     use(v);
>     if (i+1>=n)
>       break;
>   }
> }
> ```
> We just invented loop rotation.
> 
> The name 'rotation' comes from [[ https://i.imgur.com/Ju9nVVl.gif | gradually peeling/unrolling (with an infinite spiral) ]]) statements out of the loop until everything in the loop is executed at least once:
> ```
>   int i = 0;
>   while (1) {
>     foo(i);
>     bar(i);
>     if (i >= n) return;
>     ++i;
>     xyzzy(i);
>   }
> ```
>  🡆
> ```
>   int i = 0;
>   foo(i);
>   while (1) {
>     bar(i);
>     if (i >= n) return;
>     ++i;
>     xyzzy(i);
>     foo(i);
>   }
> ```
>  🡆
> ```
>   int i = 0;
>   foo(i);
>   bar(i);
>   while (1) {
>     if (i >= n) return;
>     ++i;
>     xyzzy(i);
>     foo(i);
>     bar(j);
>   }
> ```
>  🡆
> ```
>   int i = 0;
>   foo(i);
>   bar(i);
>   if (i >= n) return;
>   while (1) {
>     ++i;
>     xyzzy(i);
>     foo(i);
>     bar(i);
>     if (i >= n) return;
>   }
> ```
Wow, that was a very neat explanation, thank you! Along with this, I was able to understand most of the code that implements it.
Apart from the implementation details, what I understand is that the point of loop rotation is to hoist invariant instructions (not just loads) to the preheader (and it may fold some instructions along the way).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75013/new/

https://reviews.llvm.org/D75013