[PATCH] D75233: [LoopTerminology] LCSSA Form

Sun Mar 1 11:47:02 PST 2020

Meinersbur added inline comments.

================
Comment at: llvm/docs/LoopTerminology.rst:211-212
+This form is ensured by the LCSSA (:ref:`-loop-simplify <passes-lcssa>`)
+pass and is added automatically by the LoopPassManager when
+scheduling a LoopPass.
+After the loop optimizations are done, these extra phi nodes
----------------
baziotis wrote:
> Meinersbur wrote:
> > Note that this applies to the new pass manager only (by `FunctionToLoopPassAdaptor`): This does not happen with the legacy (current default) pass manager.
> Ok, I'm not very familiar with pass managers. So, in the old pass manager, you have to invoke LCSSA automatically or something?
In the legacy pass manger, those passes are added as dependency of the loop passes. The Helper function [[ https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/LoopUtils.cpp#L141 | `getLoopAnalysisUsage` ]] add them  them as requirements and should be called by all loop passes in `getAnalysisUsage`. Might be useful to mention that.

================
Comment at: llvm/docs/LoopTerminology.rst:248-251
+If we did not have Loop Closed SSA form, we would have had to
+do deep analysis of the control flow graph to figure out where
+to place the X4 phi node.  With it, we just loop for these
+PHIs and update them.
----------------
baziotis wrote:
> Meinersbur wrote:
> > For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the dominating exit block) as well. LCSSA makes it a two-step process. I don't think it would be complicated to do it in a single step on-demand when unswitching. There are other advantages.
> > 
> > LLVM uses a linked linked list to list all the uses of an instruction. There is a convenience function `replaceAllUsesWith` (or short: RAUW) does this replacement. For loops however, it had to distinguish between inside/outside users (1), but this is no different that when versioning a non-loop BasicBlock. For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the exit block) as well.
> > 
> > I think the primary reason is that loop analysis become more independent (3). Say we perform a loop analysis on all loops in a function that stores reference to `llvm::Value`s used in the loops, such as ScalarEvolution. In a second step, we transform all loops using the analysis' result which does RAUW. Without LCSSA it might replace values in other loops and invalidate the analysis already applied to the. With LCSSA, only value of the PHINode in the exit block is changed, but the PHINode is the same instance used in other loops. However, if we have loop transformations that transform inner as well as outer loops, we still need to handle the case that transforming the inner (respectively outer) loop may invalidate an analysis on the other.
> > 
> > Another reason is that with LCSSA, `ScalarEvolution::getSCEV` is sufficient (2). Otherwise, one needs to use `getSCEVAtScope` to specify whether we are using the SCEV inside or outside. Polly does not require LCSSA, hence uses mostly `getSCEVAtScope`.
> > 
> > GCC has documentation about the advantages of LCSSA: https://gcc.gnu.org/onlinedocs/gccint/LCSSA.html . The numbers (1-3) correspond to the bullet list.
> Thank you for the explanation! Some questions / comments:
> 
> > For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the dominating exit block) as well. LCSSA makes it a two-step process. I don't think it would be complicated to do it in a single step on-demand when unswitching. There are other advantages.
> 
> That was actually [[ http://nondot.org/sabre/LLVMNotes/LoopOptimizerNotes.txt | a note from Chris Lattner ]] (I couldn't have thought that haha). What I understood isn't that we can avoid deep analysis completely. Rather, deep analysis will be done for LCSSA. But then, loop optimizations have to preserve LCSSA (which from what I understand should be relatively easy), so when the time comes to do loop unswitching, LCSSA is there and helps you do it quick.
> 
> > There is a convenience function replaceAllUsesWith (or short: RAUW) does this replacement.
> 
> "this replacement" being loop unswitching ?
> 
> >  For loops however, it had to distinguish between inside/outside users (1), but this is no different that when versioning a non-loop BasicBlock
> 
> Why exactly the distinction ? What I understand is that, when doing RAUW on a basic block or a bunch of basic blocks (or just one), no matter if they're a loop or not, if you know which values are live outside them (e.g. using single entry PHI nodes), you make your life easier. Because you replace the inside uses and those on the PHI nodes on the boundary. Compared to if you didn't have those,
> where you would have to replace the uses across the whole function.
> 
> > Without LCSSA it might replace values in other loops and invalidate the analysis already applied to the. With LCSSA, only value of the PHINode in the exit block is changed, but the PHINode is the same instance used in other loops
> 
> Wow, that was very smart.
> 
> > Another reason is that with LCSSA, ScalarEvolution::getSCEV is sufficient (2). Otherwise, one needs to use getSCEVAtScope to specify whether we are using the SCEV inside or outside.
> 
> I'm not sure I got that completely (I'm not familiar with the interface of SCEV in LLVM). I guess because no instruction of the loop is used outside of it, so SCEV for a loop can be limited to the loop. Otherwise, a value may be used wherever and you have to include it in the scope if you are to do SCEV correctly.
> 
Since I haven't implemented a loop unswitch pass myself, I can't really argue how difficult it it, so I don't request to change it. 

Interestingly, the [[ https://gcc.gnu.org/ml/gcc-patches/2004-03/msg02212.html | source referenced ]] by the note just mentions it as "non-trivial update". It also adds another advantage of LCSSA: Induction variable uses in nested loop. Although SCEV should have no problem with, it's just two AddRecExpr in an SECExpr.

Maybe you could add the other advantages as well?

> "this replacement" being loop unswitching ?

Replacing uses of an instruction result outside the loop by the PHINode that 'combines' the result of the unswitched loops.

> Why exactly the distinction ? 

It's surely fewer values to replace (one in the PHI changes all uses outside the loop).  Whether a value is used outside can be checked using `L->contains(Use)` IMHO this does not justify LCSSA which searches and replaces all outside uses with the PHI unconditionally.

> I'm not sure I got that completely (I'm not familiar with the interface of SCEV in LLVM).

The SCEV for inside the loop references the induction variable in an AddRec expression. This is only valid to expand inside the loop. When outside the loop, you need the expression representing the exit value of the loop, or SCEVUnkown when the exit value cannot be computed, e.g. because the loop exit condition is not known.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75233/new/

https://reviews.llvm.org/D75233