[PATCH] D75233: [LoopTerminology] LCSSA Form

Fri Feb 28 13:24:15 PST 2020

baziotis marked 2 inline comments as done.
baziotis added inline comments.

================
Comment at: llvm/docs/LoopTerminology.rst:211-212
+This form is ensured by the LCSSA (:ref:`-loop-simplify <passes-lcssa>`)
+pass and is added automatically by the LoopPassManager when
+scheduling a LoopPass.
+After the loop optimizations are done, these extra phi nodes
----------------
Meinersbur wrote:
> Note that this applies to the new pass manager only (by `FunctionToLoopPassAdaptor`): This does not happen with the legacy (current default) pass manager.
OK, I'm not very familiar with pass managers. So, in the old (legacy) pass manager, you have to invoke LCSSA manually or something?

================
Comment at: llvm/docs/LoopTerminology.rst:248-251
+If we did not have Loop Closed SSA form, we would have had to
+do deep analysis of the control flow graph to figure out where
+to place the X4 phi node.  With it, we just loop for these
+PHIs and update them.
----------------
Meinersbur wrote:
> For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the dominating exit block) as well. LCSSA makes it a two-step process. I don't think it would be complicated to do it in a single step on-demand when unswitching. There are other advantages.
> 
> LLVM uses a linked list to list all the uses of an instruction. There is a convenience function `replaceAllUsesWith` (or short: RAUW) that does this replacement. For loops however, it had to distinguish between inside/outside users (1), but this is no different than when versioning a non-loop BasicBlock. For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the exit block) as well.
> 
> I think the primary reason is that loop analyses become more independent (3). Say we perform a loop analysis on all loops in a function that stores references to `llvm::Value`s used in the loops, such as ScalarEvolution. In a second step, we transform all loops using the analysis' result, which does RAUW. Without LCSSA it might replace values in other loops and invalidate the analysis already applied to them. With LCSSA, only the value of the PHINode in the exit block is changed, but the PHINode is the same instance used in other loops. However, if we have loop transformations that transform inner as well as outer loops, we still need to handle the case that transforming the inner (respectively outer) loop may invalidate an analysis on the other.
> 
> Another reason is that with LCSSA, `ScalarEvolution::getSCEV` is sufficient (2). Otherwise, one needs to use `getSCEVAtScope` to specify whether we are using the SCEV inside or outside. Polly does not require LCSSA, hence uses mostly `getSCEVAtScope`.
> 
> GCC has documentation about the advantages of LCSSA: https://gcc.gnu.org/onlinedocs/gccint/LCSSA.html . The numbers (1-3) correspond to the bullet list.
Thank you for the explanation! Some questions / comments:

> For the placement of the PHINode, LCSSA needs to do the deep analysis (finding the dominating exit block) as well. LCSSA makes it a two-step process. I don't think it would be complicated to do it in a single step on-demand when unswitching. There are other advantages.

That was actually [[ http://nondot.org/sabre/LLVMNotes/LoopOptimizerNotes.txt | a note from Chris Lattner ]] (I couldn't have thought of that myself haha). My understanding isn't that we can avoid the deep analysis completely. Rather, the deep analysis is done once, when LCSSA is built. After that, loop optimizations have to preserve LCSSA (which, from what I understand, should be relatively easy), so when the time comes to do loop unswitching, LCSSA is already there and helps you do it quickly.

> There is a convenience function replaceAllUsesWith (or short: RAUW) that does this replacement.

"this replacement" being loop unswitching ?

> For loops however, it had to distinguish between inside/outside users (1), but this is no different than when versioning a non-loop BasicBlock

Why exactly the distinction? What I understand is that, when doing RAUW on a basic block or a bunch of basic blocks, no matter if they form a loop or not, you make your life easier if you know which values are live outside them (e.g. using single-entry PHI nodes), because you only replace the inside uses and those in the PHI nodes on the boundary. Without those PHI nodes, you would have to replace the uses across the whole function.
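
To make the boundary idea concrete, here is a toy sketch in plain C++. It does not use LLVM's API: `ToyInst`, `ToyFunction`, `outsideUsers` and `buildExample` are all names I made up for illustration. It only shows that, when the uses outside the loop go through a single-entry exit PHI, the set of outside instructions that read the loop value directly shrinks to that PHI.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these types are LLVM classes.
struct ToyInst {
  std::string Name;
  bool InLoop = false;              // defined inside the loop?
  bool IsExitPhi = false;           // single-entry PHI in an exit block?
  std::vector<ToyInst *> Operands;  // values this instruction reads
};

struct ToyFunction {
  std::vector<std::unique_ptr<ToyInst>> Insts;

  ToyInst *add(std::string Name, bool InLoop, bool IsExitPhi,
               std::vector<ToyInst *> Operands) {
    auto I = std::make_unique<ToyInst>();
    I->Name = std::move(Name);
    I->InLoop = InLoop;
    I->IsExitPhi = IsExitPhi;
    I->Operands = std::move(Operands);
    Insts.push_back(std::move(I));
    return Insts.back().get();
  }

  ToyInst *find(const std::string &Name) const {
    for (const auto &I : Insts)
      if (I->Name == Name)
        return I.get();
    return nullptr;
  }
};

// Collect the instructions outside the loop that read V directly.
std::vector<ToyInst *> outsideUsers(const ToyFunction &F, const ToyInst *V) {
  assert(V && V->InLoop && "expected a value defined inside the loop");
  std::vector<ToyInst *> Result;
  for (const auto &I : F.Insts) {
    if (I->InLoop)
      continue;
    for (const ToyInst *Op : I->Operands)
      if (Op == V) {
        Result.push_back(I.get());
        break;
      }
  }
  return Result;
}

// x is defined in the loop and needed by three instructions after it.
// In the LCSSA variant those three read a single exit PHI instead of x.
ToyFunction buildExample(bool WithLCSSA) {
  ToyFunction F;
  ToyInst *X = F.add("x", /*InLoop=*/true, /*IsExitPhi=*/false, {});
  ToyInst *Live = X;
  if (WithLCSSA)
    Live = F.add("x.lcssa", /*InLoop=*/false, /*IsExitPhi=*/true, {X});
  F.add("a", /*InLoop=*/false, /*IsExitPhi=*/false, {Live});
  F.add("b", /*InLoop=*/false, /*IsExitPhi=*/false, {Live});
  F.add("c", /*InLoop=*/false, /*IsExitPhi=*/false, {Live});
  return F;
}
```

In this toy, `outsideUsers(buildExample(false), x)` has to report `a`, `b` and `c`, while the LCSSA variant reports only `x.lcssa`, so a transformation that changes `x` has exactly one place to patch outside the loop.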

> Without LCSSA it might replace values in other loops and invalidate the analysis already applied to them. With LCSSA, only the value of the PHINode in the exit block is changed, but the PHINode is the same instance used in other loops

Wow, that was very smart.
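
If I read this right, the point can be sketched like this (again a toy in plain C++, not LLVM's `Value`/`Use` classes; `ToyValue`, `toyReplaceAllUsesWith` and `cachedOperandSurvives` are hypothetical names). An "analysis" of a second loop caches the pointer its user reads, then the first loop is transformed with a RAUW-like rewrite. The cached pointer is still the user's operand only when the user reads the exit PHI.

```cpp
#include <cassert>
#include <vector>

// Toy model only: not LLVM's Value/Use classes.
struct ToyValue {
  std::vector<ToyValue *> Operands;
};

// Toy RAUW: rewrite every operand slot in Users that points at Old.
void toyReplaceAllUsesWith(const std::vector<ToyValue *> &Users,
                           ToyValue *Old, ToyValue *New) {
  for (ToyValue *U : Users)
    for (ToyValue *&Op : U->Operands)
      if (Op == Old)
        Op = New;
}

// Returns true if the pointer cached for the user in the second loop is
// still that user's operand after the first loop has been transformed.
bool cachedOperandSurvives(bool WithLCSSA) {
  ToyValue X;     // defined in loop 1
  ToyValue XNew;  // replacement produced by transforming loop 1

  ToyValue ExitPhi;  // LCSSA PHI in loop 1's exit block (LCSSA case only)
  ExitPhi.Operands = {&X};

  ToyValue UserInLoop2;  // instruction in loop 2 that needs loop 1's value
  UserInLoop2.Operands = {WithLCSSA ? &ExitPhi : &X};

  // The "analysis" of loop 2 remembers which value the user reads.
  ToyValue *Cached = UserInLoop2.Operands[0];

  // Transform loop 1: every user of X now reads XNew.
  std::vector<ToyValue *> Users = {&UserInLoop2};
  if (WithLCSSA)
    Users.push_back(&ExitPhi);
  toyReplaceAllUsesWith(Users, &X, &XNew);

  return UserInLoop2.Operands[0] == Cached;
}
```

With LCSSA only the PHI's incoming value changes, so `cachedOperandSurvives(true)` holds. Without it, the operand of the user in loop 2 is rewritten under the analysis' feet and `cachedOperandSurvives(false)` does not hold.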

> Another reason is that with LCSSA, ScalarEvolution::getSCEV is sufficient (2). Otherwise, one needs to use getSCEVAtScope to specify whether we are using the SCEV inside or outside.

I'm not sure I got that completely (I'm not familiar with the interface of SCEV in LLVM). I guess it's because, with LCSSA, no instruction of the loop is used outside of it (except by the PHI nodes in the exit blocks), so SCEV for a loop can be limited to the loop. Otherwise, a value may be used anywhere and you have to take the scope into account to get the SCEV right.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75233/new/

https://reviews.llvm.org/D75233