[PATCH] D22630: Loop rotation

Wed Jul 27 12:53:59 PDT 2016

sebpop added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:319
@@ +318,3 @@
+  // Collect all nodes of the loop from header to latch.
+  BasicBlock *NewH = collectSEMEBlocks(OrigH, OrigLatch, Blocks, Exits);
+  assert(NewH);
----------------
eli.friedman wrote:
> sebpop wrote:
> > eli.friedman wrote:
> > > sebpop wrote:
> > > > hiraditya wrote:
> > > > > eli.friedman wrote:
> > > > > > Is it possible this will clone more code?  The trunk version would only clone the header; it looks like this will clone the whole loop body excluding the latch block?
> > > > > Yes, this will clone all the exiting basic blocks excluding the latch block (starting from the header until all it finds a BB which is not exiting). This is required to expose the redundancies that could be removed as a result of cloning the loop body. Currently there is a flag to control how much we want to clone (RotationMaxSize), I can add more specific flags to control how many exiting blocks we want to copy.
> > > > > In general, it would be a good idea to clone all the exiting basic blocks to expose all the redundancies.
> > > > > 
> > > > > until all it finds a BB which is not exiting
> > > > To rotate the loop from Michael (loop with if-then-else followed by loop exit) we could avoid stopping at the first BB which is not exiting the loop, and to continue until all loop exit edges have been duplicated.
> > > > 
> > > This is why I'd really like to see this patch split... it would be much easier to have this discussion with a patch which is specifically "Clone more blocks in loop rotation", so we could discuss what exactly makes sense independently of the changes to the cloning algorithm.
> > You mean let's commit this patch as is and then extend it to catch more loops?
> > Or do you mean we should restrict the new codegen algorithm to only catch the loops that are currently rotated?
> > Can you please be a bit more specific about which parts of this patch you would like to see split?
> I mean a patch with just the new algorithm, which behaves essentially the same way as the trunk loop rotation pass, with followups to change the rotation heuristics.  That makes it simpler to review because the expected behavior of each patch is more obvious, and it'll be easier to bisect any regressions.  As it is, I'm digging through a gigantic patch without any clue what changes are intentional.
To make the new code behave like the old one, we can add some code that restricts its functionality: only duplicate code until the first loop exit.
I do not find it productive to walk this road.

Please keep in mind that this is a rewrite of the existing algorithm: it is not a refactoring patch.
The original implementation of loop rotation cannot be moved in small steps to the new implementation.
A rhetorical question: when somebody comes up with a new pass, do you expect the new pass to be split up into small steps: here is the correctness analysis part, the heuristic, and code generation?  Well, no.  So why would it be different for this patch that throws away code that has too many restrictions and adds a new implementation of the pass?

There may be another road if you think that would be better: we could keep the current loop rotation as is, and add a new loop rotation pass in a different file, and then deprecate the existing pass once everybody is convinced that the new code is better.  Would that work for you?  That will not reduce the size of the current patch.

https://reviews.llvm.org/D22630