[PATCH] D9151: Loop Versioning for LICM

Tue Jan 12 23:37:15 PST 2016

ashutosh.nema added inline comments.

================
Comment at: lib/Transforms/IPO/PassManagerBuilder.cpp:349
@@ -342,1 +348,3 @@
+    MPM.add(createLICMPass());                  // Hoist loop invariants
+  }
   if (!DisableUnitAtATime && OptLevel > 1 && !PrepareForLTO) {
----------------
joker.eph wrote:
> grosser wrote:
> > joker.eph wrote:
> > > grosser wrote:
> > > > joker.eph wrote:
> > > > > ashutosh.nema wrote:
> > > > > > joker.eph wrote:
> > > > > > > Why is is done here and not where LICM is usually done? Line 419?
> > > > > > > Did you try multiple placements? Do you have benchmarks results that drives this choice?
> > > > > > We want to run this when inining is over, because after that we may see more accurate aliasing.
> > > > > > That’s why we placed it after ‘createBarrierNoopPass’. If we do this earlier probably of no-alias 
> > > > > > aliasing assumptions in version loop, other optimizations may get some benefit. 
> > > > > > 
> > > > > > This can be schedule later as well but I don’t see any benefit.
> > > > > > 
> > > > > > We have observed good gains in our internal benchmarks, mostly based on customer workloads.
> > > > > I'm not sure I follow why you need the inliner to be fully complete over the whole module and not having processed the function on which you want to run LICM. What kind of aliasing will this avoid? Do you have an example?
> > > > > 
> > > > > Also, you are saying that you don't see any benefit of scheduling this later, can you elaborate with respect to the other run of LICM I pointed?
> > > > > 
> > > > > Also, I ask about benchmark with respect to the different possible placement for this pass, and you didn't answer this in particular.
> > > > > 
> > > > One reason to run this late is that too early versioning may prevent  both further inlining (due to increased code size) as well as versioning later a possibly larger loop.
> > > > 
> > > > Furthermore, to my understanding the inlining loop is indeed a canonicalization phase. We are planning to run Polly after the inliner loop for this very same reason.
> > > Fair for the inliner loop. But what about the already existing LICM that already runs a little bit later?
> > Are  you talking about the LICM pass within the inliner loop? It is to my understanding part of the canonicalization sequence that helps both to eliminate loops (in case all statements can be proven loop invariant) and which also helps SCEV by moving certain SCEVUnknowns further out of the tree. As the existing LICM does not add a large code size increase, it seems to be save to be run in the inliner loop.
> > 
> > (LICM is not a pure canonicalization hence it causes some troubles for Polly, but we are working on addressing them on our side).
> No, not within the inliner loop, see line 419 (or 411 before the patch)
We did tried multiple placements i.e. prior to inlining completion and post completion.
As this is an internal benchmark, we can’t disclose details. But with these placements we 
did not noticed any degrade in our regular testing. Reason for running just after inlining 
completion is other passes might gain from non-alias version of loop (As of now we do 
not have any case where other optimization getting benefits from no-alias version of loop).


Repository:
  rL LLVM

http://reviews.llvm.org/D9151