[PATCH] D28694: Add pass insertion point EP_BeforeLoopIdiom

Fri Jan 20 01:08:12 PST 2017

chandlerc added a comment.

Sorry I didn't notice this waiting on me, and thanks for the gentle prod Krzysztof! My two cents inline.

================
Comment at: lib/Transforms/IPO/PassManagerBuilder.cpp:320
+  addExtensionsToPM(EP_LoopIdiom, MPM);
+
   MPM.add(createLoopIdiomPass());             // Recognize idioms like memset.
----------------
kparzysz wrote:
> mehdi_amini wrote:
> > hfinkel wrote:
> > > mehdi_amini wrote:
> > > > This extension point could get its own separate patch.
> > > > 
> > > > It seems very targeted, if the only point is to have TargetSpecific loop idiom recognition, then why aren't some TTI implemented and used in LoopIdiomPass?
> > > I thought about suggesting this, but didn't because:
> > > 
> > >  a. It is not clear what from the existing loop idiom recognition pass could be reused
> > >  b. The target-specific logic might want access to other analyses, and that would be awkward if it is just a TTI callback.
> > > 
> > > What do you think?
> > > 
> > These are good points!
> > 
> > > It is not clear what from the existing loop idiom recognition pass could be reused
> > 
> > Main point here: the insertion point in the pipeline :)
> > 
> > >  The target-specific logic might want access to other analyses, and that would be awkward if it is just a TTI callback.
> > 
> > No sure what you mean by "awkward" with a TTI needing analysis, but that would definitely be a problem if the TTI would want an analysis that is not generic or not currently available in LoopIdiom.
> > 
> > The reason I think discussing this extension point in its own revision is because I think we usually have "generic" and "coarse grain" extension points (do we have another case of specific hook like that?).
> > I'm cautious about it because adding such hooks makes it harder to evolve/maintain the pipeline (ultimately what when we'll have a hook before/after every single pass?).
> > I'm cautious about it because adding such hooks makes it harder to evolve/maintain the pipeline (ultimately what when we'll have a hook before/after every single pass?).
> 
> That's an understandable concern, but I think that this particular case is warranted, since it is not uncommon for targets to have their optimized versions of certain loop-based operations.  It can be either via target-specific runtime libraries, or directly via hardware instructions.
> Since this insertion point is shared with that of the target-independent loop idiom recognizer, the implementations would also share the same ways of handling the code.  That would make it easier to move code from a target-specific version to the target-independent pass, or conversely, to take a target-independent code and adopt it for the needs of a given target.
I think this extension point makes sense, and I wouldn't even tie it to the loop idiom recognition but instead to the position in the loop pipeline.

Specifically, I'd add a "late loop canonicalization and simplification" extension point. I would put it after the normal loop idiom but before loop deletion. I think documenting it as "the last thing in the loop pipeline before deletion" makes sense. You can talk about the fact that this is where you should extend the pipeline with loop transformations that tend to *remove loops entirely*.

You should also make it really clear that these need to be *loop passes* and are run in a pipeline across each loop in the nest. And ideally get some assert failure in the extension mechanism that enforces this. Having this accidentally add a function pass will really dramatically change the loop pipeline by partitioning deletion from the rest.

And as Mehdi said, I would land this as its own patch, and then to the TargetMachine patch separately.

And just to clarify, I would be very concerned about a TTI callback for the reasons Krzysztof mentioned... Especially the needing another analysis. I just think it'll get awkward fast and injecting passes is pretty clean and testable.

================
Comment at: tools/opt/opt.cpp:292-301
     Builder.addExtension(
         PassManagerBuilder::EP_EarlyAsPossible,
         [&](const PassManagerBuilder &, legacy::PassManagerBase &PM) {
           TM->addEarlyAsPossiblePasses(PM);
         });
+    Builder.addExtension(
+        PassManagerBuilder::EP_BeforeLoopIdiom,
----------------
If we go this way, I worry we're just bugging for a third...

What about instead you call a method on TargetMachine with the builder that can add any extensions it wants. Then all of this gets sunk into the TM. You can subsume both early-as-possible and the late-loop extensions into this one API. Does that make sense to others?

Repository:
  rL LLVM

https://reviews.llvm.org/D28694