[PATCH] D27198: [PM] Introduce the facilities for registering cross-IR-unit dependencies that require deferred invalidation.

Sat Dec 10 01:10:22 PST 2016

jlebar added inline comments.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:103
   // entire SCC layer as well rather than trying to do invaliadtion ourselves.
-  if (!PA.preserved<CGSCCAnalysisManagerModuleProxy>() ||
+  if (!PA.preserved<CGSCCAnalysisManagerModuleProxy, AllAnalysesOn<Module>>() ||
       Inv.invalidate<LazyCallGraphAnalysis>(M, PA) ||
----------------
chandlerc wrote:
> jlebar wrote:
> > Hm, now that I see it being used, I am even less thrilled about this API.  It's not at all obvious what the second template argument means here, and it also seems super easy to forget to pass this the relevant arguments.  In addition, if I ever add a new abstract set, I have to go and modify every preserved() call.
> > 
> > Would it be out of the question to encapsulate within (say) the CGSCCAnalysisManagerModuleProxy type the sets that cover it, so that we could continue to pass only one type to preserved<...>()?
> I mean, I don't disagree with any of this, but I've not come up with a better alternative really.
> 
> I know there are going to be more sets than IR-unit derived ones such as CFG-preserving. =/ So bundling it inside the proxy doesn't seem like it'd be a great alternative... And it would still be quite hard to make work.
> 
> The key thing that needs to happen is that one layer needs to be able to introduce a preserved set for an IR unit, and then some other part of the code needs to subtract one analysis from that set, and then when we call 'invalidate' on *that* analysis it needs to not pay attention to the set.
> 
> Anyways, any better API ideas here are very, very welcome. =/
> Anyways, any better API ideas here are very, very welcome. =/

One idea was up earlier in the review:

> Could we require that these abstract sets inherit from some type [or otherwise have some way to tell the difference between an analysis and a set of analyses]?

At least then PA could catch some incorrect uses of its API.

(I understand that is solving a different problem than the one I was originally commenting on here.)

In terms of this problem, are you saying that we can neither

a) have an abstract analysis set enumerate its passes, nor can we
b) have an analysis enumerate its abstract analysis sets,

because the only layer that knows about all of the relevant sets and passes does not declare the sets or the passes?  If so this seems remarkably fragile, to the point that I would want to step back and consider whether the layering we've imposed is actually helpful -- that is, whether the design space is overconstrained.

TBH I am pretty concerned that nobody other than you and Sean is going to be smart enough to program this correctly.  For example, writing

   !PA.preserved<CGSCCAnalysisManagerModuleProxy, AllAnalysesOn<Module>>()

requires global knowledge of LLVM that the only analysis set that covers CGSCCAnalysisManagerModuleProxy is AllAnalysesOn<Module>.  (Or it somehow requires even more arcane knowledge that AllAnalysesOn<Module> is the only set that we need to enumerate here.)  If you get it wrong, things will mostly work, until they don't, so these are not going to be easy bugs to find.

My understanding is that a lot of the design complexity is motivated by a desire to allow outside-tree users to provide new kinds of PMs.  That's a laudable goal, but on average I would expect out-of-tree users to have less knowledge of LLVM core than your average core developer, so such an API is really only useful if it's hard to screw up.  If we can't make it hard for them to do the wrong thing, and if providing this loose-coupling mechanism adds substantial complexity to our internal design, I am personally not convinced we are making the right design tradeoffs.

================
Comment at: unittests/IR/PassManagerTest.cpp:431

+/// A test analysis pass which chaches in its result the result from the above
+/// indirect analysis pass.
----------------
chandlerc wrote:
> jlebar wrote:
> > silvas wrote:
> > > nit: avoid the terminology "analysis pass". In the new PM analyses and transformations are separate concepts. The term "pass" doesn't help because it conflates the two (and many uses in the code use "pass" to really mean "transformation", so "analysis pass" is particularly confusing and old-PM'ish).
> > > 
> > > Hopefully some day we can rename things to be more consistent. Really the new "PM" has just two main things: an `AnalysisCache` class and a bunch of composable `TransformationRunner`'s. There isn't a conflated concept of "pass" (which can be either a transformation or an analysis) like in the old PM.
> > +1 to that in principle, although if we actually carry that out, we're going to have to do a big refactoring, so until then my personal preference would be that we should just say whatever is clear, rather than adding in the new terminology in places where "pass" would, in the current state, be more clear.
> > 
> > Not begging the specific question of what to say here.
> I don't actually think of it this way.
> 
> I think there is a common underlying idea of a pass, and there are two primary special cases: analysis passes and transformation passes.
> 
> I understand that many (most?) analysis passes tend to be trivial and we instead focus on the analysis result and caching it, but I don't want to neglect the fact that it is a pass that gets run over the IR. In that sense, the `AnalysisManager` *is* a `pass manager as well.
> 
> Anyways, I'm happy to spend some time debating this long term, but I'm not sure it's the right focus of this code review....
>  I'm not sure it's the right focus of this code review....

I tend to lose state on anything I'm not working on in a week.  If you start the discussion in the forum of your choosing Monday, great.  If you start a thread in two or three weeks, you will probably still have state, but I may not, and then I will either decide I no longer care, or have to come back and page all this back in.  Either way is not fun.

In fact I now vaguely recall that we had an outstanding question from a previous patch that we said we'd discuss outside the review.  Maybe we did come back and it's been resolved?  I am sort of a goldfish.

Because I've now been thinking about it, let me just say what I have in mind so we can capture it somewhere.  I hope that's OK.

> the AnalysisManager *is* a `pass manager as well.

I would ask a slightly different question.  Instead of "is AM technically a PM?", I'd ask, "is it *useful* to an engineer of average skill to think of AM as a PM, and to think of Analyses as Passes?"

One can even have comments on AM explaining this technicality if it's useful to understand when thinking about the PM/AM framework.  But then, is this happenstance of abstraction *so fundamental* that we should also use it everywhere else?  Maybe I haven't fully understood the code because I don't grok that an analysis is just a monoid in the category of endofunctors^W^W^W^W^W^Wpass.  :)

Personally I think "pass" is a useful word because "optimization pass" was a term I knew before I started working on compilers.  But like Sean I am not sure it helps me more than it hurts to think of analyses as the same sort of thing.

https://reviews.llvm.org/D27198