[PATCH] D21921: [proof of concept] Port old PM CGSCC visitation logic to new PM

Thu Jul 7 19:08:11 PDT 2016

chandlerc added a comment.

Mostly a high-level comment below on the core of this patch's approach.


================
Comment at: include/llvm/Analysis/CGSCCPassManager.h:111-123
@@ -100,18 +110,15 @@
     PreservedAnalyses PA = PreservedAnalyses::all();
-    for (LazyCallGraph::RefSCC &RC : CG.postorder_ref_sccs()) {
-      if (DebugLogging)
-        dbgs() << "Running an SCC pass across the RefSCC: " << RC << "\n";
+    std::vector<std::unique_ptr<CallGraphSCC>> SCCs;
+    while (!CGI.isAtEnd()) {
+      const std::vector<CallGraphNode *> &NodeVec = *CGI;
+      SCCs.emplace_back(llvm::make_unique<CallGraphSCC>(CG, &CGI, NodeVec));
+      // Copy the current SCC and increment past it so that the pass can hack
+      // on the SCC if it wants to without invalidating our iterator.
+      ++CGI;
+      CallGraphSCC &CurSCC = *SCCs.back();
+      for (int i = 0; i < (int)MaxCGSCCIterations; i++) {
 
-      for (LazyCallGraph::SCC &C : RC) {
-        PreservedAnalyses PassPA = Pass.run(C, CGAM);
+        PreservedAnalyses PassPA = Pass.run(CurSCC, CGAM);
 
-        // We know that the CGSCC pass couldn't have invalidated any other
-        // SCC's analyses (that's the contract of a CGSCC pass), so
-        // directly handle the CGSCC analysis manager's invalidation here. We
-        // also update the preserved set of analyses to reflect that invalidated
-        // analyses are now safe to preserve.
-        // FIXME: This isn't quite correct. We need to handle the case where the
-        // pass updated the CG, particularly some child of the current SCC, and
-        // invalidate its analyses.
-        PassPA = CGAM.invalidate(C, std::move(PassPA));
+        PassPA = CGAM.invalidate(CurSCC, std::move(PassPA));
 
----------------
This is the core of this change, and it highlights the problem I have with this design. But I want to emphasize something: I freely admit that you can fix the immediate failure modes. I'll even suggest how you could do that. I still think that it is a problematic design though, and hopefully this helps surface why.

So, the key thing here is that you are building "stable" SCCs with identity on the fly by allocating CallGraphSCC objects in the vector of unique pointers. You then use their identity to thread the analysis management layers. Notably, you use it to access the FunctionAnalysisManager via the CGSCCAnalysisManager via the ModuleAnalysis manager.

Both the lookup below of the FunctionAnalysisManager's proxy and the invalidate here use this identity.

However, if you were to run multiple CGSCC pass managers one after the other, they would build different SCCs each time. There is no way to persist this, at the end of the CGSCC pass manager run, the SCCs are going away.

To make this code correct, you should probably use the 'clear' method on the CGSCCAnalysisManager at the end of the CGSCC pass manager run. I think there are some bugs in the current pass manager code you'll hit if you do that. We can write some unit tests that hit these (sorry, I missed them when originally working on it) and fix them. Once fixed, that should make this approach correct AFAICT. However, I don't like some of the consequences.

If we fix the bugs in the 'clear' code path in the new pass manager, this will have the effect of invalidating all function analyses computed thus far (assuming it follows the current design direction of the pass manager thus far). And one of the really big goals of the new pass manager design was to reduce the re-computing of analyses through caching.

The way I see to continue to get caching for function analyses with this approach would be to avoid doing downward invalidation[1]. That is, when an SCC pass invalidates SCC analyses, don't have that trigger invalidation of function analyses. We could probably make that direction work, but it would mean manually tracking function analysis invalidation here and in the module pass manager rather than relying on the wiring between the proxies to do this for us.

So that is what I think it would look like to make the approach in this patch at least functional for the passes as they exist in the tree today.


Beyond that, as Sean has said, this doesn't really have a good mechanism to support SCC-based analyses. They will currently work, but they will be cached against an SCC that isn't being updated, and they won't survive the CGSCC pass manager run or be queryable. Notably, if we want an SCC analysis, we won't be able to query it from a module pass without duplicating the logic here to run the SCC iterator over the module and produce SCCs that we can then query the CGAM with.

Now, we have no need of that today. But I think it is a mistake to design in a different limitation like this.


I understand that your goal in this Sean is to make incremental progress. However, I am personally much more interested in figuring out the long term design for the CGSCC passes and the new pass manager. I think it will create churn to maintain all of the different permutations, as I think even with this approach we will have significant differences and challenges with the new pass manager. And so I think we will end up supporting both the legacy pass manager, and this one, *and* the one with stable SCCs that provides a more direct solution to the analysis challenges outlined above.



[1]: "downward invalidation" is something I had planned to add, but hadn't gotten to yet. It clearly would benefit from being visible in the code though, so I'm going to work on a short patch that adds it and unit tests that show how it would work. Naturally, we can always go a different direction if needed.


http://reviews.llvm.org/D21921