[PATCH] D24226: [PM] Provide an initial, minimal port of the inliner to the new pass manager.

Fri Dec 9 04:00:07 PST 2016

chandlerc added a comment.

With the latest update I've now addressed essentially all of the comments (that remained relevant). The patch is now, if anything, smaller and more focused too. =] Some responses to specific comments below.

================
Comment at: include/llvm/Analysis/CGSCCPassManager.h:527
+std::pair<LazyCallGraph::RefSCC *, LazyCallGraph::SCC *>
+updateCGAndAnalysisManagerForDeadEdge(
+    LazyCallGraph &G, LazyCallGraph::RefSCC *RC, LazyCallGraph::SCC *C,
----------------
sanjoy wrote:
> Please document the `RC` and `C` parameters-- it isn't obvious what they represent.  Are they `SourceRC` and `SourceC`?
> 
> Also, the return value needs to be documented.
This code is gone now.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:140-141
+    CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR, bool DebugLogging) {
+  typedef LazyCallGraph::SCC SCC;
+  typedef LazyCallGraph::RefSCC RefSCC;
+
----------------
Prazek wrote:
> why not using?
Consistency with other code, but all of this is gone now.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:163
+  if (!NewRefSCCs.empty()) {
+    // Note that we don't bother to invalidate analyses as ref-edge
+    // connectivity is not really observable in any way and is intended
----------------
sanjoy wrote:
> This should be mentioned on the declaration of `RefSCC` itself.
I'll fold this into the larger LCG documentation update ongoing. Also note that this existing comment just moved around (and is no longer moved after the updates to this patch).

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:182
+    assert(G.lookupRefSCC(SourceN) == RC && "Failed to update current RefSCC!");
+    for (RefSCC *NewRC : reverse(NewRefSCCs))
+      if (NewRC != RC) {
----------------
sanjoy wrote:
> I'd just assert that the first element of `NewRefSCCs` is `RC` and then skip the first element in the loop.
> 
> Checking `NewRC != RC` in every iteration may let a bug slip through undetected.
Done but in a separate cleanup CL.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:269
       for (Value *Op : I.operand_values())
         if (Constant *C = dyn_cast<Constant>(Op))
           if (Visited.insert(C).second)
----------------
Prazek wrote:
> I know that it is not your change, but auto
I will do a separate cleanup CL to add some use of auto in relevant places, but this doesn't belong in this change.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:393
+
+  if (N.begin() == N.end())
+    // Can't remove edges that don't exist!
----------------
sanjoy wrote:
> Add a `Node::empty()`?
This code is gone so I've not added this. I can though if it comes back up.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:421
+      if (Function *Callee = CS.getCalledFunction())
+        if (Visited.insert(Callee).second && !Callee->isDeclaration())
+          // Actually do the work here of erasing the callee from the dead
----------------
sanjoy wrote:
> What is the extra check on `Visited.insert(Callee).second` getting you here?  Why not just try to `DeadCallTargets.erase(Callee)` directly?
All this code is gone now.

Happy to revisit the walk approach in the other code though to either improve it or to change it to look like this walk looked. But not in the inliner patch. =]

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:429
+
+  // Now walk all references. We do this incrementally and with a fresh walk of
+  // instructions to allow us to end both of these walks early if we run out of
----------------
sanjoy wrote:
> I don't buy this reason for doing two walks.  :)
> 
> I think you should be able to do a combined walk and break out if `DeadRefTargets.empty() && DeadCallTargets.empty()`, and predicate the `DeadCallTargets` on `!DeadCallTargets.empty()` and corresponding for `DeadRefTargets`.  We'll have the overhead of repeated calls to `SmallPtrSet::empty()` but that sounds cheaper than walking all instructions twice.
> 
> However, walking twice looks simpler.  If you'd rather keep it this way to make things readable, I can buy that. :)
See above, all this is gone now. We can talk about whether the existing code should be simplified in a separate patch.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:432
+  // potentially dead edges.
+  for (Instruction &I : instructions(F)) {
+    for (Value *Op : I.operand_values())
----------------
Prazek wrote:
> auto?
I think the type helps readability here.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:433
+  for (Instruction &I : instructions(F)) {
+    for (Value *Op : I.operand_values())
+      if (Constant *C = dyn_cast<Constant>(Op))
----------------
Prazek wrote:
> maybe auto
And here.

================
Comment at: lib/Analysis/CGSCCPassManager.cpp:434
+    for (Value *Op : I.operand_values())
+      if (Constant *C = dyn_cast<Constant>(Op))
+        if (Visited.insert(C).second)
----------------
Prazek wrote:
> auto
But will upt auto here in some follow-up patc.h

================
Comment at: lib/Analysis/LazyCallGraph.cpp:28
                     LazyCallGraph::Edge::Kind EK) {
-  // Note that we consider *any* function with a definition to be a viable
-  // edge. Even if the function's definition is subject to replacement by
-  // some other module (say, a weak definition) there may still be
-  // optimizations which essentially speculate based on the definition and
-  // a way to check that the specific definition is in fact the one being
-  // used. For example, this could be done by moving the weak definition to
-  // a strong (internal) definition and making the weak definition be an
-  // alias. Then a test of the address of the weak function against the new
-  // strong definition's address would be an effective way to determine the
-  // safety of optimizing a direct call edge.
-  if (!F.isDeclaration() &&
-      EdgeIndexMap.insert({&F, Edges.size()}).second) {
-    DEBUG(dbgs() << "    Added callable function: " << F.getName() << "\n");
-    Edges.emplace_back(LazyCallGraph::Edge(F, EK));
-  }
+  if (!EdgeIndexMap.insert({&F, Edges.size()}).second)
+    return;
----------------
Prazek wrote:
> emplace/try_emplace?
The insert was already there. If there is some value to emplace (not sure there is) then we can go through and systematically use it, but I'd rather not blindly mix and match as code happens to be touched.

================
Comment at: lib/Transforms/IPO/Inliner.cpp:842
+      // instruction iterator and backing up one.
+      auto LastBBI = std::prev(F.end());
+
----------------
davidxl wrote:
> Where is this behavior documented? It seems very fragile to depend on implication details here -- there is not even a way to do assert.
It's not really documented at all, and it only actually holds for the *clone* API, not the InlineFunction API. See the discussion weath Easwaran below for details, but I'm no longer dealing with any of this.

================
Comment at: lib/Transforms/IPO/Inliner.cpp:848
+
+      // Now walk the basic blocks and instructions that were inlined and add
+      // any new calls to the worklist.
----------------
sanjoy wrote:
> eraman wrote:
> > If only one block is cloned into the caller, it is spliced into the block containing the call instruction. This logic to walk the newly added basic blocks does not work in that case.
> Looks like this one wasn't addressed.
This is a great comment Easwaran, and in fact there are more issues. I was thinking of the clone API when I wrote this, InlineFunction does *way* too much to let this work.

There were actually other issues here as well, and I've switched entirely to a simpler approach where we add all of the edges in the call graph for all of the inlined functions, and then prune out the ones based on simplifications at the end of this. For iterating in the inliner itself, I've added a tiny bit of code *inside* `InlineFunction` where we were already walking the cloned instructions to compute the list of inlined calls even without a call graph, and used that here. Hopefully this works better -- I even already had a test case that was hitting this and just hadn't looked at it carefully enough. 

================
Comment at: lib/Transforms/IPO/Inliner.cpp:851
+      RefVisited.clear();
+      for (BasicBlock &BB : make_range(std::next(LastBBI), F.end()))
+        for (Instruction &I : BB) {
----------------
davidxl wrote:
> It is quite unfortunate IR will need to be traversed again here -- there is compile time overhead here. Why can't IFI be used to get the newly exposed callsite?
I've actually switched to getting the newly exposed callsite inside `InlineFunction` in the latest version (see above for why). It still walks the newly inlined instructions, but there really isn't any other way and there shouldn't be much cost here (we walk the inlined instructions several times inside `InlineFunction` from what I can see).

================
Comment at: lib/Transforms/IPO/Inliner.cpp:861
+                // into the graph. This will even skip already present edges.
+                RC->insertTrivialCallEdge(N, *CG.lookup(*NewCallee));
+              }
----------------
sanjoy wrote:
> As mentioned in `D24226`, `InlineFunction` can insert non-trivial call edges (by promoting ref edges), causing SCCs to be formed.
Yep. I've added the test cases for this and it is now handled much better.

================
Comment at: lib/Transforms/IPO/Inliner.cpp:863
+              }
+          for (Value *Op : I.operand_values())
+            if (Constant *C = dyn_cast<Constant>(Op))
----------------
davidxl wrote:
> If a CGSCC pass has to go through hoops to get SCC graph update properly, I feel that something is wrong. Having multiple levels of SCC is one thing whose complexity can probably be hidden,  but requiring every pass to pay the penalty is a different story.
I'm not sure what the concern is here? This is even simpler now, but either way I think the Inliner is somewhat unusual in that it is *mutating* the call graph. Passes which do that will have to do work to correctly update SCCs.

================
Comment at: lib/Transforms/IPO/Inliner.cpp:894
+    if (!DidInline) {
+      Changed = true;
+      continue;
----------------
eraman wrote:
> Should this be
> 
> if (!DidInline)
>   continue;
> Changed = true;
> 
> ?
Uh, quite. =D Fixed!

https://reviews.llvm.org/D24226