[PATCH] hasPath

Chandler Carruth chandlerc at gmail.com
Wed Jul 24 17:55:21 PDT 2013


  Reviewed it pretty thoroughly now from the API side and the core analysis logic side. Sorry for the nit picks, just commented as stuff caught my eye. The substantive comments are there as well. Anyways, detailed comments inline.

  One set of test cases I'd like to see are cases where the tricks to avoid successor walks allow us to scale up better. IE, lots of tiny blocks that we skip w/ loop info (or maybe domtree).

  The comments are a mixture of "please fix this" and questions or ideas for future stuff that I think would be good to do. Once the first group are addressed, I'm happy for you to commit whenever and we can continue the discussion here or in whatever forum works best. I'm not worried about seeing the fixes for the things that are obvious, and the things that aren't obvious seem likely to end up in their own separate patch anyways. So, with changes requested, LGTM.


================
Comment at: include/llvm/Analysis/CFG.h:36
@@ +35,3 @@
+void FindFunctionBackedges(const Function &F,
+      SmallVectorImpl<std::pair<const BasicBlock*,const BasicBlock*> > &Result);
+
----------------
Want to reformat the code you're moving in this patch? Your call, I know this is a move.

================
Comment at: include/llvm/Analysis/CFG.h:52
@@ +51,3 @@
+/// Returns false only if we can prove that once 'From' has been executed then
+/// 'To' can not be executed. Conservatively returns true.
+bool isPotentiallyReachable(const Instruction *From, const Instruction *To,
----------------
I think it would be useful to give some guidance about the performance (big-O complexity based) that users should expect to help guide callers.

I would also add a brief one-line comment.

================
Comment at: include/llvm/Analysis/CFG.h:53
@@ +52,3 @@
+/// 'To' can not be executed. Conservatively returns true.
+bool isPotentiallyReachable(const Instruction *From, const Instruction *To,
+                            DominatorTree *DT = 0, LoopInfo *LI = 0);
----------------
Would it be useful to include a convenience overload that works in terms of BBs?

Also, why not references to instructions? Can these usefully be null?

================
Comment at: lib/Analysis/CFG.cpp:117-126
@@ +116,12 @@
+
+// LoopInfo contains a mapping from basic block to the innermost loop. Find
+// the outermost loop in the loop nest that contains BB.
+static const Loop *getOutermostLoop(LoopInfo *LI, const BasicBlock *BB) {
+  const Loop *L = LI->getLoopFor(BB);
+  if (L) {
+    while (const Loop *Parent = L->getParentLoop())
+      L = Parent;
+  }
+  return L;
+}
+
----------------
Not for this change, but I would like to see this added to LoopInfo's interface so others can find it rather than re-inventing it.

================
Comment at: lib/Analysis/CFG.cpp:128-134
@@ +127,9 @@
+
+// True if there is a loop which contains both BB1 and BB2.
+static bool loopContainsBoth(LoopInfo *LI,
+                             const BasicBlock *BB1, const BasicBlock *BB2) {
+  const Loop *L1 = getOutermostLoop(LI, BB1);
+  const Loop *L2 = getOutermostLoop(LI, BB2);
+  return L1 != NULL && L1 == L2;
+}
+
----------------
This also seems useful in LoopInfo. I would also potentially phrase it there in a more general form as 'loopContains(...)' which accepts 1, 2, or maybe more basic block (or instruction) arguments... But maybe this is the only case that comes up.

================
Comment at: lib/Analysis/CFG.cpp:136-138
@@ +135,5 @@
+
+/// Determine whether there is a path from From to To within a single function.
+/// Returns false only if we can prove that once 'From' has been executed then
+/// 'To' can not be executed. Conservatively returns true.
+bool llvm::isPotentiallyReachable(const Instruction *A, const Instruction *B,
----------------
Don't duplicate the doxygen comment...

================
Comment at: lib/Analysis/CFG.cpp:145-148
@@ +144,6 @@
+  if (A->getParent() == B->getParent()) {
+    // The same block case is special because it's the only time we're looking
+    // within a single block to see which comes first. Once we start looking at
+    // multiple blocks, the first instruction of the block is reachable, so we
+    // only need to determine reachability between whole blocks.
+
----------------
I would hoist this into a static helper function which can then carry this comment.

================
Comment at: lib/Analysis/CFG.cpp:151-159
@@ +150,11 @@
+    const BasicBlock *BB = A->getParent();
+    // Can't be in a loop if it's the entry block -- the entry block may not
+    // have predecessors.
+    bool HasLoop = BB != &BB->getParent()->getEntryBlock();
+
+    // Can't be in a loop if LoopInfo doesn't know about it.
+    if (LI && HasLoop) {
+      HasLoop = LI->getLoopFor(BB) != 0;
+    }
+    if (HasLoop) return true;
+
----------------
I think I would rewrite this whole thing as this to make it a bit more clear what's going on... Not sure how you feel about it though.

    // We can't handle loops. As a consequence, we can only reason about the entry block
    // without loop info.
    if ((LI && LI->getLoopFor(BB) != 0) ||
        (!LI && BB != &BB->getParent()->getEntryBlock())
      return true;



================
Comment at: lib/Analysis/CFG.cpp:161-163
@@ +160,5 @@
+
+    // Linear scan, start at 'A', see whether we hit 'B' or the end first.
+    for (BasicBlock::const_iterator BBI = A, BBE = BB->end(); BBI != BBE; ++BBI)
+      if (&*BBI == B) return true;
+    return false;
----------------
mem2reg keeps a cache of instruction indices within a block to make repeated queries for the same basic block fast -- do you envision this being useful for callers of this API?

================
Comment at: lib/Analysis/CFG.cpp:172
@@ +171,3 @@
+
+  unsigned Limit = 32;
+  SmallSet<const BasicBlock*, 64> EverQueued;
----------------
Comment why the limit is 32?

Make this a constant?

Make this a command line flag for debugging?

================
Comment at: lib/Analysis/CFG.cpp:179-182
@@ +178,6 @@
+
+  // When the stop block is unreachable, it's dominated from everywhere,
+  // regardless of whether there's a path between the two blocks.
+  if (DT && !DT->isReachableFromEntry(StopBB))
+    DT = NULL;
+
----------------
Do this before setting up the worklist. It makes more sense with the other special cases. Also, s/NULL/0/

================
Comment at: lib/Analysis/CFG.cpp:200
@@ +199,3 @@
+      SmallVector<BasicBlock*, 32> Exits;
+      Outer->getExitBlocks(Exits);
+      for (unsigned i = 0, e = Exits.size(); i != e; ++i) {
----------------
Does this have append semantics? If not, give it append semantics?

Then you can append onto your worklist and do the equivalent of std::remove to skip the visited blocks.

If not, at least hoist this vector out of the loop and use clear to reset it. I suspect it can also benefit from a smaller base size than '32'.

================
Comment at: lib/Analysis/CFG.cpp:173-174
@@ +172,4 @@
+  unsigned Limit = 32;
+  SmallSet<const BasicBlock*, 64> EverQueued;
+  SmallVector<const BasicBlock*, 32> WorkList;
+  WorkList.push_back(A->getParent());
----------------
I much prefer 'Visited' to 'EverQueued'.

I also find it more conventional and easier to read 'Worklist' as a single word.

================
Comment at: lib/Analysis/CFG.cpp:205-211
@@ +204,9 @@
+      }
+    } else {
+      for (succ_const_iterator I = succ_begin(BB), E = succ_end(BB); I != E;
+           ++I) {
+        if (EverQueued.insert(*I))
+          WorkList.push_back(*I);
+      }
+    }
+  } while (!WorkList.empty());
----------------
Could we do a similar trick with the domtree as the loopinfo?

Specifically, could we immediately sink to the leaves of the blocks domtree? That would allow us to skip over arbitrarily long chains of basic blocks rather than walking each one.

Clearly, adding this could go in a follow-up patch.

================
Comment at: lib/Analysis/CFG.cpp:193
@@ +192,3 @@
+
+    if (!--Limit) return true;
+
----------------
Comment clearly that this is the conservative exit from the loop in the case where we don't know anything.

Also, I would put the return on its own line for consistency.

================
Comment at: lib/Analysis/CFG.cpp:213
@@ +212,3 @@
+  } while (!WorkList.empty());
+  return false;
+}
----------------
Comment that the reason we  say 'false' here is that we have exhausted the possible paths, and thus know that it is in fact unreachable.


http://llvm-reviews.chandlerc.com/D996



More information about the llvm-commits mailing list