patch: reimplement hasPath, also split BasicBlockUtils into transforms and analysis

Thu Jun 13 01:15:32 PDT 2013

On Mon, Jun 10, 2013 at 5:24 PM, Nick Lewycky <nlewycky at google.com> wrote:

> This patch reimplements the "hasPath" utility function currently living in
> AliasAnalysis.cpp. There's some discussion of this algorithm and some
> context back in this thread from December last year:
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121217/159680.html
>
> I also refactor it into a new home in lib/Analysis, called CFG.cpp. This
> is created by breaking apart lib/Transforms/Utils/BasicBlockUtils.cpp and
> moving everything which is there but actually only does analysis into
> CFG.cpp.
>
> Please review!
>

Mind putting the next version into Phab? Much easier to review there.

+/// FindFunctionBackedges - Analyze the specified function to find all of
the
+/// loop backedges in the function and return them.  This is a relatively
cheap
+/// (compared to computing dominators and loop info) analysis.

Throughout this patch, please switch to the modern form of Doxygen comments.

+///
+/// The output is added to Result, as pairs of <from,to> edge info.
+void FindFunctionBackedges(const Function &F,
+      SmallVectorImpl<std::pair<const BasicBlock*,const BasicBlock*> >
&Result);

Do you promise to have append semantics? If so, is that useful?

If not, why not return? Is this just hiding the N of the small vector, and
is that a good enoguh reason to use an output parameter?

Also, throughout this patch please switch to modern naming conventions for
functions.

Should this really be a public interface at all? Are there users other than
hasPath?

+
+/// GetSuccessorNumber - Search for the specified successor of basic block
BB
+/// and return its position in the terminator instruction's list of
+/// successors.  It is an error to call this with a block that is not a
+/// successor.
+unsigned GetSuccessorNumber(BasicBlock *BB, BasicBlock *Succ);

This seems like it should stay in BBUtils... It has little to do with the
CFG itself. Personally, I don't think this should be in *any* public
interface unless we intend to put it into the TerminatorInst's public
interface or BasicBlock's public interface.

+
+/// isCriticalEdge - Return true if the specified edge is a critical edge.
+/// Critical edges are edges from a block with multiple successors to a
block
+/// with multiple predecessors.
+///
+bool isCriticalEdge(const TerminatorInst *TI, unsigned SuccNum,
+                    bool AllowIdenticalEdges = false);

Why not accept a BasicBlock* for the successor? Why not accept a pair of
BasicBlock*'s much like you store in the result vector for finding
backedges?

+
+/// Determine whether there is a path from From to To within a single
function.
+/// Returns false only if we can prove that once 'From' has been executed
then
+/// 'To' can not be executed. Conservatively returns true.
+bool hasPath(const Instruction *From, const Instruction *To,
+             DominatorTree *DT = 0, LoopInfo *LI = 0);

The comment and the name 'hasPath' don't seem to match? I read the comment
as meaning 'mayHavePath' or 'isPotentiallyReachable' or some such. Maybe
you're trying to say that the definition of 'path' is exactly what we will
be able to statically prove? This isn't entirely clear to me.

+namespace {
+
+// This fixture assists in running the hasPath utility four ways and
ensuring
+// it's correct all ways
+class HasPathTest : public testing::Test {
+protected:
+  void ParseAssembly(StringRef Assembly) {
+    M.reset(new Module("Module", getGlobalContext()));
+
+    SMDiagnostic Error;
+    bool Parsed = ParseAssemblyString(Assembly.str().c_str(), M.get(),
Error,
+                                      M->getContext()) == M.get();
+    std::string errMsg;
+    raw_string_ostream os(errMsg);
+    Error.print("", os);
+    // Assert, because a bug here means that the test itself is buggy.
+    ASSERT_TRUE(Parsed) << os.str();
+
+    Function *F = M->getFunction("test");
+    ASSERT_TRUE(F != NULL) << "Test must have a function named @test: " <<
M;
+
+    A = B = NULL;
+    for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) {
+      if (I->hasName()) {
+        if (I->getName() == "A")
+          A = &*I;
+        else if (I->getName() == "B")
+          B = &*I;
+      }
+    }
+    ASSERT_TRUE(A != NULL) << "@test must have an instruction %A: " << *F;
+    ASSERT_TRUE(B != NULL) << "@test must have an instruction %B: " << *F;
+  }
+
+  void ExpectPath(bool ExpectedResult) {
+    static char ID;
+    class HasPathTestPass : public FunctionPass {
+     public:
+      HasPathTestPass(bool ExpectedResult, Instruction *A, Instruction *B)
+          : FunctionPass(ID), ExpectedResult(ExpectedResult), A(A), B(B) {}
+
+      void getAnalysisUsage(AnalysisUsage &AU) const {
+        AU.setPreservesAll();
+        AU.addRequired<LoopInfo>();
+        AU.addRequired<DominatorTree>();
+      }
+
+      bool runOnFunction(Function &F) {
+        if (!F.hasName() || F.getName() != "test")
+          return false;
+
+        LoopInfo *LI = &getAnalysis<LoopInfo>();
+        DominatorTree *DT = &getAnalysis<DominatorTree>();
+        EXPECT_EQ(hasPath(A, B, 0, 0), ExpectedResult);
+        EXPECT_EQ(hasPath(A, B, DT, 0), ExpectedResult);
+        EXPECT_EQ(hasPath(A, B, 0, LI), ExpectedResult);
+        EXPECT_EQ(hasPath(A, B, DT, LI), ExpectedResult);
+        return false;
+      }
+      bool ExpectedResult;
+      Instruction *A, *B;
+    };
+
+    HasPathTestPass *P = new HasPathTestPass(ExpectedResult, A, B);
+    PassManager PM;
+    PM.add(P);
+    PM.run(*M);
+  }
+private:
+  OwningPtr<Module> M;
+  Instruction *A, *B;
+};
+
+}
+
+TEST_F(HasPathTest, SameBlockNoPath) {
+  ParseAssembly(
+      "define void @test() {\n"
+      "entry:\n"
+      "  bitcast i8 undef to i8\n"
+      "  %B = bitcast i8 undef to i8\n"
+      "  bitcast i8 undef to i8\n"
+      "  bitcast i8 undef to i8\n"
+      "  %A = bitcast i8 undef to i8\n"
+      "  ret void\n"
+      "}\n");
+  ExpectPath(false);
+}

I'm not a huge fan of this structure of the testing machinery.

I would probably do it slightly differently:

Have a method in the fixture which builds a dummy pass and passmanager that
does nothing other than take a module member and extract the DomTree and
other analysis passes into other fixture members. Then you could build up
the IR in the module using an IRBuilder (and/or some variant of
ParseAssembly, although ParseAssembly should almost certainly be hoisted
into a generic IR unit testing helper library), run the pass manager to
extract analysis passes, and then have each test directly state its
EXPECT_...(...) lines against hasPath.

Does that make sense?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130613/737ab4df/attachment.html>