patch: reimplement hasPath, also split BasicBlockUtils into transforms and analysis

Mon Jun 17 16:47:51 PDT 2013

On 13 June 2013 01:15, Chandler Carruth <chandlerc at google.com> wrote:

> On Mon, Jun 10, 2013 at 5:24 PM, Nick Lewycky <nlewycky at google.com> wrote:
>
>> This patch reimplements the "hasPath" utility function currently living
>> in AliasAnalysis.cpp. There's some discussion of this algorithm and some
>> context back in this thread from December last year:
>>
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121217/159680.html
>>
>> I also refactor it into a new home in lib/Analysis, called CFG.cpp. This
>> is created by breaking apart lib/Transforms/Utils/BasicBlockUtils.cpp and
>> moving everything which is there but actually only does analysis into
>> CFG.cpp.
>>
>> Please review!
>>
>
> Mind putting the next version into Phab? Much easier to review there.
>

http://llvm-reviews.chandlerc.com/D996 . This is just the same patch I sent
out previously, no changes yet.

> +/// FindFunctionBackedges - Analyze the specified function to find all of
> the
> +/// loop backedges in the function and return them.  This is a relatively
> cheap
> +/// (compared to computing dominators and loop info) analysis.
>
> Throughout this patch, please switch to the modern form of Doxygen
> comments.
>

This is copied verbatim from BasicBlockUtils, but done anyways.

 +///
> +/// The output is added to Result, as pairs of <from,to> edge info.
> +void FindFunctionBackedges(const Function &F,
> +      SmallVectorImpl<std::pair<const BasicBlock*,const BasicBlock*> >
> &Result);
>
> Do you promise to have append semantics? If so, is that useful?
>
> If not, why not return? Is this just hiding the N of the small vector, and
> is that a good enoguh reason to use an output parameter?
>
> Also, throughout this patch please switch to modern naming conventions for
> functions.
>
> Should this really be a public interface at all? Are there users other
> than hasPath?
>

This is copied verbatim from BasicBlockUtils. hasPath doesn't even use it.
It looks like the only caller is JumpThreading today.

 +
> +/// GetSuccessorNumber - Search for the specified successor of basic
> block BB
> +/// and return its position in the terminator instruction's list of
> +/// successors.  It is an error to call this with a block that is not a
> +/// successor.
> +unsigned GetSuccessorNumber(BasicBlock *BB, BasicBlock *Succ);
>
> This seems like it should stay in BBUtils... It has little to do with the
> CFG itself. Personally, I don't think this should be in *any* public
> interface unless we intend to put it into the TerminatorInst's public
> interface or BasicBlock's public interface.
>

I was thinking the same thing -- this belongs in BasicBlock's public
interface. If you want, I can commit that move ahead of this patch?

I want this patch to be smaller. The only things I want in this patch are
a) new implementation of hasPath b) hasPath lives somewhere publicly
accessible. There wasn't a good place for hasPath to live, so I had to
create one, and for that to make sense, I had to move things into it.
Putting hasPath in BasicBlockUtils doesn't work because Analysis can't
depend on Transforms.

+
> +/// isCriticalEdge - Return true if the specified edge is a critical edge.
> +/// Critical edges are edges from a block with multiple successors to a
> block
> +/// with multiple predecessors.
> +///
> +bool isCriticalEdge(const TerminatorInst *TI, unsigned SuccNum,
> +                    bool AllowIdenticalEdges = false);
>
> Why not accept a BasicBlock* for the successor? Why not accept a pair of
> BasicBlock*'s much like you store in the result vector for finding
> backedges?
>

Again this is copied verbatim from BBUtils but to answer your question,
Terminator + successor-number identifies a single edge while
pair<BasicBlock*, BasicBlock*> identifies a set of edges between two
blocks. As it happens, this form is more convenient for the callers, all of
whom are deciding whether to break critical edges (except one using it in
an assert, but it also has a TI+SuccNum handy).

 +
> +/// Determine whether there is a path from From to To within a single
> function.
> +/// Returns false only if we can prove that once 'From' has been executed
> then
> +/// 'To' can not be executed. Conservatively returns true.
> +bool hasPath(const Instruction *From, const Instruction *To,
> +             DominatorTree *DT = 0, LoopInfo *LI = 0);
>
> The comment and the name 'hasPath' don't seem to match? I read the comment
> as meaning 'mayHavePath' or 'isPotentiallyReachable' or some such. Maybe
> you're trying to say that the definition of 'path' is exactly what we will
> be able to statically prove? This isn't entirely clear to me.
>

Thanks, 'isPotentiallyReachable' is a much better fit. "If we're at
instruction From, could we reach instruction To?" with a conservative
'true' answer.

 +namespace {
> +
> +// This fixture assists in running the hasPath utility four ways and
> ensuring
> +// it's correct all ways
> +class HasPathTest : public testing::Test {
> +protected:
> +  void ParseAssembly(StringRef Assembly) {
> +    M.reset(new Module("Module", getGlobalContext()));
> +
> +    SMDiagnostic Error;
> +    bool Parsed = ParseAssemblyString(Assembly.str().c_str(), M.get(),
> Error,
> +                                      M->getContext()) == M.get();
> +    std::string errMsg;
> +    raw_string_ostream os(errMsg);
> +    Error.print("", os);
> +    // Assert, because a bug here means that the test itself is buggy.
> +    ASSERT_TRUE(Parsed) << os.str();
> +
> +    Function *F = M->getFunction("test");
> +    ASSERT_TRUE(F != NULL) << "Test must have a function named @test: "
> << M;
> +
> +    A = B = NULL;
> +    for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) {
> +      if (I->hasName()) {
> +        if (I->getName() == "A")
> +          A = &*I;
> +        else if (I->getName() == "B")
> +          B = &*I;
> +      }
> +    }
> +    ASSERT_TRUE(A != NULL) << "@test must have an instruction %A: " << *F;
> +    ASSERT_TRUE(B != NULL) << "@test must have an instruction %B: " << *F;
> +  }
> +
> +  void ExpectPath(bool ExpectedResult) {
> +    static char ID;
> +    class HasPathTestPass : public FunctionPass {
> +     public:
> +      HasPathTestPass(bool ExpectedResult, Instruction *A, Instruction *B)
> +          : FunctionPass(ID), ExpectedResult(ExpectedResult), A(A), B(B)
> {}
> +
> +      void getAnalysisUsage(AnalysisUsage &AU) const {
> +        AU.setPreservesAll();
> +        AU.addRequired<LoopInfo>();
> +        AU.addRequired<DominatorTree>();
> +      }
> +
> +      bool runOnFunction(Function &F) {
> +        if (!F.hasName() || F.getName() != "test")
> +          return false;
> +
> +        LoopInfo *LI = &getAnalysis<LoopInfo>();
> +        DominatorTree *DT = &getAnalysis<DominatorTree>();
> +        EXPECT_EQ(hasPath(A, B, 0, 0), ExpectedResult);
> +        EXPECT_EQ(hasPath(A, B, DT, 0), ExpectedResult);
> +        EXPECT_EQ(hasPath(A, B, 0, LI), ExpectedResult);
> +        EXPECT_EQ(hasPath(A, B, DT, LI), ExpectedResult);
> +        return false;
> +      }
> +      bool ExpectedResult;
> +      Instruction *A, *B;
> +    };
> +
> +    HasPathTestPass *P = new HasPathTestPass(ExpectedResult, A, B);
> +    PassManager PM;
> +    PM.add(P);
> +    PM.run(*M);
> +  }
> +private:
> +  OwningPtr<Module> M;
> +  Instruction *A, *B;
> +};
> +
> +}
> +
> +TEST_F(HasPathTest, SameBlockNoPath) {
> +  ParseAssembly(
> +      "define void @test() {\n"
> +      "entry:\n"
> +      "  bitcast i8 undef to i8\n"
> +      "  %B = bitcast i8 undef to i8\n"
> +      "  bitcast i8 undef to i8\n"
> +      "  bitcast i8 undef to i8\n"
> +      "  %A = bitcast i8 undef to i8\n"
> +      "  ret void\n"
> +      "}\n");
> +  ExpectPath(false);
> +}
>
> I'm not a huge fan of this structure of the testing machinery.
>
> I would probably do it slightly differently:
>
> Have a method in the fixture which builds a dummy pass and passmanager
> that does nothing other than take a module member and extract the DomTree
> and other analysis passes into other fixture members. Then you could build
> up the IR in the module using an IRBuilder (and/or some variant of
> ParseAssembly, although ParseAssembly should almost certainly be hoisted
> into a generic IR unit testing helper library), run the pass manager to
> extract analysis passes, and then have each test directly state its
> EXPECT_...(...) lines against hasPath.
>
> Does that make sense?
>

Sadly, it isn't safe in general to extract a Pass and use it outside the
PassManager. For example, it's permitted to free internal data structures
upon doFinalization. I really do want the EXPECT_... calls in each test
function, and I couldn't figure out how to do it.

ParseAssembly absolutely deserves to be in a testing library, as it is I'm
replicating logic from unittests/ExecutionEngine/JIT/{Multi,}JITTest.cpp
and IR/DominatorTreeTest.cpp. Of course, we don't have one. There are
already three users of essentially the same functionality, let me see if I
can do something about that now.

Do you have any comments about the implementation of hasPath itself? It's
reasonably complex.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130617/58074750/attachment.html>