[llvm-commits] [PATCH] Fix a miscompile in memory dependency analysis
Nick Lewycky
nicholas at mxc.ca
Wed Dec 19 01:35:39 PST 2012
Manman Ren wrote:
>
> Given input:
> while.body:
> call void @RunInMode(i32 100) nounwind
> %1 = load i32* %tasksIdle, align 4
> %tobool = icmp eq i32 %1, 0
> br i1 %tobool, label %while.cond.backedge, label %if.then
>
> if.then:
> store i32 0, i32* %tasksIdle, align 4
> call void @TimerCreate(i32* %shouldExit) nounwind
> br label %while.cond.backedge
>
> while.cond.backedge:
> %2 = load i32* %shouldExit, align 4
> %cmp = icmp eq i32 %2, 0
> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>
> Make sure we are not returning "call RunInMode" does not modify %shouldExit.
> Currently, we optimize this into
> while.cond.backedge: ; preds = %if.then, %while.body
> %1 = phi i32 [ %.pre, %if.then ], [ 0, %while.body ]
> %cmp = icmp eq i32 %1, 0
> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>
> The proper fix is to check all uses of %shouldExit as long as there is a possible
> path from the use to "call RunInMode". The proposed patch uses LoopInfo to approximate the reachability analysis.
> Suggestions on how to approximate this are welcome.
>
> Please review, Thanks,
Thanks for working on this.
I think you should factor out logic for "is Inst2 reachable from Inst1"
and put it somewhere reusable (BasicBlockUtils?). Making it take
DominatorTree* and LoopInfo* is the right idea, but we may not have
those available. If the method requires one of those, it should take it
as a reference instead of a pointer.
+ AU.addRequired<LoopInfo>();
Sorry, I don't think we can make MemoryDependenceAnalysis depend on
LoopInfo.
What we can do is check if it just happens to be available, due to the
pass ordering we happen to be running in. If it is,
getAnalysisIfAvailable<LoopInfo>() will return the pointer and we can
use it to speed up the search, or NULL.
I spent a while thinking about whether LoopInfo ever actually helps us
prune the search, both in the case where we have a dominator tree
available and when we don't. I decided it does (in both cases), and I
wanted to quickly jot down what I think the algorithm is:
We're searching for whether there exists any path from A to B.
- if we have loop info and not dominator trees, set B = B's loop
header (this is an optimization only).
- push A on worklist.
- pop A' off the worklist and
- if A' and B are not both in the same loop, then A' is a dead end.
Skip to the next item on the worklist.
- if A' dominates B, we have found a path. Stop.
- push all of A''s successors onto the worklist, except the ones
we've visited before.
- if the worklist is empty, there is no path. Stop.
Note that "in the same loop" can't be implemented by a simple
getLoopFor(A) == getLoopFor(B), because of nested loops. The test we
really want is "does there exist a Loop* that contains both blocks?".
If the dominator tree is not available then the test for "A' dominates
B" becomes "A' == B" with no change in precision. If the loop
information is not available, then we skip that test, again with loss of
precision. We should have some sort of visited-blocks limit to make sure
this doesn't take too much compile time.
Nick
More information about the llvm-commits
mailing list