[llvm-commits] [PATCH] Fix a miscompile in memory dependency analysis

Wed Dec 19 01:35:39 PST 2012

Manman Ren wrote:
>
> Given input:
> while.body:
>   call void @RunInMode(i32 100) nounwind
>   %1 = load i32* %tasksIdle, align 4
>   %tobool = icmp eq i32 %1, 0
>   br i1 %tobool, label %while.cond.backedge, label %if.then
>
> if.then:
>    store i32 0, i32* %tasksIdle, align 4
>    call void @TimerCreate(i32* %shouldExit) nounwind
>    br label %while.cond.backedge
>
> while.cond.backedge:
>   %2 = load i32* %shouldExit, align 4
>   %cmp = icmp eq i32 %2, 0
>   br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>
> Make sure we are not returning "call RunInMode" does not modify %shouldExit.
> Currently, we optimize this into
> while.cond.backedge:                              ; preds = %if.then, %while.body
>   %1 = phi i32 [ %.pre, %if.then ], [ 0, %while.body ]
>   %cmp = icmp eq i32 %1, 0
>   br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>
> The proper fix is to check all uses of %shouldExit as long as there is a possible
> path from the use to "call RunInMode". The proposed patch uses LoopInfo to approximate the reachability analysis.
> Suggestions on how to approximate this are welcome.
>
> Please review, Thanks,

Thanks for working on this.

I think you should factor out logic for "is Inst2 reachable from Inst1" 
and put it somewhere reusable (BasicBlockUtils?). Making it take 
DominatorTree* and LoopInfo* is the right idea, but we may not have 
those available. If the method requires one of those, it should take it 
as a reference instead of a pointer.

+  AU.addRequired<LoopInfo>();

Sorry, I don't think we can make MemoryDependenceAnalysis depend on 
LoopInfo.

What we can do is check if it just happens to be available, due to the 
pass ordering we happen to be running in. If it is, 
getAnalysisIfAvailable<LoopInfo>() will return the pointer and we can 
use it to speed up the search, or NULL.

I spent a while thinking about whether LoopInfo ever actually helps us 
prune the search, both in the case where we have a dominator tree 
available and when we don't. I decided it does (in both cases), and I 
wanted to quickly jot down what I think the algorithm is:

We're searching for whether there exists any path from A to B.
  - if we have loop info and not dominator trees, set B = B's loop 
header (this is an optimization only).
  - push A on worklist.
  - pop A' off the worklist and
    - if A' and B are not both in the same loop, then A' is a dead end. 
Skip to the next item on the worklist.
    - if A' dominates B, we have found a path. Stop.
    - push all of A''s successors onto the worklist, except the ones 
we've visited before.
  - if the worklist is empty, there is no path. Stop.

Note that "in the same loop" can't be implemented by a simple 
getLoopFor(A) == getLoopFor(B), because of nested loops. The test we 
really want is "does there exist a Loop* that contains both blocks?".

If the dominator tree is not available then the test for "A' dominates 
B" becomes "A' == B" with no change in precision. If the loop 
information is not available, then we skip that test, again with loss of 
precision. We should have some sort of visited-blocks limit to make sure 
this doesn't take too much compile time.

Nick