[llvm-commits] [PATCH] Fix a miscompile in memory dependency analysis

Thu Dec 20 08:03:08 PST 2012

Ping

On Dec 19, 2012, at 10:05 AM, Manman Ren <mren at apple.com> wrote:

> 
> Hi Nick,
> 
> Thanks for having the detailed algorithm.
> According to Chandler (and I agree), 
> 
>   We can't rely on LoopInfo to describe all backedges as LoopInfo doesn't represent unnatural loops -- essentially it isn't guaranteed to find all cycles.
> 
> I now have a simple implementation of a very conservative reachablity analysis:
> +  // Conservatively return true. Return false, if there is a single path
> +  // starting from "From" and the path does not reach "To".
> +  static bool hasPath(const BasicBlock *From, const BasicBlock *To) {
> +    const unsigned MaxCheck = 5;
> +    const BasicBlock *Current = From;
> +    for (unsigned I = 0; I < MaxCheck; I++) {
> +      unsigned NumSuccs = Current->getTerminator()->getNumSuccessors();
> +      if (NumSuccs > 1)
> +        return true;
> +      if (NumSuccs == 0)
> +        return false;
> +      Current = Current->getTerminator()->getSuccessor(0);
> +      if (Current == To)
> +        return true;
> +    }
> +    return true;
> +  }
> 
> And I will file a PR to request a more advanced reachability analysis.
> Please review the attached patch,
> 
> Thanks,
> Manman
> 
> <mem_dep_2.patch>
> 
> On Dec 19, 2012, at 2:07 AM, Nick Lewycky wrote:
> 
>> Nick Lewycky wrote:
>>> Manman Ren wrote:
>>>> 
>>>> Given input:
>>>> while.body:
>>>> call void @RunInMode(i32 100) nounwind
>>>> %1 = load i32* %tasksIdle, align 4
>>>> %tobool = icmp eq i32 %1, 0
>>>> br i1 %tobool, label %while.cond.backedge, label %if.then
>>>> 
>>>> if.then:
>>>> store i32 0, i32* %tasksIdle, align 4
>>>> call void @TimerCreate(i32* %shouldExit) nounwind
>>>> br label %while.cond.backedge
>>>> 
>>>> while.cond.backedge:
>>>> %2 = load i32* %shouldExit, align 4
>>>> %cmp = icmp eq i32 %2, 0
>>>> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>>>> 
>>>> Make sure we are not returning "call RunInMode" does not modify
>>>> %shouldExit.
>>>> Currently, we optimize this into
>>>> while.cond.backedge: ; preds = %if.then, %while.body
>>>> %1 = phi i32 [ %.pre, %if.then ], [ 0, %while.body ]
>>>> %cmp = icmp eq i32 %1, 0
>>>> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>>>> 
>>>> The proper fix is to check all uses of %shouldExit as long as there is
>>>> a possible
>>>> path from the use to "call RunInMode". The proposed patch uses
>>>> LoopInfo to approximate the reachability analysis.
>>>> Suggestions on how to approximate this are welcome.
>>>> 
>>>> Please review, Thanks,
>>> 
>>> Thanks for working on this.
>>> 
>>> I think you should factor out logic for "is Inst2 reachable from Inst1"
>>> and put it somewhere reusable (BasicBlockUtils?). Making it take
>>> DominatorTree* and LoopInfo* is the right idea, but we may not have
>>> those available. If the method requires one of those, it should take it
>>> as a reference instead of a pointer.
>>> 
>>> + AU.addRequired<LoopInfo>();
>>> 
>>> Sorry, I don't think we can make MemoryDependenceAnalysis depend on
>>> LoopInfo.
>>> 
>>> What we can do is check if it just happens to be available, due to the
>>> pass ordering we happen to be running in. If it is,
>>> getAnalysisIfAvailable<LoopInfo>() will return the pointer and we can
>>> use it to speed up the search, or NULL.
>>> 
>>> I spent a while thinking about whether LoopInfo ever actually helps us
>>> prune the search, both in the case where we have a dominator tree
>>> available and when we don't. I decided it does (in both cases), and I
>>> wanted to quickly jot down what I think the algorithm is:
>>> 
>>> We're searching for whether there exists any path from A to B.
>>> - if we have loop info and not dominator trees, set B = B's loop header
>>> (this is an optimization only).
>>> - push A on worklist.
>>> - pop A' off the worklist and
>>> - if A' and B are not both in the same loop, then A' is a dead end. Skip
>>> to the next item on the worklist.
>>> - if A' dominates B, we have found a path. Stop.
>>> - push all of A''s successors onto the worklist, except the ones we've
>>> visited before.
>>> - if the worklist is empty, there is no path. Stop.
>> 
>> Ignore this. It's both subtly wrong and rather inefficient.
>> 
>> - push A on the worklist.
>> - pop A' off the worklist and
>>  - if there is a loop that contains both A' and B, then there is a path. Stop.
>>  - if A' dominates B, there is a path. Stop.
>>  - if A' is in a loop (and we have loop info), add the outermost loop's exit blocks to the worklist; else (if A' is not in a loop) add A''s successors to the worklist.
>> - if the worklist is empty, there is no path. Stop.
>> 
>> Nick
>> 
>>> Note that "in the same loop" can't be implemented by a simple
>>> getLoopFor(A) == getLoopFor(B), because of nested loops. The test we
>>> really want is "does there exist a Loop* that contains both blocks?".
>>> 
>>> If the dominator tree is not available then the test for "A' dominates
>>> B" becomes "A' == B" with no change in precision. If the loop
>>> information is not available, then we skip that test, again with loss of
>>> precision. We should have some sort of visited-blocks limit to make sure
>>> this doesn't take too much compile time.
>>> 
>>> Nick
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121220/2eeaffc3/attachment.html>