[llvm-commits] [PATCH] Fix a miscompile in memory dependency analysis

Nick Lewycky nicholas at mxc.ca
Fri Dec 28 18:43:12 PST 2012


On 12/20/2012 08:03 AM, Manman Ren wrote:
> Ping
>
> On Dec 19, 2012, at 10:05 AM, Manman Ren <mren at apple.com
> <mailto:mren at apple.com>> wrote:
>
>>
>> Hi Nick,
>>
>> Thanks for having the detailed algorithm.
>> According to Chandler (and I agree),
>>
>> We can't rely on LoopInfo to describe all backedges as LoopInfo
>> doesn't represent unnatural loops -- essentially it isn't guaranteed
>> to find all cycles.

It isn't guaranteed to find all cycles, but all backedges have a Loop* 
object associated with them. The issue is that two cycles may only have 
one Loop object. Consider this CFG between three block a, b and c:

    a
  /^ \^
v/   v\
b     c

(or a -> {b, c}, b -> a, c-> a for those without ASCII art). There's one 
cycle a<->b and a second cycle a<->c. We get one Loop with a, b and c in 
it. That's exactly what we want, because you can reach b from c and c 
from b, by going through a. We can make use of LoopInfo.

>> I now have a simple implementation of a very conservative reachablity
>> analysis:
>> + // Conservatively return true. Return false, if there is a single path
>> + // starting from "From" and the path does not reach "To".
>> + static bool hasPath(const BasicBlock *From, const BasicBlock *To) {
>> +   const unsigned MaxCheck = 5;
>> +   const BasicBlock *Current = From;
>> +   for (unsigned I = 0; I < MaxCheck; I++) {
>> +     unsigned NumSuccs = Current->getTerminator()->getNumSuccessors();
>> +     if (NumSuccs > 1)
>> +       return true;
>> +     if (NumSuccs == 0)
>> +       return false;
>> +     Current = Current->getTerminator()->getSuccessor(0);
>> +     if (Current == To)
>> +       return true;
>> +   }
>> +   return true;
>> + }

Okay. As advertised, this is indeed very conservative. That's fine for 
now, we can optimize harder later. Let's get the miscompile fixed!

>> And I will file a PR to request a more advanced reachability analysis.
>> Please review the attached patch,

Please commit!

Nick

>>
>> Thanks,
>> Manman
>>
>> <mem_dep_2.patch>
>>
>> On Dec 19, 2012, at 2:07 AM, Nick Lewycky wrote:
>>
>>> Nick Lewycky wrote:
>>>> Manman Ren wrote:
>>>>>
>>>>> Given input:
>>>>> while.body:
>>>>> call void @RunInMode(i32 100) nounwind
>>>>> %1 = load i32* %tasksIdle, align 4
>>>>> %tobool = icmp eq i32 %1, 0
>>>>> br i1 %tobool, label %while.cond.backedge, label %if.then
>>>>>
>>>>> if.then:
>>>>> store i32 0, i32* %tasksIdle, align 4
>>>>> call void @TimerCreate(i32* %shouldExit) nounwind
>>>>> br label %while.cond.backedge
>>>>>
>>>>> while.cond.backedge:
>>>>> %2 = load i32* %shouldExit, align 4
>>>>> %cmp = icmp eq i32 %2, 0
>>>>> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>>>>>
>>>>> Make sure we are not returning "call RunInMode" does not modify
>>>>> %shouldExit.
>>>>> Currently, we optimize this into
>>>>> while.cond.backedge: ; preds = %if.then, %while.body
>>>>> %1 = phi i32 [ %.pre, %if.then ], [ 0, %while.body ]
>>>>> %cmp = icmp eq i32 %1, 0
>>>>> br i1 %cmp, label %while.body, label %while.cond.while.end_crit_edge
>>>>>
>>>>> The proper fix is to check all uses of %shouldExit as long as there is
>>>>> a possible
>>>>> path from the use to "call RunInMode". The proposed patch uses
>>>>> LoopInfo to approximate the reachability analysis.
>>>>> Suggestions on how to approximate this are welcome.
>>>>>
>>>>> Please review, Thanks,
>>>>
>>>> Thanks for working on this.
>>>>
>>>> I think you should factor out logic for "is Inst2 reachable from Inst1"
>>>> and put it somewhere reusable (BasicBlockUtils?). Making it take
>>>> DominatorTree* and LoopInfo* is the right idea, but we may not have
>>>> those available. If the method requires one of those, it should take it
>>>> as a reference instead of a pointer.
>>>>
>>>> + AU.addRequired<LoopInfo>();
>>>>
>>>> Sorry, I don't think we can make MemoryDependenceAnalysis depend on
>>>> LoopInfo.
>>>>
>>>> What we can do is check if it just happens to be available, due to the
>>>> pass ordering we happen to be running in. If it is,
>>>> getAnalysisIfAvailable<LoopInfo>() will return the pointer and we can
>>>> use it to speed up the search, or NULL.
>>>>
>>>> I spent a while thinking about whether LoopInfo ever actually helps us
>>>> prune the search, both in the case where we have a dominator tree
>>>> available and when we don't. I decided it does (in both cases), and I
>>>> wanted to quickly jot down what I think the algorithm is:
>>>>
>>>> We're searching for whether there exists any path from A to B.
>>>> - if we have loop info and not dominator trees, set B = B's loop header
>>>> (this is an optimization only).
>>>> - push A on worklist.
>>>> - pop A' off the worklist and
>>>> - if A' and B are not both in the same loop, then A' is a dead end. Skip
>>>> to the next item on the worklist.
>>>> - if A' dominates B, we have found a path. Stop.
>>>> - push all of A''s successors onto the worklist, except the ones we've
>>>> visited before.
>>>> - if the worklist is empty, there is no path. Stop.
>>>
>>> Ignore this. It's both subtly wrong and rather inefficient.
>>>
>>> - push A on the worklist.
>>> - pop A' off the worklist and
>>> - if there is a loop that contains both A' and B, then there is a
>>> path. Stop.
>>> - if A' dominates B, there is a path. Stop.
>>> - if A' is in a loop (and we have loop info), add the outermost
>>> loop's exit blocks to the worklist; else (if A' is not in a loop) add
>>> A''s successors to the worklist.
>>> - if the worklist is empty, there is no path. Stop.
>>>
>>> Nick
>>>
>>>> Note that "in the same loop" can't be implemented by a simple
>>>> getLoopFor(A) == getLoopFor(B), because of nested loops. The test we
>>>> really want is "does there exist a Loop* that contains both blocks?".
>>>>
>>>> If the dominator tree is not available then the test for "A' dominates
>>>> B" becomes "A' == B" with no change in precision. If the loop
>>>> information is not available, then we skip that test, again with loss of
>>>> precision. We should have some sort of visited-blocks limit to make sure
>>>> this doesn't take too much compile time.
>>>>
>>>> Nick
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list