[LLVMdev] BackedgeTakenCount calculation for fortran loops and DragonEgg gfortran-4.6

Mon Feb 13 16:09:15 PST 2012

Marcello,

Nick pointed this thread out to me. I've had a patch laying around for a while that should solve your problem.
See r150439. It appears to work on your test case. Can you verify that it works for you?

I'm not discounting the idea of fixing ComputeExitLimit. In fact, I'm reviewing that now. But at some point we need to acknowledge that it's the wrong approach and stop piling on CFG patterns.

-Andy

On Feb 9, 2012, at 1:39 AM, Marcello Maggioni <hayarms at gmail.com> wrote:

> FInally I had the time to complete everything up. Now I included the
> test case in the patch and the testcase runs with the LLVM tests
> system.
> 
> 2012/2/9 Marcello Maggioni <hayarms at gmail.com>:
>> This is instead a very simple (handmade) test case that triggers the
>> problem (attached)
>> Also a more conforming patch has been attached
>> 
>> 2012/2/9 Marcello Maggioni <hayarms at gmail.com>:
>>> This is the .ll for that graph (attached). I think I understand what
>>> you are saying.
>>> This particular testcase returns CNC not because the exit block
>>> doesn't have a unique predecessor, but because the unique predecessor
>>> (the inner loop block) has a successor that is inside the loop (in
>>> this case itself, because it's the inner loop block).
>>> 
>>> That doesn't change, anyway, the assuption that this condition ( an
>>> exit block that jumps to a block with an unconditional jump that jumps
>>> to the loop header and that has as unique predecessor the exit block)
>>> is equivalent to jumping directly to the loop header.
>>> 
>>> 2012/2/9 Nick Lewycky <nlewycky at google.com>:
>>>> On 8 February 2012 15:50, Marcello Maggioni <hayarms at gmail.com> wrote:
>>>>> 
>>>>> Well, it wasn't intended as a "real" patch to be included , but more
>>>>> as a "proof of concept" for a solution. Do you think it is a valid
>>>>> solution and I'm correct in my assumption? If so then I'll clean up
>>>>> the patch and attach a testcase for inclusion.
>>>> 
>>>> 
>>>> I'm not sure -- when I tried to track the IR in your screenshot through the
>>>> code in SCEV I came up with a completely different reason it would return
>>>> the CNC than the one you gave in your email. It would really help to have a
>>>> testcase in .ll format.
>>>> 
>>>> Nick
>>>> 
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> Marcello
>>>>> 
>>>>> 2012/2/9 Nick Lewycky <nlewycky at google.com>:
>>>>>> Your patch should include a testcase, see test/Analysis/ScalarEvolution
>>>>>> for
>>>>>> examples. "BranchInst* " should be "BranchInst *". You should have
>>>>>> spaces
>>>>>> after the // in your comments. One of the comment lines isn't indented
>>>>>> properly.
>>>>>> 
>>>>>> Nick
>>>>>> 
>>>>>> On 8 February 2012 12:05, Marcello Maggioni <hayarms at gmail.com> wrote:
>>>>>>> 
>>>>>>> Attached
>>>>>>> 
>>>>>>> 2012/2/8 Marcello Maggioni <hayarms at gmail.com>:
>>>>>>>> Mmm, sorry, the patch I posted crashes if ExitBr is null (which it
>>>>>>>> may
>>>>>>>> be ...) , this one should be ok (and passess all the ScalarEvolution
>>>>>>>> tests in LLVM):
>>>>>>>> 
>>>>>>>> diff --git a/lib/Analysis/ScalarEvolution.cpp
>>>>>>>> b/lib/Analysis/ScalarEvolution.cpp
>>>>>>>> index daf7742..b10fab2 100644
>>>>>>>> --- a/lib/Analysis/ScalarEvolution.cpp
>>>>>>>> +++ b/lib/Analysis/ScalarEvolution.cpp
>>>>>>>> @@ -4293,9 +4293,15 @@ ScalarEvolution::ComputeExitLimit(const Loop
>>>>>>>> *L, BasicBlock *ExitingBlock) {
>>>>>>>>   //
>>>>>>>>   // FIXME: we should be able to handle switch instructions (with a
>>>>>>>> single exit)
>>>>>>>>   BranchInst *ExitBr =
>>>>>>>> dyn_cast<BranchInst>(ExitingBlock->getTerminator());
>>>>>>>> +
>>>>>>>>   if (ExitBr == 0) return getCouldNotCompute();
>>>>>>>>   assert(ExitBr->isConditional() && "If unconditional, it can't be in
>>>>>>>> loop!");
>>>>>>>> 
>>>>>>>> +  BranchInst* BrFirstSucc = dyn_cast<BranchInst>(ExitBr->
>>>>>>>> +
>>>>>>>>  getSuccessor(0)->getTerminator());
>>>>>>>> +  BranchInst* BrSecondSucc = dyn_cast<BranchInst>(ExitBr->
>>>>>>>> +
>>>>>>>>  getSuccessor(1)->getTerminator());
>>>>>>>> +
>>>>>>>>   // At this point, we know we have a conditional branch that
>>>>>>>> determines whether
>>>>>>>>   // the loop is exited.  However, we don't know if the branch is
>>>>>>>> executed each
>>>>>>>>   // time through the loop.  If not, then the execution count of the
>>>>>>>> branch will
>>>>>>>> @@ -4316,10 +4322,23 @@ ScalarEvolution::ComputeExitLimit(const Loop
>>>>>>>> *L, BasicBlock *ExitingBlock) {
>>>>>>>>   if (ExitBr->getSuccessor(0) != L->getHeader() &&
>>>>>>>>       ExitBr->getSuccessor(1) != L->getHeader() &&
>>>>>>>>       ExitBr->getParent() != L->getHeader()) {
>>>>>>>> -    // The simple checks failed, try climbing the unique predecessor
>>>>>>>> chain
>>>>>>>> -    // up to the header.
>>>>>>>> +
>>>>>>>>     bool Ok = false;
>>>>>>>> -    for (BasicBlock *BB = ExitBr->getParent(); BB; ) {
>>>>>>>> +    //Check if the one of the successor of the exit branch has the
>>>>>>>> is a
>>>>>>>> block
>>>>>>>> +    //that has only one predecessor and has an unconditional branch
>>>>>>>> to
>>>>>>>> the
>>>>>>>> +    //loop header
>>>>>>>> +    if (BrFirstSucc && BrFirstSucc->isUnconditional() &&
>>>>>>>> +        BrFirstSucc->getSuccessor(0) == L->getHeader() &&
>>>>>>>> +        BrFirstSucc->getParent()->getUniquePredecessor())
>>>>>>>> +      Ok = true;
>>>>>>>> +    if (BrSecondSucc && BrSecondSucc->isUnconditional() &&
>>>>>>>> +        BrSecondSucc->getSuccessor(0) == L->getHeader() &&
>>>>>>>> +        BrSecondSucc->getParent()->getUniquePredecessor())
>>>>>>>> +      Ok = true;
>>>>>>>> +     // The simple checks failed, try climbing the unique
>>>>>>>> predecessor
>>>>>>>> chain
>>>>>>>> +    // up to the header.
>>>>>>>> +
>>>>>>>> +    for (BasicBlock *BB = ExitBr->getParent(); BB && !Ok; ) {
>>>>>>>>       BasicBlock *Pred = BB->getUniquePredecessor();
>>>>>>>>       if (!Pred)
>>>>>>>>         return getCouldNotCompute();
>>>>>>>> 
>>>>>>>> anyway, this patch is only "a concept" of what I'm talking about.
>>>>>>>> 
>>>>>>>> PS=Sorry for the bad english in the previous post :p
>>>>>>>> 
>>>>>>>> 2012/2/8 Marcello Maggioni <hayarms at gmail.com>:
>>>>>>>>> Hello, I'm finding problems with BackEdgeTaken count calculation in
>>>>>>>>> even simple fortran loops with gfortran-4.6 + DragonEgg 3.0.
>>>>>>>>> 
>>>>>>>>> Even for simple double loops like this one:
>>>>>>>>> 
>>>>>>>>>       program test2
>>>>>>>>>       integer i,j,k
>>>>>>>>>       dimension k(100,100)
>>>>>>>>>       do j=1,100
>>>>>>>>>        do i=1,100
>>>>>>>>>         k(i,j) = i
>>>>>>>>>        enddo
>>>>>>>>>       enddo
>>>>>>>>>       write(*,*) k(1,30)
>>>>>>>>>       end
>>>>>>>>> 
>>>>>>>>> make the ScalarEvolution engine return "CouldNotCompute" even for
>>>>>>>>> the
>>>>>>>>> outer loop (the inner loop is fine).
>>>>>>>>> 
>>>>>>>>> You can find a screenshot of the translation of this loop here (with
>>>>>>>>> -view-cfg Polly version):
>>>>>>>>> http://i.imgur.com/Jyaqd.png
>>>>>>>>> 
>>>>>>>>> The problem seems to be the fact that the ScalarEvolution can't
>>>>>>>>> consider the outer loop exit branch instruction as the trivial case
>>>>>>>>> (where one of the successors of the exit block is the loop header or
>>>>>>>>> the exit block is the loop header itself) because of the (strange?)
>>>>>>>>> loop shape (the exit block instead of jumping to the header of the
>>>>>>>>> loop jumps instead to another block that increments the induction
>>>>>>>>> variable and has an unconditional branch to the loop header) and so
>>>>>>>>> starts backtracking the predecessors of the of the exit block and
>>>>>>>>> stops when it reaches the inner loop that has a block without a
>>>>>>>>> unique
>>>>>>>>> predecessor.
>>>>>>>>> 
>>>>>>>>> What do you think about this problem? This makes , for example,
>>>>>>>>> difficult analyzing even simple fortran loops with Polly .
>>>>>>>>> I believe the case portrayed in the picture is the same to the
>>>>>>>>> trivial
>>>>>>>>> case (because the exit block jumps to a block with an unconditional
>>>>>>>>> jump to the header of the loop), am I right?
>>>>>>>>> 
>>>>>>>>> I've written this little patch that adds this case to the trivial
>>>>>>>>> case
>>>>>>>>> :
>>>>>>>>> 
>>>>>>>>> diff --git a/lib/Analysis/ScalarEvolution.cpp
>>>>>>>>> b/lib/Analysis/ScalarEvolution.cpp
>>>>>>>>> index daf7742..fcbaffe 100644
>>>>>>>>> --- a/lib/Analysis/ScalarEvolution.cpp
>>>>>>>>> +++ b/lib/Analysis/ScalarEvolution.cpp
>>>>>>>>> @@ -4293,6 +4293,11 @@ ScalarEvolution::ComputeExitLimit(const Loop
>>>>>>>>> *L, BasicBlock *ExitingBlock) {
>>>>>>>>>   //
>>>>>>>>>   // FIXME: we should be able to handle switch instructions (with a
>>>>>>>>> single exit)
>>>>>>>>>   BranchInst *ExitBr =
>>>>>>>>> dyn_cast<BranchInst>(ExitingBlock->getTerminator());
>>>>>>>>> +  BranchInst* BrFirstSucc = dyn_cast<BranchInst>(ExitBr->
>>>>>>>>> +
>>>>>>>>>  getSuccessor(0)->getTerminator());
>>>>>>>>> +  BranchInst* BrSecondSucc = dyn_cast<BranchInst>(ExitBr->
>>>>>>>>> +
>>>>>>>>>  getSuccessor(1)->getTerminator());
>>>>>>>>> +
>>>>>>>>>   if (ExitBr == 0) return getCouldNotCompute();
>>>>>>>>>   assert(ExitBr->isConditional() && "If unconditional, it can't be
>>>>>>>>> in
>>>>>>>>> loop!");
>>>>>>>>> 
>>>>>>>>> @@ -4315,8 +4320,12 @@ ScalarEvolution::ComputeExitLimit(const Loop
>>>>>>>>> *L, BasicBlock *ExitingBlock) {
>>>>>>>>>   //
>>>>>>>>>   if (ExitBr->getSuccessor(0) != L->getHeader() &&
>>>>>>>>>       ExitBr->getSuccessor(1) != L->getHeader() &&
>>>>>>>>> -      ExitBr->getParent() != L->getHeader()) {
>>>>>>>>> -    // The simple checks failed, try climbing the unique
>>>>>>>>> predecessor
>>>>>>>>> chain
>>>>>>>>> +      ExitBr->getParent() != L->getHeader() &&
>>>>>>>>> +      !((BrFirstSucc && BrFirstSucc->isUnconditional() &&
>>>>>>>>> +         BrFirstSucc->getSuccessor(0) == L->getHeader()) ||
>>>>>>>>> +        (BrSecondSucc && BrSecondSucc->isUnconditional() &&
>>>>>>>>> +         BrSecondSucc->getSuccessor(0) == L->getHeader())) ) {
>>>>>>>>> +     // The simple checks failed, try climbing the unique
>>>>>>>>> predecessor
>>>>>>>>> chain
>>>>>>>>>     // up to the header.
>>>>>>>>>     bool Ok = false;
>>>>>>>>>     for (BasicBlock *BB = ExitBr->getParent(); BB; ) {
>>>>>>>>> 
>>>>>>>>> what do you think about this? There is a better solution to the
>>>>>>>>> problem? Is the compiler itself broken?
>>>>>>>>> 
>>>>>>>>> Thank you
>>>>>>>>> 
>>>>>>>>> Marcello
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev