[llvm-commits] [llvm] r101075 - in /llvm/trunk: lib/Target/X86/X86InstrInfo.cpp test/CodeGen/X86/brcond.ll

Mon Apr 12 18:24:18 PDT 2010

On Apr 12, 2010, at 5:18 PM, Chris Lattner wrote:

>>>> // If the prior block branches here on true and somewhere else on false, and
>>>> // if the branch condition is reversible, reverse the branch to create a
>>>> // fall-through.
>>>> if (PriorTBB == MBB) {
>>>>   SmallVector<MachineOperand, 4> NewPriorCond(PriorCond);
>>>>   if (!TII->ReverseBranchCondition(NewPriorCond)) {
>>>>     TII->RemoveBranch(PrevBB);
>>>>     TII->InsertBranch(PrevBB, PriorFBB, 0, NewPriorCond);
>>>>     MadeChange = true;
>>>>     ++NumBranchOpts;
>>>>     goto ReoptimizeBlock;
>>>>   }
>>>> }
>>>> 
>>>> Why doesn't it work in this case?
>>> 
>>> I was curious too. CodeGenPrepare has split a critical edge for a PHI
>>> node, however after register allocation it turns out that no copy was
>>> needed, so the jmp is actually in a separate basic block from the
>>> jne, jp.
>> 
>> 
>> In that case, branch folding should be removing the block that consists of only an unconditional branch:
>> 
>>     // If this block is just an unconditional branch to CurTBB, we can
>>     // usually completely eliminate the block.  The only case we cannot
>>     // completely eliminate the block is when the block before this one
>>     // falls through into MBB and we can't understand the prior block's branch
>>     // condition.
>>     if (MBB->empty()) {
>>        <too long to quote, but appears to do what the comment says>
>> 
>> so why doesn't *that* work?
>> 
> 
> I think that answering these questions (and fixing the problem) is a much better idea than hacking the X86 backend to handle one specific case of this.
> 
The problem lies in this bit of code in X86InstrInfo::AnalyzeBranch:

    // If they differ, see if they fit one of the known patterns. Theoretically,
    // we could handle more patterns here, but we shouldn't expect to see them
    // if instruction selection has done a reasonable job.
    if ((OldBranchCode == X86::COND_NP &&
         BranchCode == X86::COND_E) ||
        (OldBranchCode == X86::COND_E &&
         BranchCode == X86::COND_NP))
      BranchCode = X86::COND_NP_OR_E;
    else if ((OldBranchCode == X86::COND_P &&
              BranchCode == X86::COND_NE) ||
             (OldBranchCode == X86::COND_NE &&
              BranchCode == X86::COND_P))
      BranchCode = X86::COND_NE_OR_P;
    else
      return true;

When the code in BranchFolding calls "ReverseBranchCondition" on one of these combined condition codes, the X86 back-end isn't able to reverse it, so the call fails and the branch conditions are left unchanged.
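
To illustrate the failure, here is a toy model (not the actual X86InstrInfo code). The enum, the Flags struct, and the functions evaluate and reverseCondition are stand-ins invented for this sketch; only the condition-code names come from the snippet above. It assumes the same convention as the BranchFolding code quoted earlier, where a return value of true means "could not reverse". The simple codes have a one-instruction opposite, but the opposite of "NE or P" is "E and NP", which is not one of the available codes.

```cpp
#include <cassert>

// Stand-in condition codes; names follow the X86:: codes quoted above,
// but this enum is hypothetical and not the LLVM definition.
enum CondCode { COND_E, COND_NE, COND_P, COND_NP, COND_NE_OR_P, COND_NP_OR_E };

// Only the two flags these codes read: zero flag and parity flag.
struct Flags {
  bool ZF;
  bool PF;
};

// Would a branch with this condition be taken for the given flags?
bool evaluate(CondCode CC, Flags F) {
  switch (CC) {
  case COND_E:       return F.ZF;
  case COND_NE:      return !F.ZF;
  case COND_P:       return F.PF;
  case COND_NP:      return !F.PF;
  case COND_NE_OR_P: return !F.ZF || F.PF;
  case COND_NP_OR_E: return !F.PF || F.ZF;
  }
  assert(false && "unknown condition code");
  return false;
}

// Returns true on FAILURE (assumed convention, matching the
// `if (!TII->ReverseBranchCondition(...))` test quoted earlier).
bool reverseCondition(CondCode &CC) {
  switch (CC) {
  case COND_E:  CC = COND_NE; return false;
  case COND_NE: CC = COND_E;  return false;
  case COND_P:  CC = COND_NP; return false;
  case COND_NP: CC = COND_P;  return false;
  case COND_NE_OR_P:
  case COND_NP_OR_E:
    // The inverse is an AND of two flag tests, which none of the
    // codes in this model can express, so leave CC untouched.
    return true;
  }
  return true;
}
```

In this model a caller that only proceeds when reverseCondition returns false never gets to rewrite a branch carrying a combined code, which matches the behaviour described above.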

If you look for the bits of code that handle COND_NP_OR_E and COND_NE_OR_P, there don't seem to be any optimizations done with them. X86InstrInfo::InsertBranch simply looks at them and inserts two branches.
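
As a rough picture of that expansion (a sketch, not the real InsertBranch): the helper jumpsFor below is hypothetical, and the order of the two jumps within each pair is an assumption. It lists the conditional jumps, all aimed at the same target, that one condition code stands for.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in condition codes; names follow the X86:: codes quoted above.
enum CondCode { COND_E, COND_NE, COND_P, COND_NP, COND_NE_OR_P, COND_NP_OR_E };

// Hypothetical helper: mnemonics of the conditional jumps needed to branch
// to a single target on the given condition. Two jumps to the same target
// behave as an OR of their conditions.
std::vector<std::string> jumpsFor(CondCode CC) {
  switch (CC) {
  case COND_E:       return {"je"};
  case COND_NE:      return {"jne"};
  case COND_P:       return {"jp"};
  case COND_NP:      return {"jnp"};
  case COND_NE_OR_P: return {"jne", "jp"};  // order is an assumption
  case COND_NP_OR_E: return {"jnp", "je"};  // order is an assumption
  }
  assert(false && "unknown condition code");
  return {};
}
```

So in this picture a combined code is just the original pair of jumps stored under one name.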

What is the rationale behind converting a JE/JNP pair and a JNE/JP pair into COND_NP_OR_E and COND_NE_OR_P, respectively?

-bw
