[llvm] r271925 - [MBP] Reduce code size by running tail merging in MBP.

Mikael Holmén via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 16 22:13:48 PDT 2016


Hi,

On 06/16/2016 11:48 PM, Haicheng Wu wrote:
> Hi Mikael,
>
> Any luck of fixing this?
>
> I am thinking if SomeVeryCleverCondition = !canFallThrough() is good enouth.  Or more explicitly and safely, SomeVeryCleverCondition = (succ_size() == 1 && !isLayoutSuccessor(TBB))

We're thinking along those lines as well. Currently we do

   if (!FallthroughBB && (succ_size() == 1)) {

and I haven't seen any problems with it so far but I'm not sure at all 
if it's safe and solid.

Regards,
Mikael

>
> Best,
>
> Haicheng
>
> -----Original Message-----
> From: Mikael Holmén [mailto:mikael.holmen at ericsson.com]
> Sent: Wednesday, June 15, 2016 5:28 AM
> To: Haicheng Wu <haicheng at codeaurora.org>; Chandler Carruth <chandlerc at gmail.com>; grosbach at apple.com
> Cc: 'Karl-Johan Karlsson' <karl-johan.karlsson at ericsson.com>; 'llvm-commits' <llvm-commits at lists.llvm.org>
> Subject: Re: [llvm] r271925 - [MBP] Reduce code size by running tail merging in MBP.
>
> Hi Chandler, Jim, Haicheng,
>
> (Adding Chandler Carruth and Jim Grosbach who seems to have fixed a somewhat similar case in updateTerminator years ago.)
>
> On 06/14/2016 11:24 PM, Haicheng Wu wrote:
>> >From your analysis, it seems updateTerminator() misses a case.  But I am surprised that it does not happen in other backends.  It would be ideal if you could reproduce it in other backend that is in tree.
>
> Yes it surely would. Reproducing bugs far into the backend on other targets is quite tricky though and we haven't mananged yet with this one :/
>
> Chandler, Jim, do you have any opinion about this?
>
> The problem we're seeing is:
>
> MachineBasicBlock::updateTerminator fails to fix a case where MachineBlockPlacement has reordered blocks so we have
>
> BB#2: derived from LLVM BB %.split.us.split.us
>       Live Ins: %a0_32 %af1 %CCReg %af1
>       Predecessors according to CFG: BB#1
>           brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>       Successors according to CFG: BB#3(0x80000000 / 0x80000000 = 100.00%)
>
> BB#6: derived from LLVM BB %.split
>       Live Ins: %CCReg %af1
>       Predecessors according to CFG: BB#0
>           brr_cond <BB#13>, pred:3, pred:%af1, pred:0
>       Successors according to CFG: BB#7(0x40000000 / 0x80000000 = 50.00%)
> BB#13(0x40000000 / 0x80000000 = 50.00%)
>
> and then
>
> BB#3: derived from LLVM BB %bb1.us.us.us
>       Live Ins: %a0_32 %af1
>       Predecessors according to CFG: BB#2 BB#10
>       %a1_40<def,dead> = negh_a16_a32 %a0l, pred:0, pred:%noreg, pred:0, %CCReg<imp-def>, %cuc<imp-use>, %af1<imp-def>
>       brr_cond <BB#5>, pred:6, pred:%af1, pred:0
>       Successors according to CFG: BB#5(0x30000000 / 0x80000000 = 37.50%)
> BB#4(0x50000000 / 0x80000000 = 62.50%)
>
> Before moving BBs around, BB#2 fallthrough'd to BB#3, and MachineBlockPlacement expects updateTerminator to insert a branch from
> BB#2 to BB#3 for the old fallthrough edge, but it doesn't.
>
> Note that there is also a conditional branch from BB#2 to BB#3, complicating the situation since the fallthrough bb and the conditional jump target are the same.
>
> AnalyzeBranch sets TBB to BB#3, FBB to nullptr and Cond to the condition used in the brr_cond in BB#2, so that seems reasonable to me.
>
> In updateTerminator, when searching for FallthroughBB it doesn't find anything since the fallthrough bb is the same as TBB. And then
> canFallThrough() returns false, because BB#6 is not in the successor list of BB#2.
>
> Then TBB is not a layout successor and we test
> !isLayoutSuccessor(FallthroughBB) which happens to return true when FallthroughBB is nullptr, so we enter the last else if which removes our conditional branch and inserts an identical one.
>
> To fix BB#2 I think we would like to do
>
>     if (!FallthroughBB && SomeVeryCleverCondition) {
>       // We did fallthrough to the same basic block as the conditional jump
>       // targets but since canFallThrough() said false, our previous
>       // fallthrough successor BB has been moved away.
>       // Remove the conditional jump, leaving unconditional fallthrough.
>       TII->RemoveBranch(*this);
>
>       // Finally update the unconditional successor to be reached via a branch if
>       // it would not be reached by fallthrough.
>       if (!isLayoutSuccessor(TBB)) {
>         ArrayRef<MachineOperand> EmptyCond;
>         TII->InsertBranch(*this, TBB, nullptr, EmptyCond, DL);
>       }
>       return;
>     }
>
> but the question is what SomeVeryCleverCondition should look like to catch this and only this case and not mess up something else.
>
> Any thoughts appreciated.
>
> Thanks,
> Mikael
>
>>
>> Best,
>>
>> Haicheng
>>
>> -----Original Message-----
>> From: Mikael Holmén [mailto:mikael.holmen at ericsson.com]
>> Sent: Tuesday, June 14, 2016 12:56 PM
>> To: Haicheng Wu <haicheng at codeaurora.org>
>> Cc: Karl-Johan Karlsson <karl-johan.karlsson at ericsson.com>;
>> llvm-commits <llvm-commits at lists.llvm.org>
>> Subject: Re: [llvm] r271925 - [MBP] Reduce code size by running tail merging in MBP.
>>
>> Hi,
>>
>> Karl-Johan got it right.
>>
>> TBB is BB#3, FBB is nullptr, and Cond is the condition set on the brr_cond instruction ending BB#2.
>>
>> And this makes updateTerminator pass everything until
>>
>>      // The block has a fallthrough conditional branch.
>>      if (isLayoutSuccessor(TBB)) {
>>        if (TII->ReverseBranchCondition(Cond)) {
>>          // We can't reverse the condition, add an unconditional branch.
>>          Cond.clear();
>>          TII->InsertBranch(*this, FallthroughBB, nullptr, Cond, DL);
>>          return;
>>        }
>>        TII->RemoveBranch(*this);
>>        TII->InsertBranch(*this, FallthroughBB, nullptr, Cond, DL);
>>      } else if (!isLayoutSuccessor(FallthroughBB)) {
>>        TII->RemoveBranch(*this);
>>        TII->InsertBranch(*this, TBB, FallthroughBB, Cond, DL);
>>      }
>>
>> where we enter the else branch (FallthroughBB is nullptr), remove the conditiopnal branch, but then insert a new (identical) conditional branch.
>>
>> Regards,
>> Mikael
>>
>> On 06/14/2016 06:08 PM, Karl-Johan Karlsson wrote:
>>> Hi!
>>>
>>> Mikael went home from work before your mail arrived. He will be back
>>> tomorrow. I hope I have recreated the problem as Mikael described it
>>> below. From my debugger session I printed the following information
>>> about TBB, FBB and Cond before updateTerminator() is called on BB#2.
>>>
>>> (gdb) p PrevBB->dump()
>>> $39 = void
>>> (gdb)
>>> BB#2: derived from LLVM BB %.split.us.split.us
>>>        Live Ins: %a0_32 %af1 %CCReg %af1
>>>        Predecessors according to CFG: BB#1
>>>        brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>>        Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>>> 100.00%)
>>> (gdb) P TBB
>>> $40 = (llvm::MachineBasicBlock *) 0x28e02d8
>>> (gdb) P TBB->dump()
>>> $41 = void
>>> (gdb)
>>> BB#3: derived from LLVM BB %bb1.us.us.us
>>>        Live Ins: %a0_32 %af1
>>>        Predecessors according to CFG: BB#2 BB#10
>>>        %a1_40<def,dead> = negh_a16_a32 %a0l, pred:0, pred:%noreg,
>>> pred:0, %CCReg<imp-def>, %cuc<imp-use>, %af1<imp-def>
>>>        brr_cond <BB#5>, pred:6, pred:%af1, pred:0
>>>        Successors according to CFG: BB#5(0x30000000 / 0x80000000 =
>>> 37.50%)
>>> BB#4(0x50000000 / 0x80000000 = 62.50%)
>>> (gdb) P FBB
>>> $42 = (llvm::MachineBasicBlock *) 0x0
>>> (gdb)
>>> (gdb) P Cond.empty()
>>> $44 = false
>>> (gdb)
>>>
>>> If I have got any information wrong Mikael will correct me with a
>>> mail tomorrow.
>>>
>>> Regards
>>> / Karl-Johan Karlsson
>>>
>>>
>>> On 2016-06-14 17:07, Haicheng Wu wrote:
>>>> Hi Mikael,
>>>>
>>>>    From your dumped information, I think the problem is either in
>>>> AnalyzeBranch() or updateTerminator().  What is the TBB, FBB, and
>>>> Cond returned by AnalyzeBranch() when it is called on BB#2 before
>>>> calling updateTerminator()?
>>>>
>>>> Best,
>>>>
>>>> Haicheng
>>>>
>>>> -----Original Message-----
>>>> From: Mikael Holmén [mailto:mikael.holmen at ericsson.com]
>>>> Sent: Tuesday, June 14, 2016 7:54 AM
>>>> To: Haicheng Wu <haicheng at codeaurora.org>
>>>> Cc: llvm-commits <llvm-commits at lists.llvm.org>; Karl-Johan Karlsson
>>>> <karl-johan.karlsson at ericsson.com>
>>>> Subject: Re: [llvm] r271925 - [MBP] Reduce code size by running tail
>>>> merging in MBP.
>>>>
>>>> Hi,
>>>>
>>>> We're seeing a problem on our out-of-tree backend with this patch
>>>> applied. I think the problem is either in
>>>> MachineBasicBlock::updateTerminator() or how it's used but I'm not
>>>> sure who's responsibility it is to fix the situation.
>>>>
>>>> I've tried to reproduce it on in-tree backends but I've failed on
>>>> X86 and Arm :/
>>>>
>>>> Anyway, the fault we see is that after MachineBlockPlacement we get
>>>> two BBs like this:
>>>>
>>>> BB#2: derived from LLVM BB %.split.us.split.us
>>>>         Live Ins: %a0_32 %af1 %CCReg %af1
>>>>         Predecessors according to CFG: BB#1
>>>>             brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>>>> 100.00%)
>>>>
>>>> BB#6: derived from LLVM BB %.split
>>>>         Live Ins: %CCReg %af1
>>>>         Predecessors according to CFG: BB#0
>>>>             brr_cond <BB#13>, pred:3, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#7(0x40000000 / 0x80000000 =
>>>> 50.00%)
>>>> BB#13(0x40000000 / 0x80000000 = 50.00%)
>>>>
>>>> So, looking at the instructions of BB#2 it looks like it does
>>>> fallthrough to BB#6 (only one conditional branch to BB#3) but the
>>>> CFG says only BB#3 is successor.
>>>>
>>>> And the verifier complains about this:
>>>>
>>>> *** Bad machine code: MBB exits via conditional branch/fall-through
>>>> but only has one CFG successor! ***
>>>> - function:    f2
>>>> - basic block: BB#2 .split.us.split.us (0x492c1f8) LLVM ERROR: Found
>>>> 1 machine code errors.
>>>>
>>>> At some point during MachineBlockPlacement we have
>>>>
>>>> BB#2: derived from LLVM BB %.split.us.split.us
>>>>         Live Ins: %a0_32 %af1 %CCReg %af1
>>>>         Predecessors according to CFG: BB#1
>>>>       brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>>>> 100.00%)
>>>>
>>>> BB#3: derived from LLVM BB %bb1.us.us.us
>>>>         Live Ins: %a0_32 %af1
>>>>         Predecessors according to CFG: BB#2 BB#10
>>>>       %a1_40<def,dead> = negh_a16_a32 %a0l, pred:0, pred:%noreg,
>>>> pred:0, %CCReg<imp-def>, %cuc<imp-use>, %af1<imp-def>
>>>>       brr_cond <BB#5>, pred:6, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#5(0x30000000 / 0x80000000 =
>>>> 37.50%)
>>>> BB#4(0x50000000 / 0x80000000 = 62.50%)
>>>>
>>>> Note that BB#2 both does a conditional jump AND falltrhoughs to BB#3.
>>>>
>>>> Then it's decided by MachineBlockPlacement::buildChain that BB#6 is
>>>> a better layout successor to BB#2 than BB#3, and we get:
>>>>
>>>> BB#2: derived from LLVM BB %.split.us.split.us
>>>>         Live Ins: %a0_32 %af1 %CCReg %af1
>>>>         Predecessors according to CFG: BB#1
>>>>       brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>>>> 100.00%)
>>>>
>>>> BB#6: derived from LLVM BB %.split
>>>>         Live Ins: %CCReg %af1
>>>>         Predecessors according to CFG: BB#0
>>>>       brr_cond <BB#13>, pred:3, pred:%af1, pred:0
>>>>         Successors according to CFG: BB#7(0x40000000 / 0x80000000 =
>>>> 50.00%)
>>>> BB#13(0x40000000 / 0x80000000 = 50.00%)
>>>>
>>>> Then in MachineBlockPlacement::buildCFGChains we have
>>>>
>>>>         if (!TII->AnalyzeBranch(*PrevBB, TBB, FBB, Cond)) {
>>>>           PrevBB->updateTerminator();
>>>>
>>>> which is called on BB#2, and from what I understand the point is
>>>> that updateTerminator should insert an unconditional branch from
>>>> BB#2 to
>>>> BB#3 now that there is no fallthrough anymore, but it doesn't and
>>>> the code is left broken.
>>>>
>>>> What's your thought's on this Haicheng, who should fix BB#2 in this case?
>>>>
>>>> Thanks,
>>>> Mikael
>>>>
>>>>
>>>> On 06/06/2016 08:36 PM, Haicheng Wu via llvm-commits wrote:
>>>>> Author: haicheng
>>>>> Date: Mon Jun  6 13:36:07 2016
>>>>> New Revision: 271925
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=271925&view=rev
>>>>> Log:
>>>>> [MBP] Reduce code size by running tail merging in MBP.
>>>>>
>>>>> The code layout that TailMerging (inside BranchFolding) works on is
>>>>> not the final layout optimized based on the branch probability.
>>>>> Generally, after BlockPlacement, many new merging opportunities emerge.
>>>>>
>>>>> This patch calls Tail Merging after MBP and calls MBP again if Tail
>>>>> Merging merges anything.
>>>>>
>>>>> Differential Revision: http://reviews.llvm.org/D20276
>>>>>
>>>>> Added:
>>>>>         llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll
>>>>> Modified:
>>>>>         llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>>>>         llvm/trunk/lib/CodeGen/BranchFolding.h
>>>>>         llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp
>>>>>         llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll
>>>>>
>>>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFo
>>>>> l di ng.cpp?rev=271925&r1=271924&r2=271925&view=diff
>>>>> ===================================================================
>>>>> =
>>>>> ==
>>>>> ========
>>>>> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
>>>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Mon Jun  6 13:36:07
>>>>> +++ 2016
>>>>> @@ -27,6 +27,7 @@
>>>>>      #include "llvm/CodeGen/MachineFunctionPass.h"
>>>>>      #include "llvm/CodeGen/MachineJumpTableInfo.h"
>>>>>      #include "llvm/CodeGen/MachineMemOperand.h"
>>>>> +#include "llvm/CodeGen/MachineLoopInfo.h"
>>>>>      #include "llvm/CodeGen/MachineModuleInfo.h"
>>>>>      #include "llvm/CodeGen/MachineRegisterInfo.h"
>>>>>      #include "llvm/CodeGen/Passes.h"
>>>>> @@ -137,6 +138,8 @@ void BranchFolder::RemoveDeadBlock(Machi
>>>>>        // Remove the block.
>>>>>        MF->erase(MBB);
>>>>>        FuncletMembership.erase(MBB);
>>>>> +  if (MLI)
>>>>> +    MLI->removeBlock(MBB);
>>>>>      }
>>>>>
>>>>>      /// OptimizeImpDefsBlock - If a basic block is just a bunch of
>>>>> implicit_def @@ -193,18 +196,22 @@ bool
>>>>> BranchFolder::OptimizeImpDefsBlock(
>>>>>      }
>>>>>
>>>>>      /// OptimizeFunction - Perhaps branch folding, tail merging and
>>>>> other -/// CFG optimizations on the given function.
>>>>> +/// CFG optimizations on the given function.  Block placement
>>>>> +changes the layout /// and may create new tail merging opportunities.
>>>>>      bool BranchFolder::OptimizeFunction(MachineFunction &MF,
>>>>>                                          const TargetInstrInfo *tii,
>>>>>                                          const TargetRegisterInfo *tri,
>>>>> -                                    MachineModuleInfo *mmi) {
>>>>> +                                    MachineModuleInfo *mmi,
>>>>> +                                    MachineLoopInfo *mli, bool
>>>>> + AfterPlacement) {
>>>>>        if (!tii) return false;
>>>>>
>>>>>        TriedMerging.clear();
>>>>>
>>>>> +  AfterBlockPlacement = AfterPlacement;
>>>>>        TII = tii;
>>>>>        TRI = tri;
>>>>>        MMI = mmi;
>>>>> +  MLI = mli;
>>>>>        RS = nullptr;
>>>>>
>>>>>        // Use a RegScavenger to help update liveness when required.
>>>>> @@ -230,7 +237,10 @@ bool BranchFolder::OptimizeFunction(Mach
>>>>>        bool MadeChangeThisIteration = true;
>>>>>        while (MadeChangeThisIteration) {
>>>>>          MadeChangeThisIteration    = TailMergeBlocks(MF);
>>>>> -    MadeChangeThisIteration   |= OptimizeBranches(MF);
>>>>> +    // No need to clean up if tail merging does not change
>>>>> + anything
>>>>> after the
>>>>> +    // block placement.
>>>>> +    if (!AfterBlockPlacement || MadeChangeThisIteration)
>>>>> +      MadeChangeThisIteration |= OptimizeBranches(MF);
>>>>>          if (EnableHoistCommonCode)
>>>>>            MadeChangeThisIteration |= HoistCommonCode(MF);
>>>>>          MadeChange |= MadeChangeThisIteration; @@ -447,6 +457,11 @@
>>>>> MachineBasicBlock *BranchFolder::SplitMB
>>>>>        // Splice the code over.
>>>>>        NewMBB->splice(NewMBB->end(), &CurMBB, BBI1, CurMBB.end());
>>>>>
>>>>> +  // NewMBB belongs to the same loop as CurMBB.
>>>>> +  if (MLI)
>>>>> +    if (MachineLoop *ML = MLI->getLoopFor(&CurMBB))
>>>>> +      ML->addBasicBlockToLoop(NewMBB, MLI->getBase());
>>>>> +
>>>>>        // NewMBB inherits CurMBB's block frequency.
>>>>>        MBBFreqInfo.setBlockFreq(NewMBB,
>>>>> MBBFreqInfo.getBlockFreq(&CurMBB));
>>>>>
>>>>> @@ -934,23 +949,27 @@ bool BranchFolder::TailMergeBlocks(Machi
>>>>>        if (!EnableTailMerge) return MadeChange;
>>>>>
>>>>>        // First find blocks with no successors.
>>>>> -  MergePotentials.clear();
>>>>> -  for (MachineBasicBlock &MBB : MF) {
>>>>> +  // Block placement does not create new tail merging
>>>>> + opportunities for these  // blocks.
>>>>> +  if (!AfterBlockPlacement) {
>>>>> +    MergePotentials.clear();
>>>>> +    for (MachineBasicBlock &MBB : MF) {
>>>>> +      if (MergePotentials.size() == TailMergeThreshold)
>>>>> +        break;
>>>>> +      if (!TriedMerging.count(&MBB) && MBB.succ_empty())
>>>>> +
>>>>> MergePotentials.push_back(MergePotentialsElt(HashEndOfMBB(MBB),
>>>>> &MBB));
>>>>> +    }
>>>>> +
>>>>> +    // If this is a large problem, avoid visiting the same basic blocks
>>>>> +    // multiple times.
>>>>>          if (MergePotentials.size() == TailMergeThreshold)
>>>>> -      break;
>>>>> -    if (!TriedMerging.count(&MBB) && MBB.succ_empty())
>>>>> -
>>>>> MergePotentials.push_back(MergePotentialsElt(HashEndOfMBB(MBB),
>>>>> &MBB));
>>>>> -  }
>>>>> +      for (unsigned i = 0, e = MergePotentials.size(); i != e; ++i)
>>>>> +        TriedMerging.insert(MergePotentials[i].getBlock());
>>>>>
>>>>> -  // If this is a large problem, avoid visiting the same basic
>>>>> blocks
>>>>> -  // multiple times.
>>>>> -  if (MergePotentials.size() == TailMergeThreshold)
>>>>> -    for (unsigned i = 0, e = MergePotentials.size(); i != e; ++i)
>>>>> -      TriedMerging.insert(MergePotentials[i].getBlock());
>>>>> -
>>>>> -  // See if we can do any tail merging on those.
>>>>> -  if (MergePotentials.size() >= 2)
>>>>> -    MadeChange |= TryTailMergeBlocks(nullptr, nullptr);
>>>>> +    // See if we can do any tail merging on those.
>>>>> +    if (MergePotentials.size() >= 2)
>>>>> +      MadeChange |= TryTailMergeBlocks(nullptr, nullptr);  }
>>>>>
>>>>>        // Look at blocks (IBB) with multiple predecessors (PBB).
>>>>>        // We change each predecessor to a canonical form, by @@
>>>>> -997,6
>>>>> +1016,17 @@ bool BranchFolder::TailMergeBlocks(Machi
>>>>>            if (PBB->hasEHPadSuccessor())
>>>>>              continue;
>>>>>
>>>>> +      // Bail out if the loop header (IBB) is not the top of the
>>>>> loop chain
>>>>> +      // after the block placement.  Otherwise, the common tail of
>>>>> IBB's
>>>>> +      // predecessors may become the loop top if block placement
>>>>> + is
>>>>> called again
>>>>> +      // and the predecessors may branch to this common tail.
>>>>> +      // FIXME: Relaxed this check if the algorithm of finding
>>>>> + loop
>>>>> top is
>>>>> +      // changed in MBP.
>>>>> +      if (AfterBlockPlacement && MLI)
>>>>> +        if (MachineLoop *ML = MLI->getLoopFor(IBB))
>>>>> +          if (IBB == ML->getHeader() && ML == MLI->getLoopFor(PBB))
>>>>> +            continue;
>>>>> +
>>>>>            MachineBasicBlock *TBB = nullptr, *FBB = nullptr;
>>>>>            SmallVector<MachineOperand, 4> Cond;
>>>>>            if (!TII->AnalyzeBranch(*PBB, TBB, FBB, Cond, true)) {
>>>>>
>>>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.h
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFo
>>>>> l di ng.h?rev=271925&r1=271924&r2=271925&view=diff
>>>>> ===================================================================
>>>>> =
>>>>> ==
>>>>> ========
>>>>> --- llvm/trunk/lib/CodeGen/BranchFolding.h (original)
>>>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.h Mon Jun  6 13:36:07 2016
>>>>> @@ -20,6 +20,7 @@ namespace llvm {
>>>>>        class MachineBranchProbabilityInfo;
>>>>>        class MachineFunction;
>>>>>        class MachineModuleInfo;
>>>>> +  class MachineLoopInfo;
>>>>>        class RegScavenger;
>>>>>        class TargetInstrInfo;
>>>>>        class TargetRegisterInfo;
>>>>> @@ -32,10 +33,11 @@ namespace llvm {
>>>>>                                MBFIWrapper &MBFI,
>>>>>                                const MachineBranchProbabilityInfo
>>>>> &MBPI);
>>>>>
>>>>> -    bool OptimizeFunction(MachineFunction &MF,
>>>>> -                          const TargetInstrInfo *tii,
>>>>> -                          const TargetRegisterInfo *tri,
>>>>> -                          MachineModuleInfo *mmi);
>>>>> +    bool OptimizeFunction(MachineFunction &MF, const
>>>>> + TargetInstrInfo
>>>>> *tii,
>>>>> +                          const TargetRegisterInfo *tri,
>>>>> MachineModuleInfo *mmi,
>>>>> +                          MachineLoopInfo *mli = nullptr,
>>>>> +                          bool AfterPlacement = false);
>>>>> +
>>>>>        private:
>>>>>          class MergePotentialsElt {
>>>>>            unsigned Hash;
>>>>> @@ -93,11 +95,13 @@ namespace llvm {
>>>>>          };
>>>>>          std::vector<SameTailElt> SameTails;
>>>>>
>>>>> +    bool AfterBlockPlacement;
>>>>>          bool EnableTailMerge;
>>>>>          bool EnableHoistCommonCode;
>>>>>          const TargetInstrInfo *TII;
>>>>>          const TargetRegisterInfo *TRI;
>>>>>          MachineModuleInfo *MMI;
>>>>> +    MachineLoopInfo *MLI;
>>>>>          RegScavenger *RS;
>>>>>
>>>>>        public:
>>>>>
>>>>> Modified: llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineB
>>>>> l oc kPlacement.cpp?rev=271925&r1=271924&r2=271925&view=diff
>>>>> ===================================================================
>>>>> =
>>>>> ==
>>>>> ========
>>>>> --- llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp (original)
>>>>> +++ llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp Mon Jun  6
>>>>> +++ 13:36:07 2016
>>>>> @@ -26,6 +26,8 @@
>>>>>
>>>>> //===--------------------------------------------------------------
>>>>> -
>>>>> --
>>>>> -----===//
>>>>>
>>>>>      #include "llvm/CodeGen/Passes.h"
>>>>> +#include "llvm/CodeGen/TargetPassConfig.h"
>>>>> +#include "BranchFolding.h"
>>>>>      #include "llvm/ADT/DenseMap.h"
>>>>>      #include "llvm/ADT/SmallPtrSet.h"
>>>>>      #include "llvm/ADT/SmallVector.h"
>>>>> @@ -116,6 +118,12 @@ static cl::opt<unsigned> JumpInstCost("j
>>>>>                                            cl::desc("Cost of jump
>>>>> instructions."),
>>>>>                                            cl::init(1), cl::Hidden);
>>>>>
>>>>> +static cl::opt<bool>
>>>>> +BranchFoldPlacement("branch-fold-placement",
>>>>> +              cl::desc("Perform branch folding during placement. "
>>>>> +                       "Reduces code size."),
>>>>> +              cl::init(true), cl::Hidden);
>>>>> +
>>>>>      extern cl::opt<unsigned> StaticLikelyProb;
>>>>>
>>>>>      namespace {
>>>>> @@ -232,10 +240,10 @@ class MachineBlockPlacement : public Mac
>>>>>        const MachineBranchProbabilityInfo *MBPI;
>>>>>
>>>>>        /// \brief A handle to the function-wide block frequency pass.
>>>>> -  const MachineBlockFrequencyInfo *MBFI;
>>>>> +  std::unique_ptr<BranchFolder::MBFIWrapper> MBFI;
>>>>>
>>>>>        /// \brief A handle to the loop info.
>>>>> -  const MachineLoopInfo *MLI;
>>>>> +  MachineLoopInfo *MLI;
>>>>>
>>>>>        /// \brief A handle to the target's instruction info.
>>>>>        const TargetInstrInfo *TII;
>>>>> @@ -323,6 +331,7 @@ public:
>>>>>          AU.addRequired<MachineBlockFrequencyInfo>();
>>>>>          AU.addRequired<MachineDominatorTree>();
>>>>>          AU.addRequired<MachineLoopInfo>();
>>>>> +    AU.addRequired<TargetPassConfig>();
>>>>>          MachineFunctionPass::getAnalysisUsage(AU);
>>>>>        }
>>>>>      };
>>>>> @@ -1462,7 +1471,8 @@ bool MachineBlockPlacement::runOnMachine
>>>>>          return false;
>>>>>
>>>>>        MBPI = &getAnalysis<MachineBranchProbabilityInfo>();
>>>>> -  MBFI = &getAnalysis<MachineBlockFrequencyInfo>();
>>>>> +  MBFI = llvm::make_unique<BranchFolder::MBFIWrapper>(
>>>>> +      getAnalysis<MachineBlockFrequencyInfo>());
>>>>>        MLI = &getAnalysis<MachineLoopInfo>();
>>>>>        TII = F.getSubtarget().getInstrInfo();
>>>>>        TLI = F.getSubtarget().getTargetLowering();
>>>>> @@ -1470,6 +1480,29 @@ bool MachineBlockPlacement::runOnMachine
>>>>>        assert(BlockToChain.empty());
>>>>>
>>>>>        buildCFGChains(F);
>>>>> +
>>>>> +  // Changing the layout can create new tail merging opportunities.
>>>>> +  TargetPassConfig *PassConfig = &getAnalysis<TargetPassConfig>();
>>>>> + // TailMerge can create jump into if branches that make CFG
>>>>> + irreducible for  // HW that requires structurized CFG.
>>>>> +  bool EnableTailMerge = !F.getTarget().requiresStructuredCFG() &&
>>>>> +                         PassConfig->getEnableTailMerge() &&
>>>>> +                         BranchFoldPlacement;  // No tail merging
>>>>> + opportunities if the block number is less than four.
>>>>> +  if (F.size() > 3 && EnableTailMerge) {
>>>>> +    BranchFolder BF(/*EnableTailMerge=*/true,
>>>>> + /*CommonHoist=*/false,
>>>>> *MBFI,
>>>>> +                    *MBPI);
>>>>> +
>>>>> +    if (BF.OptimizeFunction(F, TII,
>>>>> + F.getSubtarget().getRegisterInfo(),
>>>>> +
>>>>> getAnalysisIfAvailable<MachineModuleInfo>(), MLI,
>>>>> +                            /*AfterBlockPlacement=*/true)) {
>>>>> +      // Redo the layout if tail merging creates/removes/moves blocks.
>>>>> +      BlockToChain.clear();
>>>>> +      ChainAllocator.DestroyAll();
>>>>> +      buildCFGChains(F);
>>>>> +    }
>>>>> +  }
>>>>> +
>>>>>        optimizeBranches(F);
>>>>>        alignBlocks(F);
>>>>>
>>>>>
>>>>> Added: llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64
>>>>> / ta ilmerging_in_mbp.ll?rev=271925&view=auto
>>>>> ===================================================================
>>>>> =
>>>>> ==
>>>>> ========
>>>>> --- llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll (added)
>>>>> +++ llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll Mon Jun
>>>>> +++ 6
>>>>> +++ 13:36:07 2016
>>>>> @@ -0,0 +1,63 @@
>>>>> +; RUN: llc <%s -march=aarch64 | FileCheck %s
>>>>> +
>>>>> +; CHECK-LABEL: test:
>>>>> +; CHECK:       .LBB0_7
>>>>> +; CHECK:         b.hi    .LBB0_2
>>>>> +; CHECK-NEXT:    b    .LBB0_9
>>>>> +; CHECK-NEXT:  .LBB0_8
>>>>> +; CHECK-NEXT:    mov     x8, x9
>>>>> +; CHECK-NEXT:  .LBB0_9
>>>>> +define i64 @test(i64 %n, i64* %a, i64* %b, i64* %c, i64* %d, i64*
>>>>> +%e,
>>>>> +i64* %f) {
>>>>> +entry:
>>>>> +  %cmp28 = icmp sgt i64 %n, 1
>>>>> +  br i1 %cmp28, label %for.body, label %for.end
>>>>> +
>>>>> +for.body:                                         ; preds =
>>>>> %for.body.lr.ph, %if.end
>>>>> +  %j = phi i64 [ %n, %entry ], [ %div, %if.end ]  %div = lshr i64
>>>>> + %j, 1  %a.arrayidx = getelementptr inbounds i64, i64* %a, i64
>>>>> + %div %a.j = load i64, i64* %a.arrayidx  %b.arrayidx =
>>>>> + getelementptr inbounds i64, i64* %b, i64 %div  %b.j = load i64,
>>>>> + i64* %b.arrayidx %cmp.i = icmp slt i64 %a.j, %b.j  br i1 %cmp.i,
>>>>> + label %for.end.loopexit, label %cond.false.i
>>>>> +
>>>>> +cond.false.i:                                     ; preds = %for.body
>>>>> +  %cmp4.i = icmp sgt i64 %a.j, %b.j
>>>>> +  br i1 %cmp4.i, label %if.end, label %cond.false6.i
>>>>> +
>>>>> +cond.false6.i:                                    ; preds =
>>>>> %cond.false.i
>>>>> +  %c.arrayidx = getelementptr inbounds i64, i64* %c, i64 %div
>>>>> + %c.j = load i64, i64* %c.arrayidx  %d.arrayidx = getelementptr
>>>>> + inbounds i64, i64* %d, i64 %div  %d.j = load i64, i64*
>>>>> + %d.arrayidx  %cmp9.i = icmp slt i64 %c.j, %d.j  br i1 %cmp9.i,
>>>>> + label %for.end.loopexit, label %cond.false11.i
>>>>> +
>>>>> +cond.false11.i:                                   ; preds =
>>>>> %cond.false6.i
>>>>> +  %cmp14.i = icmp sgt i64 %c.j, %d.j  br i1 %cmp14.i, label
>>>>> + %if.end, label %cond.false12.i
>>>>> +
>>>>> +cond.false12.i:                           ; preds = %cond.false11.i
>>>>> +  %e.arrayidx = getelementptr inbounds i64, i64* %e, i64 %div
>>>>> +  %e.j = load i64, i64* %e.arrayidx
>>>>> +  %f.arrayidx = getelementptr inbounds i64, i64* %f, i64 %div
>>>>> +  %f.j = load i64, i64* %f.arrayidx
>>>>> +  %cmp19.i = icmp sgt i64 %e.j, %f.j
>>>>> +  br i1 %cmp19.i, label %if.end, label %for.end.loopexit
>>>>> +
>>>>> +if.end:                                           ; preds =
>>>>> %cond.false12.i, %cond.false11.i, %cond.false.i
>>>>> +  %cmp = icmp ugt i64 %j, 3
>>>>> +  br i1 %cmp, label %for.body, label %for.end.loopexit
>>>>> +
>>>>> +for.end.loopexit:                                 ; preds =
>>>>> %cond.false12.i, %cond.false6.i, %for.body, %if.end
>>>>> +  %j.0.lcssa.ph = phi i64 [ %j, %cond.false12.i ], [ %j,
>>>>> +%cond.false6.i ], [ %j, %for.body ], [ %div, %if.end ]
>>>>> +  br label %for.end
>>>>> +
>>>>> +for.end:                                          ; preds =
>>>>> %for.end.loopexit, %entry
>>>>> +  %j.0.lcssa = phi i64 [ %n, %entry ], [ %j.0.lcssa.ph,
>>>>> +%for.end.loopexit ]
>>>>> +  %j.2 = add i64 %j.0.lcssa, %n
>>>>> +  %j.3 = mul i64 %j.2, %n
>>>>> +  %j.4 = add i64 %j.3, 10
>>>>> +  ret i64 %j.4
>>>>> +}
>>>>>
>>>>> Modified: llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/arm
>>>>> - an d-tst-peephole.ll?rev=271925&r1=271924&r2=271925&view=diff
>>>>> ===================================================================
>>>>> =
>>>>> ==
>>>>> ========
>>>>> --- llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll (original)
>>>>> +++ llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll Mon Jun  6
>>>>> +++ 13:36:07 2016
>>>>> @@ -49,7 +49,7 @@ tailrecurse.switch:
>>>>>      ; V8-NEXT: beq
>>>>>      ; V8-NEXT: %tailrecurse.switch
>>>>>      ; V8: cmp
>>>>> -; V8-NEXT: bne
>>>>> +; V8-NEXT: beq
>>>>>      ; V8-NEXT: b
>>>>>      ; The trailing space in the last line checks that the branch is
>>>>> unconditional
>>>>>        switch i32 %and, label %sw.epilog [
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>
>>
>>
>
>



More information about the llvm-commits mailing list