[llvm] r271925 - [MBP] Reduce code size by running tail merging in MBP.

Mikael Holmén via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 14 09:56:23 PDT 2016


Hi,

Karl-Johan got it right.

TBB is BB#3, FBB is nullptr, and Cond is the condition set on the 
brr_cond instruction ending BB#2.

And this makes updateTerminator pass everything until

   // The block has a fallthrough conditional branch.
   if (isLayoutSuccessor(TBB)) {
     if (TII->ReverseBranchCondition(Cond)) {
       // We can't reverse the condition, add an unconditional branch.
       Cond.clear();
       TII->InsertBranch(*this, FallthroughBB, nullptr, Cond, DL);
       return;
     }
     TII->RemoveBranch(*this);
     TII->InsertBranch(*this, FallthroughBB, nullptr, Cond, DL);
   } else if (!isLayoutSuccessor(FallthroughBB)) {
     TII->RemoveBranch(*this);
     TII->InsertBranch(*this, TBB, FallthroughBB, Cond, DL);
   }

where we enter the else branch (FallthroughBB is nullptr), remove the 
conditiopnal branch, but then insert a new (identical) conditional branch.

Regards,
Mikael

On 06/14/2016 06:08 PM, Karl-Johan Karlsson wrote:
> Hi!
>
> Mikael went home from work before your mail arrived. He will be back
> tomorrow. I hope I have recreated the problem as Mikael described it
> below. From my debugger session I printed the following information
> about TBB, FBB and Cond before updateTerminator() is called on BB#2.
>
> (gdb) p PrevBB->dump()
> $39 = void
> (gdb)
> BB#2: derived from LLVM BB %.split.us.split.us
>      Live Ins: %a0_32 %af1 %CCReg %af1
>      Predecessors according to CFG: BB#1
>      brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>      Successors according to CFG: BB#3(0x80000000 / 0x80000000 = 100.00%)
> (gdb) P TBB
> $40 = (llvm::MachineBasicBlock *) 0x28e02d8
> (gdb) P TBB->dump()
> $41 = void
> (gdb)
> BB#3: derived from LLVM BB %bb1.us.us.us
>      Live Ins: %a0_32 %af1
>      Predecessors according to CFG: BB#2 BB#10
>      %a1_40<def,dead> = negh_a16_a32 %a0l, pred:0, pred:%noreg, pred:0,
> %CCReg<imp-def>, %cuc<imp-use>, %af1<imp-def>
>      brr_cond <BB#5>, pred:6, pred:%af1, pred:0
>      Successors according to CFG: BB#5(0x30000000 / 0x80000000 = 37.50%)
> BB#4(0x50000000 / 0x80000000 = 62.50%)
> (gdb) P FBB
> $42 = (llvm::MachineBasicBlock *) 0x0
> (gdb)
> (gdb) P Cond.empty()
> $44 = false
> (gdb)
>
> If I have got any information wrong Mikael will correct me with a mail
> tomorrow.
>
> Regards
> / Karl-Johan Karlsson
>
>
> On 2016-06-14 17:07, Haicheng Wu wrote:
>> Hi Mikael,
>>
>>  From your dumped information, I think the problem is either in
>> AnalyzeBranch() or updateTerminator().  What is the TBB, FBB, and Cond
>> returned by AnalyzeBranch() when it is called on BB#2 before calling
>> updateTerminator()?
>>
>> Best,
>>
>> Haicheng
>>
>> -----Original Message-----
>> From: Mikael Holmén [mailto:mikael.holmen at ericsson.com]
>> Sent: Tuesday, June 14, 2016 7:54 AM
>> To: Haicheng Wu <haicheng at codeaurora.org>
>> Cc: llvm-commits <llvm-commits at lists.llvm.org>; Karl-Johan Karlsson
>> <karl-johan.karlsson at ericsson.com>
>> Subject: Re: [llvm] r271925 - [MBP] Reduce code size by running tail
>> merging in MBP.
>>
>> Hi,
>>
>> We're seeing a problem on our out-of-tree backend with this patch
>> applied. I think the problem is either in
>> MachineBasicBlock::updateTerminator() or how it's used but I'm not
>> sure who's responsibility it is to fix the situation.
>>
>> I've tried to reproduce it on in-tree backends but I've failed on X86
>> and Arm :/
>>
>> Anyway, the fault we see is that after MachineBlockPlacement we get
>> two BBs like this:
>>
>> BB#2: derived from LLVM BB %.split.us.split.us
>>       Live Ins: %a0_32 %af1 %CCReg %af1
>>       Predecessors according to CFG: BB#1
>>           brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>       Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>> 100.00%)
>>
>> BB#6: derived from LLVM BB %.split
>>       Live Ins: %CCReg %af1
>>       Predecessors according to CFG: BB#0
>>           brr_cond <BB#13>, pred:3, pred:%af1, pred:0
>>       Successors according to CFG: BB#7(0x40000000 / 0x80000000 = 50.00%)
>> BB#13(0x40000000 / 0x80000000 = 50.00%)
>>
>> So, looking at the instructions of BB#2 it looks like it does
>> fallthrough to BB#6 (only one conditional branch to BB#3) but the CFG
>> says only BB#3 is successor.
>>
>> And the verifier complains about this:
>>
>> *** Bad machine code: MBB exits via conditional branch/fall-through
>> but only has one CFG successor! ***
>> - function:    f2
>> - basic block: BB#2 .split.us.split.us (0x492c1f8) LLVM ERROR: Found 1
>> machine code errors.
>>
>> At some point during MachineBlockPlacement we have
>>
>> BB#2: derived from LLVM BB %.split.us.split.us
>>       Live Ins: %a0_32 %af1 %CCReg %af1
>>       Predecessors according to CFG: BB#1
>>     brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>       Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>> 100.00%)
>>
>> BB#3: derived from LLVM BB %bb1.us.us.us
>>       Live Ins: %a0_32 %af1
>>       Predecessors according to CFG: BB#2 BB#10
>>     %a1_40<def,dead> = negh_a16_a32 %a0l, pred:0, pred:%noreg, pred:0,
>> %CCReg<imp-def>, %cuc<imp-use>, %af1<imp-def>
>>     brr_cond <BB#5>, pred:6, pred:%af1, pred:0
>>       Successors according to CFG: BB#5(0x30000000 / 0x80000000 = 37.50%)
>> BB#4(0x50000000 / 0x80000000 = 62.50%)
>>
>> Note that BB#2 both does a conditional jump AND falltrhoughs to BB#3.
>>
>> Then it's decided by MachineBlockPlacement::buildChain that BB#6 is a
>> better layout successor to BB#2 than BB#3, and we get:
>>
>> BB#2: derived from LLVM BB %.split.us.split.us
>>       Live Ins: %a0_32 %af1 %CCReg %af1
>>       Predecessors according to CFG: BB#1
>>     brr_cond <BB#3>, pred:2, pred:%af1, pred:0
>>       Successors according to CFG: BB#3(0x80000000 / 0x80000000 =
>> 100.00%)
>>
>> BB#6: derived from LLVM BB %.split
>>       Live Ins: %CCReg %af1
>>       Predecessors according to CFG: BB#0
>>     brr_cond <BB#13>, pred:3, pred:%af1, pred:0
>>       Successors according to CFG: BB#7(0x40000000 / 0x80000000 = 50.00%)
>> BB#13(0x40000000 / 0x80000000 = 50.00%)
>>
>> Then in MachineBlockPlacement::buildCFGChains we have
>>
>>       if (!TII->AnalyzeBranch(*PrevBB, TBB, FBB, Cond)) {
>>         PrevBB->updateTerminator();
>>
>> which is called on BB#2, and from what I understand the point is that
>> updateTerminator should insert an unconditional branch from BB#2 to
>> BB#3 now that there is no fallthrough anymore, but it doesn't and the
>> code is left broken.
>>
>> What's your thought's on this Haicheng, who should fix BB#2 in this case?
>>
>> Thanks,
>> Mikael
>>
>>
>> On 06/06/2016 08:36 PM, Haicheng Wu via llvm-commits wrote:
>>> Author: haicheng
>>> Date: Mon Jun  6 13:36:07 2016
>>> New Revision: 271925
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=271925&view=rev
>>> Log:
>>> [MBP] Reduce code size by running tail merging in MBP.
>>>
>>> The code layout that TailMerging (inside BranchFolding) works on is
>>> not the final layout optimized based on the branch probability.
>>> Generally, after BlockPlacement, many new merging opportunities emerge.
>>>
>>> This patch calls Tail Merging after MBP and calls MBP again if Tail
>>> Merging merges anything.
>>>
>>> Differential Revision: http://reviews.llvm.org/D20276
>>>
>>> Added:
>>>       llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll
>>> Modified:
>>>       llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>>       llvm/trunk/lib/CodeGen/BranchFolding.h
>>>       llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp
>>>       llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll
>>>
>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFoldi
>>> ng.cpp?rev=271925&r1=271924&r2=271925&view=diff
>>> ======================================================================
>>> ========
>>> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Mon Jun  6 13:36:07 2016
>>> @@ -27,6 +27,7 @@
>>>    #include "llvm/CodeGen/MachineFunctionPass.h"
>>>    #include "llvm/CodeGen/MachineJumpTableInfo.h"
>>>    #include "llvm/CodeGen/MachineMemOperand.h"
>>> +#include "llvm/CodeGen/MachineLoopInfo.h"
>>>    #include "llvm/CodeGen/MachineModuleInfo.h"
>>>    #include "llvm/CodeGen/MachineRegisterInfo.h"
>>>    #include "llvm/CodeGen/Passes.h"
>>> @@ -137,6 +138,8 @@ void BranchFolder::RemoveDeadBlock(Machi
>>>      // Remove the block.
>>>      MF->erase(MBB);
>>>      FuncletMembership.erase(MBB);
>>> +  if (MLI)
>>> +    MLI->removeBlock(MBB);
>>>    }
>>>
>>>    /// OptimizeImpDefsBlock - If a basic block is just a bunch of
>>> implicit_def @@ -193,18 +196,22 @@ bool
>>> BranchFolder::OptimizeImpDefsBlock(
>>>    }
>>>
>>>    /// OptimizeFunction - Perhaps branch folding, tail merging and
>>> other -/// CFG optimizations on the given function.
>>> +/// CFG optimizations on the given function.  Block placement changes
>>> +the layout /// and may create new tail merging opportunities.
>>>    bool BranchFolder::OptimizeFunction(MachineFunction &MF,
>>>                                        const TargetInstrInfo *tii,
>>>                                        const TargetRegisterInfo *tri,
>>> -                                    MachineModuleInfo *mmi) {
>>> +                                    MachineModuleInfo *mmi,
>>> +                                    MachineLoopInfo *mli, bool
>>> + AfterPlacement) {
>>>      if (!tii) return false;
>>>
>>>      TriedMerging.clear();
>>>
>>> +  AfterBlockPlacement = AfterPlacement;
>>>      TII = tii;
>>>      TRI = tri;
>>>      MMI = mmi;
>>> +  MLI = mli;
>>>      RS = nullptr;
>>>
>>>      // Use a RegScavenger to help update liveness when required.
>>> @@ -230,7 +237,10 @@ bool BranchFolder::OptimizeFunction(Mach
>>>      bool MadeChangeThisIteration = true;
>>>      while (MadeChangeThisIteration) {
>>>        MadeChangeThisIteration    = TailMergeBlocks(MF);
>>> -    MadeChangeThisIteration   |= OptimizeBranches(MF);
>>> +    // No need to clean up if tail merging does not change anything
>>> after the
>>> +    // block placement.
>>> +    if (!AfterBlockPlacement || MadeChangeThisIteration)
>>> +      MadeChangeThisIteration |= OptimizeBranches(MF);
>>>        if (EnableHoistCommonCode)
>>>          MadeChangeThisIteration |= HoistCommonCode(MF);
>>>        MadeChange |= MadeChangeThisIteration; @@ -447,6 +457,11 @@
>>> MachineBasicBlock *BranchFolder::SplitMB
>>>      // Splice the code over.
>>>      NewMBB->splice(NewMBB->end(), &CurMBB, BBI1, CurMBB.end());
>>>
>>> +  // NewMBB belongs to the same loop as CurMBB.
>>> +  if (MLI)
>>> +    if (MachineLoop *ML = MLI->getLoopFor(&CurMBB))
>>> +      ML->addBasicBlockToLoop(NewMBB, MLI->getBase());
>>> +
>>>      // NewMBB inherits CurMBB's block frequency.
>>>      MBBFreqInfo.setBlockFreq(NewMBB,
>>> MBBFreqInfo.getBlockFreq(&CurMBB));
>>>
>>> @@ -934,23 +949,27 @@ bool BranchFolder::TailMergeBlocks(Machi
>>>      if (!EnableTailMerge) return MadeChange;
>>>
>>>      // First find blocks with no successors.
>>> -  MergePotentials.clear();
>>> -  for (MachineBasicBlock &MBB : MF) {
>>> +  // Block placement does not create new tail merging opportunities
>>> + for these  // blocks.
>>> +  if (!AfterBlockPlacement) {
>>> +    MergePotentials.clear();
>>> +    for (MachineBasicBlock &MBB : MF) {
>>> +      if (MergePotentials.size() == TailMergeThreshold)
>>> +        break;
>>> +      if (!TriedMerging.count(&MBB) && MBB.succ_empty())
>>> +
>>> MergePotentials.push_back(MergePotentialsElt(HashEndOfMBB(MBB), &MBB));
>>> +    }
>>> +
>>> +    // If this is a large problem, avoid visiting the same basic blocks
>>> +    // multiple times.
>>>        if (MergePotentials.size() == TailMergeThreshold)
>>> -      break;
>>> -    if (!TriedMerging.count(&MBB) && MBB.succ_empty())
>>> -
>>> MergePotentials.push_back(MergePotentialsElt(HashEndOfMBB(MBB), &MBB));
>>> -  }
>>> +      for (unsigned i = 0, e = MergePotentials.size(); i != e; ++i)
>>> +        TriedMerging.insert(MergePotentials[i].getBlock());
>>>
>>> -  // If this is a large problem, avoid visiting the same basic blocks
>>> -  // multiple times.
>>> -  if (MergePotentials.size() == TailMergeThreshold)
>>> -    for (unsigned i = 0, e = MergePotentials.size(); i != e; ++i)
>>> -      TriedMerging.insert(MergePotentials[i].getBlock());
>>> -
>>> -  // See if we can do any tail merging on those.
>>> -  if (MergePotentials.size() >= 2)
>>> -    MadeChange |= TryTailMergeBlocks(nullptr, nullptr);
>>> +    // See if we can do any tail merging on those.
>>> +    if (MergePotentials.size() >= 2)
>>> +      MadeChange |= TryTailMergeBlocks(nullptr, nullptr);  }
>>>
>>>      // Look at blocks (IBB) with multiple predecessors (PBB).
>>>      // We change each predecessor to a canonical form, by @@ -997,6
>>> +1016,17 @@ bool BranchFolder::TailMergeBlocks(Machi
>>>          if (PBB->hasEHPadSuccessor())
>>>            continue;
>>>
>>> +      // Bail out if the loop header (IBB) is not the top of the
>>> loop chain
>>> +      // after the block placement.  Otherwise, the common tail of
>>> IBB's
>>> +      // predecessors may become the loop top if block placement is
>>> called again
>>> +      // and the predecessors may branch to this common tail.
>>> +      // FIXME: Relaxed this check if the algorithm of finding loop
>>> top is
>>> +      // changed in MBP.
>>> +      if (AfterBlockPlacement && MLI)
>>> +        if (MachineLoop *ML = MLI->getLoopFor(IBB))
>>> +          if (IBB == ML->getHeader() && ML == MLI->getLoopFor(PBB))
>>> +            continue;
>>> +
>>>          MachineBasicBlock *TBB = nullptr, *FBB = nullptr;
>>>          SmallVector<MachineOperand, 4> Cond;
>>>          if (!TII->AnalyzeBranch(*PBB, TBB, FBB, Cond, true)) {
>>>
>>> Modified: llvm/trunk/lib/CodeGen/BranchFolding.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFoldi
>>> ng.h?rev=271925&r1=271924&r2=271925&view=diff
>>> ======================================================================
>>> ========
>>> --- llvm/trunk/lib/CodeGen/BranchFolding.h (original)
>>> +++ llvm/trunk/lib/CodeGen/BranchFolding.h Mon Jun  6 13:36:07 2016
>>> @@ -20,6 +20,7 @@ namespace llvm {
>>>      class MachineBranchProbabilityInfo;
>>>      class MachineFunction;
>>>      class MachineModuleInfo;
>>> +  class MachineLoopInfo;
>>>      class RegScavenger;
>>>      class TargetInstrInfo;
>>>      class TargetRegisterInfo;
>>> @@ -32,10 +33,11 @@ namespace llvm {
>>>                              MBFIWrapper &MBFI,
>>>                              const MachineBranchProbabilityInfo &MBPI);
>>>
>>> -    bool OptimizeFunction(MachineFunction &MF,
>>> -                          const TargetInstrInfo *tii,
>>> -                          const TargetRegisterInfo *tri,
>>> -                          MachineModuleInfo *mmi);
>>> +    bool OptimizeFunction(MachineFunction &MF, const TargetInstrInfo
>>> *tii,
>>> +                          const TargetRegisterInfo *tri,
>>> MachineModuleInfo *mmi,
>>> +                          MachineLoopInfo *mli = nullptr,
>>> +                          bool AfterPlacement = false);
>>> +
>>>      private:
>>>        class MergePotentialsElt {
>>>          unsigned Hash;
>>> @@ -93,11 +95,13 @@ namespace llvm {
>>>        };
>>>        std::vector<SameTailElt> SameTails;
>>>
>>> +    bool AfterBlockPlacement;
>>>        bool EnableTailMerge;
>>>        bool EnableHoistCommonCode;
>>>        const TargetInstrInfo *TII;
>>>        const TargetRegisterInfo *TRI;
>>>        MachineModuleInfo *MMI;
>>> +    MachineLoopInfo *MLI;
>>>        RegScavenger *RS;
>>>
>>>      public:
>>>
>>> Modified: llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineBloc
>>> kPlacement.cpp?rev=271925&r1=271924&r2=271925&view=diff
>>> ======================================================================
>>> ========
>>> --- llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp Mon Jun  6
>>> +++ 13:36:07 2016
>>> @@ -26,6 +26,8 @@
>>>
>>> //===-----------------------------------------------------------------
>>> -----===//
>>>
>>>    #include "llvm/CodeGen/Passes.h"
>>> +#include "llvm/CodeGen/TargetPassConfig.h"
>>> +#include "BranchFolding.h"
>>>    #include "llvm/ADT/DenseMap.h"
>>>    #include "llvm/ADT/SmallPtrSet.h"
>>>    #include "llvm/ADT/SmallVector.h"
>>> @@ -116,6 +118,12 @@ static cl::opt<unsigned> JumpInstCost("j
>>>                                          cl::desc("Cost of jump
>>> instructions."),
>>>                                          cl::init(1), cl::Hidden);
>>>
>>> +static cl::opt<bool>
>>> +BranchFoldPlacement("branch-fold-placement",
>>> +              cl::desc("Perform branch folding during placement. "
>>> +                       "Reduces code size."),
>>> +              cl::init(true), cl::Hidden);
>>> +
>>>    extern cl::opt<unsigned> StaticLikelyProb;
>>>
>>>    namespace {
>>> @@ -232,10 +240,10 @@ class MachineBlockPlacement : public Mac
>>>      const MachineBranchProbabilityInfo *MBPI;
>>>
>>>      /// \brief A handle to the function-wide block frequency pass.
>>> -  const MachineBlockFrequencyInfo *MBFI;
>>> +  std::unique_ptr<BranchFolder::MBFIWrapper> MBFI;
>>>
>>>      /// \brief A handle to the loop info.
>>> -  const MachineLoopInfo *MLI;
>>> +  MachineLoopInfo *MLI;
>>>
>>>      /// \brief A handle to the target's instruction info.
>>>      const TargetInstrInfo *TII;
>>> @@ -323,6 +331,7 @@ public:
>>>        AU.addRequired<MachineBlockFrequencyInfo>();
>>>        AU.addRequired<MachineDominatorTree>();
>>>        AU.addRequired<MachineLoopInfo>();
>>> +    AU.addRequired<TargetPassConfig>();
>>>        MachineFunctionPass::getAnalysisUsage(AU);
>>>      }
>>>    };
>>> @@ -1462,7 +1471,8 @@ bool MachineBlockPlacement::runOnMachine
>>>        return false;
>>>
>>>      MBPI = &getAnalysis<MachineBranchProbabilityInfo>();
>>> -  MBFI = &getAnalysis<MachineBlockFrequencyInfo>();
>>> +  MBFI = llvm::make_unique<BranchFolder::MBFIWrapper>(
>>> +      getAnalysis<MachineBlockFrequencyInfo>());
>>>      MLI = &getAnalysis<MachineLoopInfo>();
>>>      TII = F.getSubtarget().getInstrInfo();
>>>      TLI = F.getSubtarget().getTargetLowering();
>>> @@ -1470,6 +1480,29 @@ bool MachineBlockPlacement::runOnMachine
>>>      assert(BlockToChain.empty());
>>>
>>>      buildCFGChains(F);
>>> +
>>> +  // Changing the layout can create new tail merging opportunities.
>>> +  TargetPassConfig *PassConfig = &getAnalysis<TargetPassConfig>();
>>> + // TailMerge can create jump into if branches that make CFG
>>> + irreducible for  // HW that requires structurized CFG.
>>> +  bool EnableTailMerge = !F.getTarget().requiresStructuredCFG() &&
>>> +                         PassConfig->getEnableTailMerge() &&
>>> +                         BranchFoldPlacement;  // No tail merging
>>> + opportunities if the block number is less than four.
>>> +  if (F.size() > 3 && EnableTailMerge) {
>>> +    BranchFolder BF(/*EnableTailMerge=*/true, /*CommonHoist=*/false,
>>> *MBFI,
>>> +                    *MBPI);
>>> +
>>> +    if (BF.OptimizeFunction(F, TII, F.getSubtarget().getRegisterInfo(),
>>> +
>>> getAnalysisIfAvailable<MachineModuleInfo>(), MLI,
>>> +                            /*AfterBlockPlacement=*/true)) {
>>> +      // Redo the layout if tail merging creates/removes/moves blocks.
>>> +      BlockToChain.clear();
>>> +      ChainAllocator.DestroyAll();
>>> +      buildCFGChains(F);
>>> +    }
>>> +  }
>>> +
>>>      optimizeBranches(F);
>>>      alignBlocks(F);
>>>
>>>
>>> Added: llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/ta
>>> ilmerging_in_mbp.ll?rev=271925&view=auto
>>> ======================================================================
>>> ========
>>> --- llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll (added)
>>> +++ llvm/trunk/test/CodeGen/AArch64/tailmerging_in_mbp.ll Mon Jun  6
>>> +++ 13:36:07 2016
>>> @@ -0,0 +1,63 @@
>>> +; RUN: llc <%s -march=aarch64 | FileCheck %s
>>> +
>>> +; CHECK-LABEL: test:
>>> +; CHECK:       .LBB0_7
>>> +; CHECK:         b.hi    .LBB0_2
>>> +; CHECK-NEXT:    b    .LBB0_9
>>> +; CHECK-NEXT:  .LBB0_8
>>> +; CHECK-NEXT:    mov     x8, x9
>>> +; CHECK-NEXT:  .LBB0_9
>>> +define i64 @test(i64 %n, i64* %a, i64* %b, i64* %c, i64* %d, i64* %e,
>>> +i64* %f) {
>>> +entry:
>>> +  %cmp28 = icmp sgt i64 %n, 1
>>> +  br i1 %cmp28, label %for.body, label %for.end
>>> +
>>> +for.body:                                         ; preds =
>>> %for.body.lr.ph, %if.end
>>> +  %j = phi i64 [ %n, %entry ], [ %div, %if.end ]
>>> +  %div = lshr i64 %j, 1
>>> +  %a.arrayidx = getelementptr inbounds i64, i64* %a, i64 %div
>>> +  %a.j = load i64, i64* %a.arrayidx
>>> +  %b.arrayidx = getelementptr inbounds i64, i64* %b, i64 %div
>>> +  %b.j = load i64, i64* %b.arrayidx
>>> +  %cmp.i = icmp slt i64 %a.j, %b.j
>>> +  br i1 %cmp.i, label %for.end.loopexit, label %cond.false.i
>>> +
>>> +cond.false.i:                                     ; preds = %for.body
>>> +  %cmp4.i = icmp sgt i64 %a.j, %b.j
>>> +  br i1 %cmp4.i, label %if.end, label %cond.false6.i
>>> +
>>> +cond.false6.i:                                    ; preds =
>>> %cond.false.i
>>> +  %c.arrayidx = getelementptr inbounds i64, i64* %c, i64 %div
>>> +  %c.j = load i64, i64* %c.arrayidx
>>> +  %d.arrayidx = getelementptr inbounds i64, i64* %d, i64 %div
>>> +  %d.j = load i64, i64* %d.arrayidx
>>> +  %cmp9.i = icmp slt i64 %c.j, %d.j
>>> +  br i1 %cmp9.i, label %for.end.loopexit, label %cond.false11.i
>>> +
>>> +cond.false11.i:                                   ; preds =
>>> %cond.false6.i
>>> +  %cmp14.i = icmp sgt i64 %c.j, %d.j
>>> +  br i1 %cmp14.i, label %if.end, label %cond.false12.i
>>> +
>>> +cond.false12.i:                           ; preds = %cond.false11.i
>>> +  %e.arrayidx = getelementptr inbounds i64, i64* %e, i64 %div
>>> +  %e.j = load i64, i64* %e.arrayidx
>>> +  %f.arrayidx = getelementptr inbounds i64, i64* %f, i64 %div
>>> +  %f.j = load i64, i64* %f.arrayidx
>>> +  %cmp19.i = icmp sgt i64 %e.j, %f.j
>>> +  br i1 %cmp19.i, label %if.end, label %for.end.loopexit
>>> +
>>> +if.end:                                           ; preds =
>>> %cond.false12.i, %cond.false11.i, %cond.false.i
>>> +  %cmp = icmp ugt i64 %j, 3
>>> +  br i1 %cmp, label %for.body, label %for.end.loopexit
>>> +
>>> +for.end.loopexit:                                 ; preds =
>>> %cond.false12.i, %cond.false6.i, %for.body, %if.end
>>> +  %j.0.lcssa.ph = phi i64 [ %j, %cond.false12.i ], [ %j,
>>> +%cond.false6.i ], [ %j, %for.body ], [ %div, %if.end ]
>>> +  br label %for.end
>>> +
>>> +for.end:                                          ; preds =
>>> %for.end.loopexit, %entry
>>> +  %j.0.lcssa = phi i64 [ %n, %entry ], [ %j.0.lcssa.ph,
>>> +%for.end.loopexit ]
>>> +  %j.2 = add i64 %j.0.lcssa, %n
>>> +  %j.3 = mul i64 %j.2, %n
>>> +  %j.4 = add i64 %j.3, 10
>>> +  ret i64 %j.4
>>> +}
>>>
>>> Modified: llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/arm-an
>>> d-tst-peephole.ll?rev=271925&r1=271924&r2=271925&view=diff
>>> ======================================================================
>>> ========
>>> --- llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll (original)
>>> +++ llvm/trunk/test/CodeGen/ARM/arm-and-tst-peephole.ll Mon Jun  6
>>> +++ 13:36:07 2016
>>> @@ -49,7 +49,7 @@ tailrecurse.switch:
>>>    ; V8-NEXT: beq
>>>    ; V8-NEXT: %tailrecurse.switch
>>>    ; V8: cmp
>>> -; V8-NEXT: bne
>>> +; V8-NEXT: beq
>>>    ; V8-NEXT: b
>>>    ; The trailing space in the last line checks that the branch is
>>> unconditional
>>>      switch i32 %and, label %sw.epilog [
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>



More information about the llvm-commits mailing list