[PATCH] D13791: In machine block placement, use 60% instead of 80% as the hot branch probability threshold when profile data is available.

Cong Hou via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 15 15:50:11 PDT 2015


congh created this revision.
congh added reviewers: djasper, davidxl.
congh added a subscriber: llvm-commits.

Currently the machine block placement pass in LLVM uses 80% as the hot branch probability threshold. However, I found that when profile data is available, lowering this threshold gives better performance. I changed the threshold to 50%, 60%, and 70% and measured frontend stalls on the SPEC2000 benchmarks for each setting. The average number of stalls generally decreases as the threshold is lowered, down to 60%. My experimental results are below:

Hot branch prob threshold  |  Change in # of frontend stalls relative to the 80% threshold
50%                        |  +0.6%
60%                        |  -2.2%
70%                        |  -1.8%

The 60% threshold gives the fewest frontend stalls on SPEC2000, so this patch uses 60% as the threshold when profile data is available and keeps 80% otherwise.
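As a rough illustration of the arithmetic involved, here is a minimal standalone sketch. It does not use LLVM's BranchProbability class; the `Prob` struct and the helper names `hotThreshold`, `complement`, and `coldToHotRatio` are hypothetical stand-ins invented for this example. It shows the threshold choice and the cold-to-hot ratio (1 - p) / p that the patch multiplies into the candidate edge frequency in place of the plain complement (1 - p).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a branch probability: a plain fraction Num/Den.
// This is NOT llvm::BranchProbability, only an illustration of the numbers.
struct Prob {
  uint32_t Num;
  uint32_t Den;
};

// Pick the hot-branch threshold: 3/5 (60%) when profile data is available,
// 4/5 (80%) otherwise, mirroring the choice made in the patch.
inline Prob hotThreshold(bool HasProfileData) {
  return HasProfileData ? Prob{3, 5} : Prob{4, 5};
}

// Complement 1 - p, expressed over the same denominator.
inline Prob complement(Prob P) { return Prob{P.Den - P.Num, P.Den}; }

// Ratio (1 - p) / p. It is a valid probability (at most 1) only when
// p > 1/2, which is why the sketch asserts that up front.
inline Prob coldToHotRatio(Prob Hot) {
  assert(2 * Hot.Num > Hot.Den && "hot threshold must exceed 50%");
  return Prob{complement(Hot).Num, Hot.Num};
}
```

Under these assumptions, the 60% threshold yields a ratio of 2/3 and the 80% threshold yields 1/4, whereas the plain complements would be 2/5 and 1/5 respectively.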

http://reviews.llvm.org/D13791

Files:
  lib/CodeGen/MachineBlockPlacement.cpp
  test/CodeGen/AArch64/fast-isel-branch-cond-split.ll

Index: test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
===================================================================
--- test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
+++ test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
@@ -19,8 +19,8 @@
 }
 
 ; CHECK-LABEL: test_and
-; CHECK:       cbz w0, {{LBB[0-9]+_2}}
-; CHECK:       cbnz w1, {{LBB[0-9]+_3}}
+; CHECK:       cbnz w0, {{LBB[0-9]+_2}}
+; CHECK:       cbz w1, {{LBB[0-9]+_1}}
 define i64 @test_and(i32 %a, i32 %b) {
 bb1:
   %0 = icmp ne i32 %a, 0
Index: lib/CodeGen/MachineBlockPlacement.cpp
===================================================================
--- lib/CodeGen/MachineBlockPlacement.cpp
+++ lib/CodeGen/MachineBlockPlacement.cpp
@@ -225,6 +225,9 @@
   /// between basic blocks.
   DenseMap<MachineBasicBlock *, BlockChain *> BlockToChain;
 
+  /// \brief A flag indicating if the function being processed has profile data.
+  bool HasProfileData = false;
+
   void markChainSuccessors(BlockChain &Chain, MachineBasicBlock *LoopHeaderBB,
                            SmallVectorImpl<MachineBasicBlock *> &BlockWorkList,
                            const BlockFilterSet *BlockFilter = nullptr);
@@ -351,7 +354,9 @@
 MachineBlockPlacement::selectBestSuccessor(MachineBasicBlock *BB,
                                            BlockChain &Chain,
                                            const BlockFilterSet *BlockFilter) {
-  const BranchProbability HotProb(4, 5); // 80%
+  // Use 60% if profile data is available, otherwise 80%.
+  const BranchProbability HotProb =
+      HasProfileData ? BranchProbability(3, 5) : BranchProbability(4, 5);
 
   MachineBasicBlock *BestSucc = nullptr;
   // FIXME: Due to the performance of the probability and weight routines in
@@ -413,10 +418,14 @@
         continue;
       }
 
-      // Make sure that a hot successor doesn't have a globally more
-      // important predecessor.
+      // Make sure that a hot successor doesn't have a globally more important
+      // predecessor.
+      assert(HotProb.getDenominator() == HotProb.getCompl().getDenominator());
+      assert(HotProb > BranchProbability(1, 2));
+      const BranchProbability ColdToHotRatio(HotProb.getCompl().getNumerator(),
+                                             HotProb.getNumerator());
       BlockFrequency CandidateEdgeFreq =
-          MBFI->getBlockFreq(BB) * SuccProb * HotProb.getCompl();
+          MBFI->getBlockFreq(BB) * SuccProb * ColdToHotRatio;
       bool BadCFGConflict = false;
       for (MachineBasicBlock *Pred : Succ->predecessors()) {
         if (Pred == Succ || (BlockFilter && !BlockFilter->count(Pred)) ||
@@ -1140,6 +1149,7 @@
   TII = F.getSubtarget().getInstrInfo();
   TLI = F.getSubtarget().getTargetLowering();
   MDT = &getAnalysis<MachineDominatorTree>();
+  HasProfileData = static_cast<bool>(F.getFunction()->getEntryCount());
   assert(BlockToChain.empty());
 
   buildCFGChains(F);

