[PATCH] D13791: In machine block placement, use 60% instead of 80% as the hot branch probability threshold when profile data is available.
Cong Hou via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 15 15:50:11 PDT 2015
congh created this revision.
congh added reviewers: djasper, davidxl.
congh added a subscriber: llvm-commits.
Currently the machine block placement pass in LLVM uses 80% as the hot branch probability threshold. However, I found that when profile data is available, we can lower this threshold to get better performance. I changed the threshold to 50%, 60%, and 70% and measured frontend stalls on the SPEC2000 benchmarks for each setting. The average number of stalls generally decreases as the threshold is lowered. Below are my experimental results:
Hot branch prob threshold | Change in # of frontend stalls relative to the 80% threshold
50% | +0.6%
60% | -2.2%
70% | -1.8%
Here we can see that 60% gives the fewest frontend stalls for SPEC2000. Hence this patch uses 60% as the threshold when profile data is available.
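To make the arithmetic concrete, here is a minimal standalone sketch of the threshold selection and the cold-to-hot ratio that the patch derives from it. It does not use LLVM: `Prob`, `hotThreshold`, `coldToHotRatio`, and `isHot` are hypothetical names made up for illustration, and the "at least the threshold" comparison in `isHot` is an assumption about how the threshold is applied, not code taken from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::BranchProbability: a plain fraction.
struct Prob {
    uint32_t num;
    uint32_t den;
    // 1 - p, kept over the same denominator.
    Prob complement() const { return {den - num, den}; }
};

// Mirrors the selection in the patch: 60% with profile data, otherwise 80%.
Prob hotThreshold(bool hasProfileData) {
    return hasProfileData ? Prob{3, 5} : Prob{4, 5};
}

// Ratio of the cold side to the hot side, (1 - p) / p.
// Only meaningful when p > 50%, which the patch also asserts.
Prob coldToHotRatio(Prob hot) {
    assert(2 * hot.num > hot.den && "threshold must exceed 50%");
    return {hot.complement().num, hot.num};
}

// Assumption: a successor counts as hot when its probability is at least
// the threshold. Cross-multiplication avoids floating-point comparison.
bool isHot(Prob succ, Prob hot) {
    return static_cast<uint64_t>(succ.num) * hot.den >=
           static_cast<uint64_t>(hot.num) * succ.den;
}
```

Under these assumptions, a 70% successor is hot with profile data (threshold 3/5) but not without it (threshold 4/5). The cold-to-hot ratio is 2/3 for the 60% threshold and 1/4 for the 80% threshold.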
http://reviews.llvm.org/D13791
Files:
lib/CodeGen/MachineBlockPlacement.cpp
test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
Index: test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
===================================================================
--- test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
+++ test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
@@ -19,8 +19,8 @@
}
; CHECK-LABEL: test_and
-; CHECK: cbz w0, {{LBB[0-9]+_2}}
-; CHECK: cbnz w1, {{LBB[0-9]+_3}}
+; CHECK: cbnz w0, {{LBB[0-9]+_2}}
+; CHECK: cbz w1, {{LBB[0-9]+_1}}
define i64 @test_and(i32 %a, i32 %b) {
bb1:
%0 = icmp ne i32 %a, 0
Index: lib/CodeGen/MachineBlockPlacement.cpp
===================================================================
--- lib/CodeGen/MachineBlockPlacement.cpp
+++ lib/CodeGen/MachineBlockPlacement.cpp
@@ -225,6 +225,9 @@
/// between basic blocks.
DenseMap<MachineBasicBlock *, BlockChain *> BlockToChain;
+ /// \brief A flag indicating if the function being processed has profile data.
+ bool HasProfileData = false;
+
void markChainSuccessors(BlockChain &Chain, MachineBasicBlock *LoopHeaderBB,
SmallVectorImpl<MachineBasicBlock *> &BlockWorkList,
const BlockFilterSet *BlockFilter = nullptr);
@@ -351,7 +354,9 @@
MachineBlockPlacement::selectBestSuccessor(MachineBasicBlock *BB,
BlockChain &Chain,
const BlockFilterSet *BlockFilter) {
- const BranchProbability HotProb(4, 5); // 80%
+ // Use 60% if profile data is available, otherwise 80%.
+ const BranchProbability HotProb =
+ HasProfileData ? BranchProbability(3, 5) : BranchProbability(4, 5);
MachineBasicBlock *BestSucc = nullptr;
// FIXME: Due to the performance of the probability and weight routines in
@@ -413,10 +418,14 @@
continue;
}
- // Make sure that a hot successor doesn't have a globally more
- // important predecessor.
+ // Make sure that a hot successor doesn't have a globally more important
+ // predecessor.
+ assert(HotProb.getDenominator() == HotProb.getCompl().getDenominator());
+ assert(HotProb > BranchProbability(1, 2));
+ const BranchProbability ColdToHotRatio(HotProb.getCompl().getNumerator(),
+ HotProb.getNumerator());
BlockFrequency CandidateEdgeFreq =
- MBFI->getBlockFreq(BB) * SuccProb * HotProb.getCompl();
+ MBFI->getBlockFreq(BB) * SuccProb * ColdToHotRatio;
bool BadCFGConflict = false;
for (MachineBasicBlock *Pred : Succ->predecessors()) {
if (Pred == Succ || (BlockFilter && !BlockFilter->count(Pred)) ||
@@ -1140,6 +1149,7 @@
TII = F.getSubtarget().getInstrInfo();
TLI = F.getSubtarget().getTargetLowering();
MDT = &getAnalysis<MachineDominatorTree>();
+ HasProfileData = static_cast<bool>(F.getFunction()->getEntryCount());
assert(BlockToChain.empty());
buildCFGChains(F);