<div dir="ltr"><a href="https://reviews.llvm.org/D22892">https://reviews.llvm.org/D22892</a><br><div><br></div><div>David</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 30, 2016 at 7:44 PM, Philip Reames via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Was there a phabricator review for this? I want to catch up on the discussion.<div class="HOEnZb"><div class="h5"><br>
<br>
On 07/29/2016 11:09 AM, Kyle Butt via llvm-commits wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Author: iteratee<br>
Date: Fri Jul 29 13:09:28 2016<br>
New Revision: 277187<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=277187&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject?rev=277187&view=rev</a><br>
Log:<br>
Codegen: MachineBlockPlacement Improve probability layout.<br>
<br>
The following pattern was being layed out poorly:<br>
<br>
A<br>
/ \<br>
B C<br>
/ \ / \<br>
D E ? (Doesn't matter)<br>
<br>
Where A->B is far more likely than A->C, and prob(B->D) = prob(B->E)<br>
<br>
The current algorithm gives:<br>
A,B,C,E (D goes on worklist)<br>
<br>
It does this even if C has a frequency count of 0. This patch<br>
adjusts the layout calculation so that if freq(B->E) >> freq(C->E)<br>
then we go ahead and layout E rather than C. Fallthrough half the time<br>
is better than fallthrough never, or fallthrough very rarely. The<br>
resulting layout is:<br>
<br>
A,B,E, (C and D are in a worklist)<br>
<br>
Modified:<br>
llvm/trunk/lib/CodeGen/Machin<wbr>eBlockPlacement.cpp<br>
llvm/trunk/test/CodeGen/X86/b<wbr>lock-placement.ll<br>
<br>
Modified: llvm/trunk/lib/CodeGen/Machine<wbr>BlockPlacement.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineBlockPlacement.cpp?rev=277187&r1=277186&r2=277187&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/lib/CodeGen/<wbr>MachineBlockPlacement.cpp?rev=<wbr>277187&r1=277186&r2=277187&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/CodeGen/Machine<wbr>BlockPlacement.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/Machine<wbr>BlockPlacement.cpp Fri Jul 29 13:09:28 2016<br>
@@ -631,18 +631,46 @@ bool MachineBlockPlacement::hasBett<wbr>erLay<br>
// BB->Succ. This is equivalent to looking the CFG backward with backward<br>
// edge: Prob(Succ->BB) needs to >= HotProb in order to be selected (without<br>
// profile data).<br>
-<br>
+ // ------------------------------<wbr>------------------------------<wbr>--------------<br>
+ // Case 3: forked diamond<br>
+ // S<br>
+ // / \<br>
+ // / \<br>
+ // BB Pred<br>
+ // | \ / |<br>
+ // | \ / |<br>
+ // | X |<br>
+ // | / \ |<br>
+ // | / \ |<br>
+ // S1 S2<br>
+ //<br>
+ // The current block is BB and edge BB->S1 is now being evaluated.<br>
+ // As above S->BB was already selected because<br>
+ // prob(S->BB) > prob(S->Pred). Assume that prob(BB->S1) >= prob(BB->S2).<br>
+ //<br>
+ // topo-order:<br>
+ //<br>
+ // S-------| ---S<br>
+ // | | | |<br>
+ // ---BB | | BB<br>
+ // | | | |<br>
+ // | Pred----| | S1----<br>
+ // | | | |<br>
+ // --(S1 or S2) ---Pred--<br>
+ //<br>
+ // topo-cost = freq(S->Pred) + freq(BB->S1) + freq(BB->S2)<br>
+ // + min(freq(Pred->S1), freq(Pred->S2))<br>
+ // Non-topo-order cost:<br>
+ // In the worst case, S2 will not get laid out after Pred.<br>
+ // non-topo-cost = 2 * freq(S->Pred) + freq(BB->S2).<br>
+ // To be conservative, we can assume that min(freq(Pred->S1), freq(Pred->S2))<br>
+ // is 0. Then the non topo layout is better when<br>
+ // freq(S->Pred) < freq(BB->S1).<br>
+ // This is exactly what is checked below.<br>
+ // Note there are other shapes that apply (Pred may not be a single block,<br>
+ // but they all fit this general pattern.)<br>
BranchProbability HotProb = getLayoutSuccessorProbThreshol<wbr>d(BB);<br>
- // Forward checking. For case 2, SuccProb will be 1.<br>
- if (SuccProb < HotProb) {<br>
- DEBUG(dbgs() << " Not a candidate: " << getBlockName(Succ) << " "<br>
- << "Respecting topological ordering because "<br>
- << "probability is less than prob treshold: "<br>
- << SuccProb << "\n");<br>
- return true;<br>
- }<br>
-<br>
// Make sure that a hot successor doesn't have a globally more<br>
// important predecessor.<br>
BlockFrequency CandidateEdgeFreq = MBFI->getBlockFreq(BB) * RealSuccProb;<br>
@@ -653,11 +681,11 @@ bool MachineBlockPlacement::hasBett<wbr>erLay<br>
(BlockFilter && !BlockFilter->count(Pred)) ||<br>
BlockToChain[Pred] == &Chain)<br>
continue;<br>
- // Do backward checking. For case 1, it is actually redundant check. For<br>
- // case 2 above, we need a backward checking to filter out edges that are<br>
- // not 'strongly' biased. With profile data available, the check is mostly<br>
- // redundant too (when threshold prob is set at 50%) unless S has more than<br>
- // two successors.<br>
+ // Do backward checking.<br>
+ // For all cases above, we need a backward checking to filter out edges that<br>
+ // are not 'strongly' biased. With profile data available, the check is<br>
+ // mostly redundant for case 2 (when threshold prob is set at 50%) unless S<br>
+ // has more than two successors.<br>
// BB Pred<br>
// \ /<br>
// Succ<br>
@@ -666,6 +694,8 @@ bool MachineBlockPlacement::hasBett<wbr>erLay<br>
// i.e. freq(BB->Succ) > freq(BB->Succ) * HotProb + freq(Pred->Succ) *<br>
// HotProb<br>
// i.e. freq((BB->Succ) * (1 - HotProb) > freq(Pred->Succ) * HotProb<br>
+ // Case 1 is covered too, because the first equation reduces to:<br>
+ // prob(BB->Succ) > HotProb. (freq(Succ) = freq(BB) for a triangle)<br>
BlockFrequency PredEdgeFreq =<br>
MBFI->getBlockFreq(Pred) * MBPI->getEdgeProbability(Pred, Succ);<br>
if (PredEdgeFreq * HotProb >= CandidateEdgeFreq * HotProb.getCompl()) {<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/bl<wbr>ock-placement.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/block-placement.ll?rev=277187&r1=277186&r2=277187&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/test/CodeGen/<wbr>X86/block-placement.ll?rev=<wbr>277187&r1=277186&r2=277187&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/CodeGen/X86/bl<wbr>ock-placement.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/bl<wbr>ock-placement.ll Fri Jul 29 13:09:28 2016<br>
@@ -1283,6 +1283,174 @@ exit:<br>
ret void<br>
}<br>
+declare void @a()<br>
+declare void @b()<br>
+<br>
+define void @test_forked_hot_diamond(i32* %a) {<br>
+; Test that a hot-branch with probability > 80% followed by a 50/50 branch<br>
+; will not place the cold predecessor if the probability for the fallthrough<br>
+; remains above 80%<br>
+; CHECK-LABEL: test_forked_hot_diamond<br>
+; CHECK: %entry<br>
+; CHECK: %then<br>
+; CHECK: %fork1<br>
+; CHECK: %else<br>
+; CHECK: %fork2<br>
+; CHECK: %exit<br>
+entry:<br>
+ %gep1 = getelementptr i32, i32* %a, i32 1<br>
+ %val1 = load i32, i32* %gep1<br>
+ %cond1 = icmp ugt i32 %val1, 1<br>
+ br i1 %cond1, label %then, label %else, !prof !5<br>
+<br>
+then:<br>
+ call void @hot_function()<br>
+ %gep2 = getelementptr i32, i32* %a, i32 2<br>
+ %val2 = load i32, i32* %gep2<br>
+ %cond2 = icmp ugt i32 %val2, 2<br>
+ br i1 %cond2, label %fork1, label %fork2, !prof !8<br>
+<br>
+else:<br>
+ call void @cold_function()<br>
+ %gep3 = getelementptr i32, i32* %a, i32 3<br>
+ %val3 = load i32, i32* %gep3<br>
+ %cond3 = icmp ugt i32 %val3, 3<br>
+ br i1 %cond3, label %fork1, label %fork2, !prof !8<br>
+<br>
+fork1:<br>
+ call void @a()<br>
+ br label %exit<br>
+<br>
+fork2:<br>
+ call void @b()<br>
+ br label %exit<br>
+<br>
+exit:<br>
+ call void @hot_function()<br>
+ ret void<br>
+}<br>
+<br>
+define void @test_forked_hot_diamond_gets_<wbr>cold(i32* %a) {<br>
+; Test that a hot-branch with probability > 80% followed by a 50/50 branch<br>
+; will place the cold predecessor if the probability for the fallthrough<br>
+; falls below 80%<br>
+; The probability for both branches is 85%. For then2 vs else1<br>
+; this results in a compounded probability of 83%.<br>
+; Neither then2->fork1 nor then2->fork2 has a large enough relative<br>
+; probability to break the CFG.<br>
+; Relative probs:<br>
+; then2 -> fork1 vs else1 -> fork1 = 71%<br>
+; then2 -> fork2 vs else2 -> fork2 = 74%<br>
+; CHECK-LABEL: test_forked_hot_diamond_gets_c<wbr>old<br>
+; CHECK: %entry<br>
+; CHECK: %then1<br>
+; CHECK: %then2<br>
+; CHECK: %else1<br>
+; CHECK: %fork1<br>
+; CHECK: %else2<br>
+; CHECK: %fork2<br>
+; CHECK: %exit<br>
+entry:<br>
+ %gep1 = getelementptr i32, i32* %a, i32 1<br>
+ %val1 = load i32, i32* %gep1<br>
+ %cond1 = icmp ugt i32 %val1, 1<br>
+ br i1 %cond1, label %then1, label %else1, !prof !9<br>
+<br>
+then1:<br>
+ call void @hot_function()<br>
+ %gep2 = getelementptr i32, i32* %a, i32 2<br>
+ %val2 = load i32, i32* %gep2<br>
+ %cond2 = icmp ugt i32 %val2, 2<br>
+ br i1 %cond2, label %then2, label %else2, !prof !9<br>
+<br>
+else1:<br>
+ call void @cold_function()<br>
+ br label %fork1<br>
+<br>
+then2:<br>
+ call void @hot_function()<br>
+ %gep3 = getelementptr i32, i32* %a, i32 3<br>
+ %val3 = load i32, i32* %gep2<br>
+ %cond3 = icmp ugt i32 %val2, 3<br>
+ br i1 %cond3, label %fork1, label %fork2, !prof !8<br>
+<br>
+else2:<br>
+ call void @cold_function()<br>
+ br label %fork2<br>
+<br>
+fork1:<br>
+ call void @a()<br>
+ br label %exit<br>
+<br>
+fork2:<br>
+ call void @b()<br>
+ br label %exit<br>
+<br>
+exit:<br>
+ call void @hot_function()<br>
+ ret void<br>
+}<br>
+<br>
+define void @test_forked_hot_diamond_stays<wbr>_hot(i32* %a) {<br>
+; Test that a hot-branch with probability > 88.88% (1:8) followed by a 50/50<br>
+; branch will not place the cold predecessor as the probability for the<br>
+; fallthrough stays above 80%<br>
+; (1:8) followed by (1:1) is still (1:4)<br>
+; Here we use 90% probability because two in a row<br>
+; have a 89 % probability vs the original branch.<br>
+; CHECK-LABEL: test_forked_hot_diamond_stays_<wbr>hot<br>
+; CHECK: %entry<br>
+; CHECK: %then1<br>
+; CHECK: %then2<br>
+; CHECK: %fork1<br>
+; CHECK: %else1<br>
+; CHECK: %else2<br>
+; CHECK: %fork2<br>
+; CHECK: %exit<br>
+entry:<br>
+ %gep1 = getelementptr i32, i32* %a, i32 1<br>
+ %val1 = load i32, i32* %gep1<br>
+ %cond1 = icmp ugt i32 %val1, 1<br>
+ br i1 %cond1, label %then1, label %else1, !prof !10<br>
+<br>
+then1:<br>
+ call void @hot_function()<br>
+ %gep2 = getelementptr i32, i32* %a, i32 2<br>
+ %val2 = load i32, i32* %gep2<br>
+ %cond2 = icmp ugt i32 %val2, 2<br>
+ br i1 %cond2, label %then2, label %else2, !prof !10<br>
+<br>
+else1:<br>
+ call void @cold_function()<br>
+ br label %fork1<br>
+<br>
+then2:<br>
+ call void @hot_function()<br>
+ %gep3 = getelementptr i32, i32* %a, i32 3<br>
+ %val3 = load i32, i32* %gep2<br>
+ %cond3 = icmp ugt i32 %val2, 3<br>
+ br i1 %cond3, label %fork1, label %fork2, !prof !8<br>
+<br>
+else2:<br>
+ call void @cold_function()<br>
+ br label %fork2<br>
+<br>
+fork1:<br>
+ call void @a()<br>
+ br label %exit<br>
+<br>
+fork2:<br>
+ call void @b()<br>
+ br label %exit<br>
+<br>
+exit:<br>
+ call void @hot_function()<br>
+ ret void<br>
+}<br>
+<br>
!5 = !{!"branch_weights", i32 84, i32 16}<br>
!6 = !{!"function_entry_count", i32 10}<br>
!7 = !{!"branch_weights", i32 60, i32 40}<br>
+!8 = !{!"branch_weights", i32 5001, i32 4999}<br>
+!9 = !{!"branch_weights", i32 85, i32 15}<br>
+!10 = !{!"branch_weights", i32 90, i32 10}<br>
<br>
<br>
______________________________<wbr>_________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>
</blockquote>
<br>
<br>
______________________________<wbr>_________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>
</div></div></blockquote></div><br></div>