[PATCH] D21663: [MBP] Enhance cost based branch prob threshold computation to handle general control flows

Thu Jun 23 22:50:32 PDT 2016

danielcdh added inline comments.

================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:624
@@ +623,3 @@
+    //  order for BB->Succ to be selected, we must have F1 > F2 + F3. So the
+    //  backward probability threshold is (backward checking igores other
+    //  predecessors other than Pred2):
----------------
davidxl wrote:
> davidxl wrote:
> > danielcdh wrote:
> > > This is a little confusing, I think the condition should be:
> > > 
> > > F1 > max(2*F3, F2)
> > > 
> > > So when checking Pred1, the backward threhold should be 0.66, when checking Pred2, the backward threshold should be 0.5?
> > > 
> > > But for the current implementation, I tried all testcase, and some other testcases with different triangle-diamond combination, and it always gives me optimal solution.
> > It is F1 > F2 + F3.
> > 
> > Basically, we are comparing selecting BB->Succ as fall through vs selecting BB->Pred1 and Pred2->Succ as fall throughs. In order for BB->Succ to succeed, the minimal min(F1) = F2 + F3.
> > 
> > When computing backward probability, only BB->Succ and Pred2->Succ edges are considered, so the probability threshold 
> >   T = min(F1) /(min(F1) + F2) = (F2 + F3 )/(2*F2 + F3).
> > 
> > I will update the comments.
> Dehao's question:
> 
> For case C, if F1=50, F2=40, F3=20, it seems it's most beneficial to choose BB->Succ as fall-through than Pred1->Succ or Pred2->Succ, or am I missing something?
> 
> The answer is, in this case, it is better not to choose BB->Succ as the fall through.
> 
> Let's take a look at an example: suppose BB and Pred2 in example C has a predecessor 'S'. With your choice, the layout is 
> 
> S, BB, Succ, Pred2, Pred1  with cost F2 + F3 + F2 + F3
> 
> The optimal layout is in fact
> 
> S, BB, P1, P2, Succ with cost  F1 + F2 + F3
> 
> 
I see. Thanks for explanation.

It may worth mentioning in the comment that the threshold is F1>max(2*F3, F2+F3), because F2>F3, thus F1>F2+F3.

================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:664
@@ +663,3 @@
+    BlockFrequency F2, F3;
+    if (MDT->properlyDominates(BB, PredWithMaxFreq)) {
+      // Scenario (A)
----------------
Maybe combine the two scenarios to something like:

F2 = std::max(MBFI->getBlockFreq(BB) * SuccProb.getCompl();, MaxPredEdgeFreq);
F3 = std::min(MBFI->getBlockFreq(BB) * SuccProb.getCompl();, MaxPredEdgeFreq);

http://reviews.llvm.org/D21663