[PATCH] D21663: [MBP] Enhance cost based branch prob threshold computation to handle general control flows
Dehao Chen via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 23 22:50:32 PDT 2016
danielcdh added inline comments.
================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:624
@@ +623,3 @@
+ // order for BB->Succ to be selected, we must have F1 > F2 + F3. So the
+ // backward probability threshold is (backward checking igores other
+ // predecessors other than Pred2):
----------------
davidxl wrote:
> davidxl wrote:
> > danielcdh wrote:
> > > This is a little confusing, I think the condition should be:
> > >
> > > F1 > max(2*F3, F2)
> > >
> > > So when checking Pred1, the backward threhold should be 0.66, when checking Pred2, the backward threshold should be 0.5?
> > >
> > > But for the current implementation, I tried all testcase, and some other testcases with different triangle-diamond combination, and it always gives me optimal solution.
> > It is F1 > F2 + F3.
> >
> > Basically, we are comparing selecting BB->Succ as fall through vs selecting BB->Pred1 and Pred2->Succ as fall throughs. In order for BB->Succ to succeed, the minimal min(F1) = F2 + F3.
> >
> > When computing backward probability, only BB->Succ and Pred2->Succ edges are considered, so the probability threshold
> > T = min(F1) /(min(F1) + F2) = (F2 + F3 )/(2*F2 + F3).
> >
> > I will update the comments.
> Dehao's question:
>
> For case C, if F1=50, F2=40, F3=20, it seems it's most beneficial to choose BB->Succ as fall-through than Pred1->Succ or Pred2->Succ, or am I missing something?
>
> The answer is, in this case, it is better not to choose BB->Succ as the fall through.
>
> Let's take a look at an example: suppose BB and Pred2 in example C has a predecessor 'S'. With your choice, the layout is
>
> S, BB, Succ, Pred2, Pred1 with cost F2 + F3 + F2 + F3
>
> The optimal layout is in fact
>
> S, BB, P1, P2, Succ with cost F1 + F2 + F3
>
>
I see. Thanks for explanation.
It may worth mentioning in the comment that the threshold is F1>max(2*F3, F2+F3), because F2>F3, thus F1>F2+F3.
================
Comment at: lib/CodeGen/MachineBlockPlacement.cpp:664
@@ +663,3 @@
+ BlockFrequency F2, F3;
+ if (MDT->properlyDominates(BB, PredWithMaxFreq)) {
+ // Scenario (A)
----------------
Maybe combine the two scenarios to something like:
F2 = std::max(MBFI->getBlockFreq(BB) * SuccProb.getCompl();, MaxPredEdgeFreq);
F3 = std::min(MBFI->getBlockFreq(BB) * SuccProb.getCompl();, MaxPredEdgeFreq);
http://reviews.llvm.org/D21663
More information about the llvm-commits
mailing list