[PATCH] D21663: [MBP] Enhance cost based branch prob threshold computation to handle general control flows

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 23 23:04:50 PDT 2016


On Thu, Jun 23, 2016 at 10:50 PM, Dehao Chen <danielcdh at gmail.com> wrote:

> danielcdh added inline comments.
>
> ================
> Comment at: lib/CodeGen/MachineBlockPlacement.cpp:624
> @@ +623,3 @@
> +    //  order for BB->Succ to be selected, we must have F1 > F2 + F3. So
> the
> +    //  backward probability threshold is (backward checking ignores other
> +    //  predecessors other than Pred2):
> ----------------
> davidxl wrote:
> > davidxl wrote:
> > > danielcdh wrote:
> > > > This is a little confusing, I think the condition should be:
> > > >
> > > > F1 > max(2*F3, F2)
> > > >
> > > > So when checking Pred1, the backward threshold should be 0.66, when
> checking Pred2, the backward threshold should be 0.5?
> > > >
> > > > But for the current implementation, I tried all the test cases, and some
> other test cases with different triangle-diamond combinations, and it always
> gives me the optimal solution.
> > > It is F1 > F2 + F3.
> > >
> > > Basically, we are comparing selecting BB->Succ as the fall through vs
> selecting BB->Pred1 and Pred2->Succ as fall throughs. In order for BB->Succ
> to win, F1 must exceed F2 + F3, i.e. min(F1) = F2 + F3.
> > >
> > > When computing backward probability, only BB->Succ and Pred2->Succ
> edges are considered, so the probability threshold
> > >   T = min(F1) /(min(F1) + F2) = (F2 + F3 )/(2*F2 + F3).
> > >
> > > I will update the comments.
> > Dehao's question:
> >
> > For case C, if F1=50, F2=40, F3=20, it seems more beneficial to
> choose BB->Succ as the fall-through than Pred1->Succ or Pred2->Succ, or am I
> missing something?
> >
> > The answer is, in this case, it is better not to choose BB->Succ as the
> fall through.
> >
> > Let's take a look at an example: suppose BB and Pred2 in example C have a
> predecessor 'S'. With your choice, the layout is
> >
> > S, BB, Succ, Pred2, Pred1  with cost F2 + F3 + F2 + F3
> >
> > The optimal layout is in fact
> >
> > S, BB, Pred1, Pred2, Succ with cost  F1 + F2 + F3
> >
> >
> I see. Thanks for the explanation.
>
> It may be worth mentioning in the comment that the threshold is F1>max(2*F3,
> F2+F3), because F2>F3, thus F1>F2+F3.
>

It is actually F1 > max(Freq(BB->X), for all X except Succ) +
max(Freq(Y->Succ), for all Y except BB), where the first term is F3 and the
second is F2.
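
As a rough illustration of the condition above, here is a minimal sketch. It
assumes F1 = Freq(BB->Succ), F2 = Freq(Pred2->Succ) and F3 = Freq(BB->Pred1),
which is how the formulas quoted earlier in this thread read. All function
names are hypothetical and are not part of MachineBlockPlacement.cpp; plain
integers stand in for BlockFrequency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: largest frequency in a list of edges, 0 if empty.
static uint64_t maxFreq(const std::vector<uint64_t> &Freqs) {
  uint64_t Max = 0;
  for (uint64_t F : Freqs)
    Max = std::max(Max, F);
  return Max;
}

// General condition: F1 > max(Freq(BB->X), X != Succ)
//                       + max(Freq(Y->Succ), Y != BB).
bool succWinsAsFallThrough(uint64_t F1,
                           const std::vector<uint64_t> &OtherSuccEdgesOfBB,
                           const std::vector<uint64_t> &OtherPredEdgesOfSucc) {
  return F1 > maxFreq(OtherSuccEdgesOfBB) + maxFreq(OtherPredEdgesOfSucc);
}

// Backward probability threshold T = (F2 + F3) / (2*F2 + F3), obtained by
// plugging min(F1) = F2 + F3 into min(F1) / (min(F1) + F2).
double backwardThreshold(uint64_t F2, uint64_t F3) {
  assert(F2 + F3 > 0 && "need at least one non-zero frequency");
  return static_cast<double>(F2 + F3) / static_cast<double>(2 * F2 + F3);
}

// Layout costs from the example in this thread (scenario C with an extra
// predecessor S of BB and Pred2).
uint64_t costSuccFallThrough(uint64_t F2, uint64_t F3) {
  return F2 + F3 + F2 + F3; // layout S, BB, Succ, Pred2, Pred1
}
uint64_t costAlternative(uint64_t F1, uint64_t F2, uint64_t F3) {
  return F1 + F2 + F3; // layout S, BB, Pred1, Pred2, Succ
}
```

With F1=50, F2=40, F3=20 the condition fails (50 is not greater than 60), the
two layout costs are 120 and 110, and the backward threshold works out to
60/100 = 0.6. Comparing the two costs directly gives the same F1 > F2 + F3
test.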



>
> ================
> Comment at: lib/CodeGen/MachineBlockPlacement.cpp:664
> @@ +663,3 @@
> +    BlockFrequency F2, F3;
> +    if (MDT->properlyDominates(BB, PredWithMaxFreq)) {
> +      // Scenario (A)
> ----------------
> Maybe combine the two scenarios to something like:
>
> F2 = std::max(MBFI->getBlockFreq(BB) * SuccProb.getCompl(),
>               MaxPredEdgeFreq);
> F3 = std::min(MBFI->getBlockFreq(BB) * SuccProb.getCompl(),
>               MaxPredEdgeFreq);
>

This may not always be true for scenario C -- consider extending scenario A
such that there is another Pred coming into Succ with the MaxPredEdgeFreq.
This MaxPredEdgeFreq may not be larger than MBFI->getBlockFreq(BB) *
SuccProb.getCompl().
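
One way to see why the ordering matters is sketched below. It assumes the
threshold formula quoted earlier, T = (F2 + F3) / (2*F2 + F3), with F2 taken
as the incoming edge frequency into Succ and F3 as the frequency of BB's other
outgoing edge. The helper names and the numbers are made up for illustration.
Because T is not symmetric in F2 and F3, forcing F2 >= F3 with max/min changes
the result whenever MaxPredEdgeFreq is the smaller of the two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: T = (F2 + F3) / (2*F2 + F3).
double thresholdFor(uint64_t F2, uint64_t F3) {
  assert(F2 + F3 > 0 && "need at least one non-zero frequency");
  return static_cast<double>(F2 + F3) / static_cast<double>(2 * F2 + F3);
}

// The max/min combination suggested in the review: returns {F2, F3} with
// F2 >= F3 regardless of which edge each value came from.
std::pair<uint64_t, uint64_t> combineMaxMin(uint64_t OtherSuccFreq,
                                            uint64_t MaxPredEdgeFreq) {
  return {std::max(OtherSuccFreq, MaxPredEdgeFreq),
          std::min(OtherSuccFreq, MaxPredEdgeFreq)};
}

// Keeps the roles fixed: F2 is the pred edge into Succ, F3 is BB's other
// outgoing edge.
std::pair<uint64_t, uint64_t> keepRoles(uint64_t OtherSuccFreq,
                                        uint64_t MaxPredEdgeFreq) {
  return {MaxPredEdgeFreq, OtherSuccFreq};
}
```

For example, with BB's other outgoing edge at 40 and MaxPredEdgeFreq at 20,
keeping the roles gives T = 60/80 = 0.75, while the max/min combination gives
T = 60/100 = 0.6.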

David


>
>
> http://reviews.llvm.org/D21663
>
>
>
>