[PATCH] D30309: CodeGen: BlockPlacement: Precompute layout for chains of triangles.

Thu Mar 2 11:01:37 PST 2017

iteratee added a comment.

In https://reviews.llvm.org/D30309#689911, @chandlerc wrote:

> In https://reviews.llvm.org/D30309#687816, @iteratee wrote:
>
> > In https://reviews.llvm.org/D30309#686772, @davidxl wrote:
> >
> > > If BP is not correct, it is better to improve static branch prediction. We explicitly added a threshold for the cost-based analysis result to kick in, just to be conservative when the branch probability is not biased enough. Even for the long chain case, if tail dup is enabled for the 50/50 case but the real profile is 40/60, taildup will hurt performance. I don't see the reason to bypass the branch prob + cost analysis by just looking at the shape.
> >
> >
> > Well, long chains amortize the penalty, so looking for the shape is definitely necessary.
> >
> > I can adjust the static prediction if you'd like, but I have a source for the 60/40 stat:
> >  See page 13 here:
> >  http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf
>
>
> FWIW, I remember discussing this a loooooong time ago as we were really starting to set up static prediction. At the time, there was some desire to not try to put this weak of signal into static predictions. They have ways of compounding and ending up producing pretty weird results. So I'm not really sure the probabilities we use in static prediction are wrong. (Or rather, are wrong by enough of a margin or in enough cases to be worth shifting.) Maybe we should revisit this, but I'm always a bit skeptical of static heuristics with this small of a difference...
>
> > The chains also allow us to make a correlation assumption. I can explicitly calculate that as well, however, edge frequencies run into aliasing problems. It seems that BlockFrequency wasn't designed to allow for calculations like these.
>
> Based on your explanation to me about how all of this works, my understanding is this:
>
> Branches in these kinds of long chains of triangles empirically correlate, even though they may individually have something like 50/50 probability. And the advantage of *correlation* is pretty specific to the *layout* we're doing here.
>
> Given that, I think it is very reasonable to handle this within the layout code by detecting the pattern of CFG combined with (nearly) 50/50 probabilities, and choosing to prioritize a layout that is profitable in the face of correlation because we believe that such correlation will often occur.
>
> Anyways, just my two cents. I'll leave figuring out the end state to you and David. =D

So correlation is an interesting term. There are actually 2 ways that the branches may be correlated:

1. They may be biased in the same direction. We guess 50% for unknown branches, and over all of them that may be pretty close, but for any individual branch it's unlikely to be 50%. Tail duplication is profitable if the biases go the same direction, even if they are independent at runtime.
2. They may be dynamically correlated. If each branch is close to 50% but the branches are positively correlated, this is also profitable.

It's also profitable if the branches are independent but each branch is biased slightly above 50%: 58% is enough for 2 triangles in a row, and 56% for 3 triangles in a row. (This includes a 2% penalty for size increases.)
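
To make the two kinds of correlation concrete, here is a small sketch of how often consecutive branches go the same way in each case. It is only an illustration: the function names are made up for this example, it is not the cost model used in the patch, and it does not reproduce the 58%/56% thresholds, which also depend on the layout costs and the size penalty.

```python
def p_same_direction_independent(p: float, n: int) -> float:
    """Case 1: n independent branches that share the same bias p.

    Returns the probability that all n branches go the same way
    (all taken or all not taken).
    """
    return p ** n + (1.0 - p) ** n


def p_agree_correlated_pair(rho: float) -> float:
    """Case 2: two branches, each taken exactly 50% of the time,
    with Pearson correlation rho between their outcomes.

    With 50/50 marginals, P(both taken) = P(both not taken) = 0.25 * (1 + rho),
    so the probability that they agree is 0.5 * (1 + rho).
    """
    return 0.5 * (1.0 + rho)


if __name__ == "__main__":
    # Same-direction bias, independent at runtime.
    for p in (0.5, 0.58, 0.7):
        print(f"independent, p={p}: agree(2)={p_same_direction_independent(p, 2):.4f}")
    # Unbiased branches that are dynamically correlated.
    for rho in (0.0, 0.2, 0.5):
        print(f"50/50, rho={rho}: agree(2)={p_agree_correlated_pair(rho):.4f}")
```

In both cases the agreement rate rises above the 50% you would get from unbiased, uncorrelated branches, which is the situation where laying out the whole chain together pays off.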

Repository:
  rL LLVM

https://reviews.llvm.org/D30309