[PATCH] D30309: CodeGen: BlockPlacement: Precompute layout for chains of triangles.

Wed Mar 1 12:23:24 PST 2017

chandlerc added a comment.

In https://reviews.llvm.org/D30309#687816, @iteratee wrote:

> In https://reviews.llvm.org/D30309#686772, @davidxl wrote:
>
> > If BP is not correct, it is better to improve static branch prediction. We explicitly added a threshold for the cost-based analysis result to kick in, just to be conservative when the branch probability is not biased enough. Even for the long chain case, if tail dup is enabled for the 50/50 case but the real profile is 40/60, taildup will hurt performance. I don't see the reason to bypass the branch prob + cost analysis by just looking at the shape.
>
>
> Well, long chains amortize the penalty, so looking for the shape is definitely necessary.
>
> I can adjust the static prediction if you'd like, but I have a source for the 60/40 stat:
>  See page 13 here:
>  http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf

FWIW, I remember discussing this a loooooong time ago as we were really starting to set up static prediction. At the time, there was some desire not to put this weak of a signal into static predictions. Weak signals like this have ways of compounding and ending up producing pretty weird results. So I'm not really sure the probabilities we use in static prediction are wrong. (Or rather, are wrong by enough of a margin or in enough cases to be worth shifting.) Maybe we should revisit this, but I'm always a bit skeptical of static heuristics with this small of a difference...

> The chains also allow us to make a correlation assumption. I can explicitly calculate that as well, however, edge frequencies run into aliasing problems. It seems that BlockFrequency wasn't designed to allow for calculations like these.

Based on your explanation to me about how all of this works, my understanding is this:

Branches in these kinds of long chains of triangles empirically correlate, even though they may individually have something like 50/50 probability. And the advantage of *correlation* is pretty specific to the *layout* we're doing here.

Given that, I think it is very reasonable to handle this within the layout code by detecting the pattern of CFG combined with (nearly) 50/50 probabilities, and choosing to prioritize a layout that is profitable in the face of correlation because we believe that such correlation will often occur.
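
To make the correlation argument concrete, here is a rough toy model. It is not the cost model used in the patch. It assumes a hypothetical simplification in which tail-duplicating the join blocks yields two straight-line "lanes", so a taken branch is paid only when a triangle's decision differs from the previous triangle's decision. In the plain layout, the less likely direction always pays a taken branch. The function names and the 80% repeat rate are illustrative assumptions, and the first triangle of the chain is ignored.

```python
# Toy model (assumption, not LLVM's cost analysis): expected taken
# branches per triangle in a long chain, ignoring the first triangle.

def plain_layout(p_skip):
    """Plain layout: the likelier direction falls through, so the
    less likely direction costs one taken branch."""
    likely = max(p_skip, 1.0 - p_skip)
    return 1.0 - likely


def lanes_independent(p_skip):
    """Hypothetical two-lane layout with independent branches: a taken
    branch is paid when two consecutive decisions differ."""
    return 2.0 * p_skip * (1.0 - p_skip)


def lanes_correlated(p_repeat):
    """Hypothetical two-lane layout where each branch is 50/50 on its
    own but repeats the previous decision with probability p_repeat."""
    return 1.0 - p_repeat


if __name__ == "__main__":
    cases = [
        ("independent 50/50", plain_layout(0.5), lanes_independent(0.5)),
        ("independent 40/60", plain_layout(0.4), lanes_independent(0.4)),
        # 0.8 is an illustrative repeat rate, not a measured value.
        ("50/50, repeats 80%", plain_layout(0.5), lanes_correlated(0.8)),
    ]
    for name, plain, lanes in cases:
        print(f"{name}: plain={plain:.2f} lanes={lanes:.2f}")
```

Under these assumptions, independent 50/50 branches see no difference between the two layouts, an independent 40/60 profile makes the two-lane layout slightly worse (the concern raised above about taildup hurting), and correlated 50/50 branches make it clearly better. That is why the benefit depends on both the CFG shape and the belief that correlation is common.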

Anyways, just my two cents. I'll leave figuring out the end state to you and David. =D

Repository:
  rL LLVM

https://reviews.llvm.org/D30309