[PATCH] D30309: CodeGen: BlockPlacement: Precompute layout for chains of triangles.

Mon Feb 27 15:54:55 PST 2017

On Mon, Feb 27, 2017 at 2:27 PM, Xinliang David Li <davidxl at google.com>
wrote:

>
>
> On Mon, Feb 27, 2017 at 2:00 PM, Kyle Butt via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
>> iteratee added a comment.
>>
>> In https://reviews.llvm.org/D30309#686772, @davidxl wrote:
>>
>> > If BP is not correct, it is better to improve static branch
>> prediction. We explicitly added a threshold for the cost-based analysis
>> result to kick in, just to be conservative when the branch probability is
>> not biased enough. Even for the long chain case, if tail dup is enabled
>> for the 50/50 case but the real profile is 40/60, taildup will hurt
>> performance. I don't see the reason to bypass the branch prob + cost
>> analysis by just looking at the shape.
>>
>>
>> Well, long chains amortize the penalty, so looking for the shape is
>> definitely necessary.
>>
>> I can adjust the static prediction if you'd like, but I have a source for
>> the 60/40 stat:
>>
>
>
> The paper says the T/N ratio is about 2/1, which contradicts your case
> here.
>

That strengthens my case. I'm assuming 60/40 Taken vs not-taken.
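
To make the arithmetic concrete, here is a minimal sketch that converts a taken/not-taken ratio into a taken probability. The helper name is made up for illustration and is not part of the patch.

```python
from fractions import Fraction


def taken_probability(taken, not_taken):
    """Convert a taken : not-taken ratio into the probability of 'taken'."""
    return Fraction(taken, taken + not_taken)


# Ratio reported in the paper (about 2/1) versus the 60/40 assumption.
paper = taken_probability(2, 1)       # Fraction(2, 3)
assumed = taken_probability(60, 40)   # Fraction(3, 5)

print(f"paper:   {float(paper):.3f}")
print(f"assumed: {float(assumed):.3f}")
print("paper ratio is more biased than assumed:", paper > assumed)
```

A 2/1 ratio is a taken probability of 2/3, which is more biased than the 60% I assumed.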

>
>
> There are more recent papers which look at actual branch conditions:
>
>  [1] "Branch Prediction for Free"        Ball and Larus; PLDI '93.
>  [2] "Static Branch Frequency and Program Profile Analysis"      Wu and
> Larus; MICRO-27.
>  [3] "Corpus-based Static Branch Prediction"       Calder, Grunwald,
> Lindsay, Martin, Mozer, and Zorn; PLDI '95.
>
>
Sure, I'll look at them.

>
>
>> See page 13 here:
>> http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf
>>
>> The chains also allow us to make a correlation assumption. I can
>> explicitly calculate that as well, however, edge frequencies run into
>> aliasing problems. It seems that BlockFrequency wasn't designed to allow
>> for calculations like these.
>>
>>
> Without a path profile, how is it reasonable to make path correlation
> assumptions?  It is true that in some cases some static analysis + static
> heuristic can be developed to handle that.
>
>
Well, they are light assumptions (10%), and I planned on justifying them
with benchmarks. So far, they have proven worthwhile.

> However, you made a valid point here. Before we have the machinery to
> analyze/annotate/use such correlations, pattern matching may be the way to
> go. However, it is probably not a good idea to blindly look at the shapes.
> Perhaps look into the conditions as well?  This also looks more suitable
> to be put into some utility.
>
>
>
>
>> I really don't want to change this until I get more specific feedback
>> about what you'd like to see.
>> Assuming a small amount of positive correlation (10%), the cutoff is 47%
>> (including the frequency bonus) for a chain of 2 triangles that ends in a
>> non-triangle,
>> and 56% for a chain of triangles that ends in a triangle.
>>
>>
> Yes, if we can prove correlation, then the longer the chain, the larger
> the taken branch probability that can be allowed while still enabling
> tail-dup.
>

OK, assuming independence, the thresholds are 51% for a chain of triangles
that ends in a non-triangle and 58% if it ends in a triangle.

>
>
>
>> Would you prefer that I adjust the static probabilities for triangles,
>
>
> This is probably wrong to do (i.e., guess BP based on shape). I suspect the
> benefit is mostly from the path correlation.
>
>
I don't see why it would be wrong. If taken/non-taken really is 2/1, I
would expect that effect to show up most in triangle shapes vs other shapes
where layout would allow you to reduce the # of taken branches.
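
As a rough illustration of why the shape matters, here is a toy model of a single triangle (A branches either to B, which then reaches C, or directly to C). This is only a sketch under my own simplifying assumptions, not the cost model in the patch: it counts every taken branch, conditional or unconditional, as one unit, ignores tail duplication, and the function names are hypothetical.

```python
def taken_inline(p_skip):
    """Layout A, B, C: B and C are reached by fallthrough, so the only
    taken branch is the A->C edge that skips over B."""
    return p_skip


def taken_out_of_line(p_skip):
    """Layout A, C, ..., B: A->C falls through, but entering B costs one
    taken branch and B needs a second one to jump back to C."""
    return 2.0 * (1.0 - p_skip)


# Expected taken branches per execution for a few skip probabilities.
for p in (0.5, 0.6, 2.0 / 3.0, 0.8):
    inline = taken_inline(p)
    outlined = taken_out_of_line(p)
    print(f"p_skip={p:.3f}  inline={inline:.3f}  out-of-line={outlined:.3f}")
```

In this toy model the two layouts break even at a skip probability of 2/3, so whether the real ratio is closer to 60/40 or 2/1 changes which layout has fewer taken branches.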

>
>
>> and then run the comparisons against the thresholds I calculated above? I
>> could even include the whole table from 2-10. (The threshold goes down as
>> the # of triangles goes up)
>>
>>
>
> Can you do some analysis on the benchmarks that benefit from the long
> chain duplication and study if it is possible to develop some simple
> heuristics for path correlation?
>

I actually worry that would overfit worse. It appears neutral for all the
things I've tested besides protocol buffers. Are there more things you'd
like me to test?

>
> David
>
>
>
>>
>> Repository:
>>   rL LLVM
>>
>> https://reviews.llvm.org/D30309
>>
>>
>>
>>
>