[PATCH] D30309: CodeGen: BlockPlacement: Precompute layout for chains of triangles.

Mon Feb 27 22:13:03 PST 2017

On Mon, Feb 27, 2017 at 3:54 PM, Kyle Butt <iteratee at google.com> wrote:

>
>
> On Mon, Feb 27, 2017 at 2:27 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Mon, Feb 27, 2017 at 2:00 PM, Kyle Butt via Phabricator <
>> reviews at reviews.llvm.org> wrote:
>>
>>> iteratee added a comment.
>>>
>>> In https://reviews.llvm.org/D30309#686772, @davidxl wrote:
>>>
>>> > If BP is not correct, it is better to improve static branch
>>> > prediction. We explicitly added a threshold for the cost-based analysis
>>> > result to kick in, just to be conservative when the branch probability is
>>> > not biased enough. Even for the long chain case, if tail dup is enabled for
>>> > the 50/50 case but the real profile is 40/60, taildup will hurt performance. I
>>> > don't see the reason to bypass the branch prob + cost analysis by just
>>> > looking at the shape.
>>>
>>>
>>> Well, long chains amortize the penalty, so looking for the shape is
>>> definitely necessary.
>>>
>>> I can adjust the static prediction if you'd like, but I have a source
>>> for the 60/40 stat:
>>>
>>
>>
>> The paper says the T/N ratio is about 2/1, which contradicts your case
>> here.
>>
>
> That strengthens my case. I'm assuming 60/40 Taken vs not-taken.
>
>

>>
>> There are more recent papers that look at actual branch conditions:
>>
>>  [1] "Branch Prediction for Free", Ball and Larus; PLDI '93.
>>  [2] "Static Branch Frequency and Program Profile Analysis", Wu and
>> Larus; MICRO-27.
>>  [3] "Corpus-based Static Branch Prediction", Calder, Grunwald,
>> Lindsay, Martin, Mozer, and Zorn; PLDI '95.
>>
>>
> Sure, I'll look at them.
>
>
>>
>>
>>> See page 13 here:
>>> http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf
>>>
>>> The chains also allow us to make a correlation assumption. I can
>>> explicitly calculate that as well, however, edge frequencies run into
>>> aliasing problems. It seems that BlockFrequency wasn't designed to allow
>>> for calculations like these.
>>>
>>>
>> Without a path profile, how is it reasonable to make path correlation
>> assumptions?  In some cases, it is true that some static analysis + static
>> heuristic can be developed to handle that.
>>
>>
> Well, they are light assumptions (10%), and I planned on justifying them
> with benchmarks. So far, they have proven worthwhile.
>
>
>> However, you made a valid point here. Before we have the machinery to
>> analyze/annotate/use such correlations, pattern matching may be the way to
>> go. However, it is probably not a good idea to blindly look at the shapes.
>> Perhaps look into the conditions as well?  This also looks more
>> suitable for some utility.
>>
>>
>>
>>
>>> I really don't want to change this until I get more specific feedback
>>> about what you'd like to see.
>>> Assuming a small amount of positive correlation (10%), the cutoff is 47%
>>> (including the frequency bonus) for a chain of 2 triangles that ends in a
>>> non-triangle,
>>> and 56% for a chain of triangles that ends in a triangle.
>>>
>>>
>> yes, if we can prove correlation, the longer the chain, the larger the
>> taken branch probability is allowed to enable tail-dup.
>>
>
> OK, assuming independence, the thresholds are 51% for a chain of triangles
> that ends in a non-triangle and 58% if it ends in a triangle.
>
>
>>
>>
>>
>>> Would you prefer that I adjust the static probabilities for triangles,
>>
>>
>> This is probably wrong to do (i.e., guess BP based on shape). I suspect
>> the benefit is mostly from the path correlation.
>>
>>
> I don't see why it would be wrong. If taken/non-taken really is 2/1, I
> would expect that effect to show up most in triangle shapes vs other shapes
> where layout would allow you to reduce the # of taken branches.
>

What I meant is that you cannot statically predict the branch probability
by looking at the length of the triangle chains without empirical data.  I
suspect long chains of triangles might be a good indicator of branch
correlation, but I think it is a good idea to collect some evidence from
common benchmarks.

>
>
>>
>>
>>> and then run the comparisons against the thresholds I calculated above?
>>> I could even include the whole table from 2-10. (The threshold goes down as
>>> the # of triangles goes up)
>>>
>>>
>>
>> Can you do some analysis on the benchmarks that benefit the long chain
>> duplication and study if it is possible to develop some simple heuristics
>> of path correlation?
>>
>
> I actually worry that would overfit worse. It appears neutral for all the
> things I've tested besides protocol buffers. Are there more things you'd
> like me to test?
>

Things that are interesting to check: 1) for the proto code, what are the
actual branch probabilities (using PGO to find out)? 2) are there branch
correlations? 3) anything interesting related to branch conditions?
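
As a rough sketch of checks 1) and 2), assuming the taken/not-taken outcome
of each branch in one chain has been recorded per execution by some means:
the trace format, the function names, and the sample data below are all
hypothetical and are not LLVM or PGO output. It only shows how a marginal
taken probability could be compared against a conditional one.

```python
from typing import Optional, Sequence

# One row per execution of the chain; row[i] is True when the branch of
# the i-th triangle was taken. Hypothetical data for illustration only.
Trace = Sequence[Sequence[bool]]


def taken_probability(trace: Trace, i: int) -> float:
    """Estimate P(branch i taken) over all recorded executions."""
    outcomes = [row[i] for row in trace]
    return sum(outcomes) / len(outcomes)


def conditional_taken(trace: Trace, i: int) -> Optional[float]:
    """Estimate P(branch i+1 taken | branch i taken).

    Returns None when branch i was never taken in the trace.
    """
    following = [row[i + 1] for row in trace if row[i]]
    if not following:
        return None
    return sum(following) / len(following)


# Made-up sample: five executions of a chain of two triangles.
sample_trace = [
    (True, True),
    (True, True),
    (True, False),
    (False, False),
    (False, True),
]

marginal = taken_probability(sample_trace, 1)
conditional = conditional_taken(sample_trace, 0)
# A conditional estimate above the marginal one hints at positive correlation.
lift = conditional - marginal
print(f"marginal={marginal:.3f} conditional={conditional:.3f} lift={lift:.3f}")
```

If the conditional estimate stays close to the marginal one across
benchmarks, the independence assumption would look like the safer default.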

David

>
>
>>
>> David
>>
>>
>>
>>>
>>> Repository:
>>>   rL LLVM
>>>
>>> https://reviews.llvm.org/D30309
>>>
>>>
>>>
>>>
>>
>