<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 27, 2017 at 2:27 PM, Xinliang David Li <span dir="ltr"><<a href="mailto:davidxl@google.com" class="m_-7761920284598067042cremed cremed" target="_blank">davidxl@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Mon, Feb 27, 2017 at 2:00 PM, Kyle Butt via Phabricator <span dir="ltr"><<a href="mailto:reviews@reviews.llvm.org" class="m_-7761920284598067042cremed cremed" target="_blank">reviews@reviews.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">iteratee added a comment.<br>

<span class="m_-7761920284598067042m_62593311590684752gmail-"><br>

In <a href="https://reviews.llvm.org/D30309#686772" rel="noreferrer" class="m_-7761920284598067042cremed cremed" target="_blank">https://reviews.llvm.org/D3030<wbr>9#686772</a>, @davidxl wrote:<br>

<br>

> If BP is not correct, it is better to improve static branch prediction. We explicitly added a threshold for the cost-based analysis result to kick in, just to be conservative when the branch probability is not biased enough. Even in the long-chain case, if tail dup is enabled for a 50/50 branch but the real profile is 40/60, tail dup will hurt performance. I don't see a reason to bypass the branch prob + cost analysis by just looking at the shape.<br>

<br>

<br>

</span>Well, long chains amortize the penalty, so looking for the shape is definitely necessary.<br>

<br>

I can adjust the static prediction if you'd like, but I have a source for the 60/40 stat:<br></blockquote><div><br></div><div><br></div></span><div>The paper says the T/N ratio is about 2/1, which contradicts your case here.</div></div></div></div></blockquote><div><br></div><div>That strengthens my case. I'm assuming 60/40 taken vs. not-taken, and 2/1 is about 67/33.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div><br></div><div>There are more recent papers that look at actual branch conditions:</div><div><br></div><div><div> [1] "Branch Prediction for Free", Ball and Larus; PLDI '93.</div><div> [2] "Static Branch Frequency and Program Profile Analysis", Wu and Larus; MICRO-27.</div><div> [3] "Corpus-based Static Branch Prediction", Calder, Grunwald, Lindsay, Martin, Mozer, and Zorn; PLDI '95.</div></div><span><div><br></div></span></div></div></div></blockquote><div><br></div><div>Sure, I'll look at them.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

See page 13 here:<br>

<a href="http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf" rel="noreferrer" class="m_-7761920284598067042cremed cremed" target="_blank">http://digitalassets.lib.berke<wbr>ley.edu/techreports/ucb/text/C<wbr>SD-83-121.pdf</a><br>

<br>

The chains also allow us to make a correlation assumption. I can calculate that explicitly as well; however, edge frequencies run into aliasing problems. It seems that BlockFrequency wasn't designed to allow for calculations like these.<br>

<br></blockquote><div><br></div></span><div>Without a path profile, how is it reasonable to make path correlation assumptions? In some cases, it is true that some static analysis + static heuristic can be developed to handle that.</div><div><br></div></div></div></div></blockquote><div><br></div><div>Well, they are light assumptions (10%), and I planned on justifying them with benchmarks. So far, they have proven worthwhile.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div>However, you made a valid point here. Until we have the machinery to analyze/annotate/use such correlations, pattern matching may be the way to go. However, it is probably not a good idea to blindly look at the shapes. Perhaps look into the conditions as well? This also looks more suitable for a utility. </div><span><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I really don't want to change this until I get more specific feedback about what you'd like to see.<br>

Assuming a small amount of positive correlation (10%), the cutoff is 47% (including the frequency bonus) for a chain of 2 triangles that ends in a non-triangle,<br>

and 56% for a chain of triangles that ends in a triangle.<br>

<br></blockquote><div><br></div></span><div>Yes. If we can prove correlation, then the longer the chain, the larger the taken-branch probability we can allow while still enabling tail-dup.</div></div></div></div></blockquote><div><br></div><div>OK, assuming independence, the thresholds are 51% for a chain of triangles that ends in a non-triangle and 58% if it ends in a triangle.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Would you prefer that I adjust the static probabilities for triangles, </blockquote><div><br></div></span><div>This is probably the wrong thing to do (i.e., guessing BP based on shape). I suspect the benefit is mostly from the path correlation. </div><span><div><br></div></span></div></div></div></blockquote><div><br></div><div>I don't see why it would be wrong. If taken/non-taken really is 2/1, I would expect that effect to show up most in triangle shapes vs. other shapes where layout would allow you to reduce the # of taken branches.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">and then run the comparisons against the thresholds I calculated above? I could even include the whole table for chains of 2-10 triangles. (The threshold goes down as the # of triangles goes up.)<br>

<div class="m_-7761920284598067042m_62593311590684752gmail-HOEnZb"><div class="m_-7761920284598067042m_62593311590684752gmail-h5"><br></div></div></blockquote><div><br></div><div><br></div></span><div>Can you do some analysis on the benchmarks that benefit from long-chain duplication and study whether it is possible to develop some simple heuristics for path correlation?</div></div></div></div></blockquote><div><br></div><div>I actually worry that would overfit worse. It appears neutral for all the things I've tested besides protocol buffers. Are there more things you'd like me to test?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="m_-7761920284598067042HOEnZb"><font color="#888888"><div><br></div><div>David</div></font></span><span><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="m_-7761920284598067042m_62593311590684752gmail-HOEnZb"><div class="m_-7761920284598067042m_62593311590684752gmail-h5">

<br>

Repository:<br>

  rL LLVM<br>

<br>

<a href="https://reviews.llvm.org/D30309" rel="noreferrer" class="m_-7761920284598067042cremed cremed" target="_blank">https://reviews.llvm.org/D3030<wbr>9</a><br>

<br>

<br>

<br>

</div></div></blockquote></span></div><br></div></div>

</blockquote></div><br></div></div>