<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 27, 2017 at 2:00 PM, Kyle Butt via Phabricator <span dir="ltr"><<a href="mailto:reviews@reviews.llvm.org" target="_blank">reviews@reviews.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">iteratee added a comment.<br>

<span class="gmail-"><br>

In <a href="https://reviews.llvm.org/D30309#686772" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D30309#686772</a>, @davidxl wrote:<br>

<br>

> If BP is not correct, it is better to improve static branch prediction. We explicitly added a threshold for the cost-based analysis result to kick in, just to be conservative when the branch probability is not biased enough. Even in the long-chain case, if tail dup is enabled for a 50/50 case but the real profile is 40/60, tail dup will hurt performance. I don't see a reason to bypass the branch prob + cost analysis by just looking at the shape.<br>

<br>

<br>

</span>Well, long chains amortize the penalty, so looking for the shape is definitely necessary.<br>

<br>

I can adjust the static prediction if you'd like, but I have a source for the 60/40 stat:<br></blockquote><div><br></div><div><br></div><div>The paper says the T/N ratio is about 2/1, which contradicts your case here. </div><div><br></div><div><br></div><div>There are more recent papers that look at actual branch conditions:</div><div><br></div><div><div> [1] "Branch Prediction for Free", Ball and Larus; PLDI '93.</div><div> [2] "Static Branch Frequency and Program Profile Analysis", Wu and Larus; MICRO-27.</div><div> [3] "Corpus-based Static Branch Prediction", Calder, Grunwald, Lindsay, Martin, Mozer, and Zorn; PLDI '95.</div></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

See page 13 here:<br>

<a href="http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-83-121.pdf" rel="noreferrer" target="_blank">http://digitalassets.lib.<wbr>berkeley.edu/techreports/ucb/<wbr>text/CSD-83-121.pdf</a><br>

<br>

The chains also allow us to make a correlation assumption. I can explicitly calculate that as well; however, edge frequencies run into aliasing problems. It seems that BlockFrequency wasn't designed to allow for calculations like these.<br>

<br></blockquote><div><br></div><div>Without a path profile, how is it reasonable to make path correlation assumptions? It is true that in some cases static analysis plus a static heuristic can be developed to handle that.</div><div><br></div><div>However, you made a valid point here. Until we have the machinery to analyze/annotate/use such correlations, pattern matching may be the way to go. Still, it is probably not a good idea to look blindly at the shapes. Perhaps look into the conditions as well? This also looks better suited to a utility. </div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I really don't want to change this until I get more specific feedback about what you'd like to see.<br>

Assuming a small amount of positive correlation (10%), the cutoff is 47% (including the frequency bonus) for a chain of 2 triangles that ends in a non-triangle,<br>

and 56% for a chain of triangles that ends in a triangle.<br>

<br></blockquote><div><br></div><div>Yes, if we can prove correlation, then the longer the chain, the larger the taken-branch probability at which tail-dup can still be enabled.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Would you prefer that I adjust the static probabilities for triangles, </blockquote><div><br></div><div>This is probably the wrong thing to do (i.e., guessing BP based on shape). I suspect the benefit comes mostly from the path correlation. </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">and then run the comparisons against the thresholds I calculated above? I could even include the whole table from 2-10. (The threshold goes down as the number of triangles goes up.)<br>

<div class="gmail-HOEnZb"><div class="gmail-h5"><br></div></div></blockquote><div><br></div><div><br></div><div>Can you do some analysis on the benchmarks that benefit from the long-chain duplication and study whether it is possible to develop some simple heuristics for path correlation?</div><div><br></div><div>David</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-HOEnZb"><div class="gmail-h5">

<br>

Repository:<br>

  rL LLVM<br>

<br>

<a href="https://reviews.llvm.org/D30309" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D30309</a><br>

<br>

<br>

<br>

</div></div></blockquote></div><br></div></div>