<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 7, 2014 at 5:31 PM, Nadav Rotem <span dir="ltr"><<a href="mailto:nrotem@apple.com" target="_blank">nrotem@apple.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":15n" class="a3s" style="overflow:hidden">I think that removing the branch count limit makes sense.  Do you mind sharing the performance data?  Were there any regressions or performance wins?<br>

</div></blockquote><div><br></div><div>I saw no changes on Sandy Bridge (or any other Intel or AMD processor) when experimentally disabling this threshold: neither regressions nor improvements.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div id=":15n" class="a3s" style="overflow:hidden">

<br>

I am asking about the performance data because I am guessing that some benchmarks benefited from this heuristic; otherwise it wouldn’t have made it in.</div></blockquote></div><div class="gmail_extra"><br>

</div>It wasn't based on data; it was based on a reading of the relevant optimization guidelines. However, those guidelines are actually talking about *taken* branches, and we have no easy way to model that. Also, it turns out to be very hard to craft a case that exceeds the allowed number of taken branches yet stays under the basic threshold, so the basic threshold seems to be the only one needed.<br>

<br></div></div>