<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">Hey Hal,</div><div class="gmail_quote"><br></div><div class="gmail_quote">The code is fine. The only question is whether these are good heuristics to add.</div>

<div class="gmail_quote"><br></div><div class="gmail_quote">I'm 98% confident about the -1 heuristic, but see below. I'm about 2% confident about the power-of-2 heuristic. Powers of two happen for all sorts of reasons, not just bitfields. I don't know that we can really generalize from this with any realistic hope of accurate behavior downstream.</div>

<div class="gmail_quote"><br></div><div class="gmail_quote">On Mon, Oct 28, 2013 at 1:32 PM, Hal Finkel <span dir="ltr">&lt;<a href="mailto:hfinkel@anl.gov" target="_blank" class="cremed">hfinkel@anl.gov</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":48z" style="overflow:hidden">I really think that this depends on the platform. I have some internal codes for which I know that the -1 case helps. In looking through the (fairly sparse) literature on this topic, I found that PACT'98 paper. That paper suggests a number of heuristics based on an analysis of some set of SPEC benchmarks, many of which we already do. Of those that we don't currently implement, I think that there are three that are worth implementing now: The error-reporting functions (in the other BPI patch I posted), and the two cases in these patches. Both of these make intuitive sense to me, and seem fairly general.<br>

</div></blockquote><div><br></div><div>I'm extremely dubious of the literature here. That's being kind. I think all of the literature here is pulling numbers out of thin air without any realistic connection to optimizing real world application code. =/</div>

<div><br></div><div>I think it would be really useful to justify any further heuristics with benchmarks. FWIW, I'm happy with a story like: "This helps specific benchmark X which is important to real application code, and doesn't cause any regressions on the nightly test suite." I'm not expecting across-the-board improvement (that almost never happens), just that these are motivated by actual code improving rather than just literature results, and that they don't regress code that we know about.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":48z" style="overflow:hidden">

<br>

Can you see if there is an effect of these (positive or negative) on ARM? I did not see much of an effect on my x86 build machine (which is really too noisy to run these kinds of comparative benchmarks). On the other hand, on my embedded PPC cores, getting better block placement has a larger effect.</div>

</blockquote></div><br>I can try to run them too, but honestly, I'd be more interested in the impact on your specific architecture, given its greater sensitivity to these changes.</div></div>