<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Apr 19, 2013 at 1:13 PM, Renato Golin <span dir="ltr"><<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="im">On 19 April 2013 17:48, Török Edwin <span dir="ltr"><<a href="mailto:edwin@etorok.net" target="_blank">edwin@etorok.net</a>></span> wrote:<br>

<div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><span style="color:rgb(34,34,34)">Otherwise what might seem like a 20% improvement</span><br></div>

could very well be just a 0.2% improvement in practice.<br></blockquote><div></div></div><br></div></div><div class="gmail_extra">This is (maybe to a lesser extent) what happens with most of our benchmarks, and running them 3 times doesn't add much confidence but makes the run much slower. I end up treating the test-suite as a functionality and correctness test, rather than as useful benchmark data.</div>


<div class="gmail_extra"><br></div><div class="gmail_extra">I agree it would be great to have a decent benchmark infrastructure for LLVM, but I'm not sure the test-suite is the appropriate place. Maybe a different type of run that cranks the inputs up to 11 and lets the applications run for longer, to be run once a week or so, wouldn't be a bad idea, though.</div>


<div class="gmail_extra"><br></div></div></blockquote><div style><br></div><div style>A simple benchmark that we run "all the time" is how long it takes for us to compile ourselves. Do we track this?</div><div style>

<br></div><div style>-- Sean Silva </div></div></div></div>