<div dir="ltr">Right - you usually won't see a normal distribution in the noise of test results. You'll see results clustered around the lower bound, with a long tail of progressively slower results. Depending on how many samples you take, it might be appropriate to take the mean of the best 3, for example - but the general approach of taking the fastest N does have some basis in any case.<br>
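<br>To make the "fastest N" idea concrete, here is a minimal sketch comparing aggregates on timings that cluster near a lower bound with a long slow tail. The helper names (mean_of_fastest, simulate_timings) and the exponential noise model are illustrative assumptions, not anything taken from LNT.<br>

```python
import random
import statistics


def mean_of_fastest(samples, n=3):
    """Return the mean of the n smallest samples (n is clamped to the sample count)."""
    if not samples:
        raise ValueError("need at least one sample")
    count = max(1, min(n, len(samples)))
    return statistics.mean(sorted(samples)[:count])


def simulate_timings(count, floor=1.0, noise_scale=0.05, seed=0):
    """Hypothetical timing model: a fixed lower bound plus one-sided exponential noise."""
    rng = random.Random(seed)
    return [floor + rng.expovariate(1.0 / noise_scale) for _ in range(count)]


if __name__ == "__main__":
    samples = simulate_timings(10)
    print("min            :", min(samples))
    print("mean of best 3 :", mean_of_fastest(samples, 3))
    print("median         :", statistics.median(samples))
    print("mean           :", statistics.mean(samples))
```

With one-sided noise like this, min and the mean of the best 3 stay close to the lower bound, while the plain mean is pulled up by the slow tail.<br>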

<br>Not necessarily the right answer, the only right answer, etc.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jan 16, 2014 at 5:05 PM, Chris Matthews <span dir="ltr">&lt;<a href="mailto:chris.matthews@apple.com" target="_blank">chris.matthews@apple.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think the idea with min is that it would be the ideal fastest run. The other runs were 'slowed' by system noise or something else.<br>


<div class="HOEnZb"><div class="h5"><br>

<br>

On Jan 16, 2014, at 5:03 PM, Tobias Grosser &lt;<a href="mailto:tobias@grosser.es">tobias@grosser.es</a>&gt; wrote:<br>

<br>

> Hi,<br>

><br>

> I am currently investigating how to ensure that LNT only shows relevant performance regressions for the -O3 performance tests I am running.<br>

><br>

> One question that came up here is why the default aggregate function for LNT is 'min' instead of 'mean'. This looks a little surprising from a statistical point of view, and judging from my test results, picking 'min' also seems to be an inferior choice.<br>


><br>

> For all test runs I have looked at, picking mean largely reduces the run-over-run changes reported due to noise.<br>

><br>

> See this run, e.g.:<br>

><br>

> If we use the median, we get just one change reported:<br>

><br>

> <a href="http://llvm.org/perf/db_default/v4/nts/20661?num_comparison_runs=10&test_filter=&test_min_value_filter=&aggregation_fn=median&compare_to=20659&submit=Update" target="_blank">http://llvm.org/perf/db_default/v4/nts/20661?num_comparison_runs=10&test_filter=&test_min_value_filter=&aggregation_fn=median&compare_to=20659&submit=Update</a><br>


><br>

> If we use min, we get eight reports, one claiming a performance reduction of over 100% for a case that is really just pure noise. I am planning to look into using better statistical methods. However, as a start, could we switch the default to 'mean'?<br>

><br>

> Cheers,<br>

> Tobias<br>

<br>

<br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

</div></div></blockquote></div><br></div>