[LLVMdev] [LNT] Question about results reliability in LNT infrastructure

Chris Matthews chris.matthews at apple.com
Sun Jun 30 18:04:39 PDT 2013


I think we need to be using tests with the fewest assumptions possible.  I don’t think there are many assumptions that would hold for all the benchmarks.

Chris Matthews
chris.matthews at apple.com
phone: 36335


On Jun 30, 2013, at 12:05 PM, Anton Korobeynikov <anton at korobeynikov.info> wrote:

> Hi Tobias,
> 
>> I trust your knowledge about statistics, but am wondering why ministat (and
>> its t-test) is promoted as a statistically sane tool for benchmarking
>> results.
> I do not know... Ask the author of ministat?
> 
>> Is the use of the t-test for benchmark results a bad idea in
>> general?
> Not in general. But one should be aware of the assumptions of the
> underlying theory. The t-test is fine as long as our data follow the
> normal distribution (in which case the test is exact) or the sample
> size is large (in which case the mean is asymptotically normal by the
> CLT).
> 
>> Would ministat be a better tool if it implemented the
>> Wilcoxon/Mann-Whitney test?
> The precision would be much better for small sample sizes (say, in the
> range 10-50).
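
To make the Wilcoxon/Mann-Whitney suggestion concrete, here is a minimal sketch of the rank-sum test in plain Python. The function name and the sample timings are hypothetical, and the p-value uses the normal approximation with a tie correction, which is itself only approximate for very small samples; an exact test would enumerate the rank permutations instead. This is not ministat's or LNT's code.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, two-sided p-value).

    The p-value comes from the normal approximation with tie correction;
    treat it as rough when either sample is very small.
    """
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided

# Hypothetical run times (seconds) for a baseline and a candidate build.
baseline = [10.2, 10.4, 10.1, 10.3, 10.5, 10.2, 10.6, 10.3, 14.9, 10.4]
candidate = [10.0, 9.9, 10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 9.7, 10.0]
u, p = mann_whitney_u(baseline, candidate)
print("U = %.1f, p ~= %.4f" % (u, p))
```

Because the test only looks at ranks, the single 14.9 s outlier in the made-up baseline counts as "the largest value" and nothing more, whereas it would inflate the variance estimate a t-test relies on.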
> 
> But in any case, never trust anyone who claims to reliably estimate
> the variance from 3 data points.
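
The point about 3 data points can be checked with a small simulation. The sketch below assumes normally distributed data with true variance 1; the function name, trial count, and seed are arbitrary choices for illustration. For normal data and n = 3, the sample variance divided by the true variance follows an exponential distribution with mean 1, so roughly 90% of estimates fall anywhere between about 0.05 and 3 times the true value.

```python
import random
import statistics

def variance_spread(n, trials=5000, seed=0):
    """Return the (5th, 95th) percentiles of the sample variance
    over repeated samples of size n drawn from N(0, 1)."""
    rng = random.Random(seed)
    estimates = sorted(
        statistics.variance([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return estimates[int(0.05 * trials)], estimates[int(0.95 * trials)]

for n in (3, 10, 30):
    lo, hi = variance_spread(n)
    print("n=%2d: 90%% of variance estimates lie in [%.2f, %.2f]" % (n, lo, hi))
```

The true variance is 1 in every case, so the width of each printed interval shows directly how much an estimate from that many runs can be off.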
> 
>> Is there anything stopping us from implementing such a test and exposing its
>> results in the UI?
> I do not think so...
> 
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
