[LLVMdev] [LNT] Question about results reliability in LNT infrastructure

Sun Jun 30 12:05:40 PDT 2013

Hi Tobias,

> I trust your knowledge about statistics, but am wondering why ministat (and
> its t-test) is promoted as a statistically sane tool for benchmarking
> results.
I do not know. Perhaps ask the author of ministat?

> Is the use of the t-test for benchmark results a bad idea in
> general?
No, not in general. But one should be aware of the assumptions of the
underlying theory. The t-test is fine as long as the data follow a
normal distribution (in which case the test is exact) or the sample
size is large (in which case the mean is asymptotically normal by the
CLT).
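
To make the quantities concrete, here is a minimal sketch of the two-sample
t statistic. It uses Welch's unequal-variance variant, which is my choice for
illustration and not necessarily what ministat computes; the function name
welch_t is hypothetical. The p-value would come from Student's t distribution
with the returned degrees of freedom, and that step is only justified under
the assumptions above (normal data or a large sample).

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    Assumption: each sample has at least two points and nonzero variance.
    """
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    # Difference of means in units of its estimated standard error.
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Made-up timings, purely for illustration.
    t, df = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
    print(t, df)
```

With very few runs per side the degrees of freedom are tiny and the result
leans entirely on the normality assumption.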

> Would ministat be a better tool if it implemented the
> Wilcoxon/Mann-Whitney test?
The precision would be much better for small sample sizes (say, in the
range 10-50).
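
As a rough sketch of what such a test involves, the Mann-Whitney U statistic
can be computed with the standard library alone. The function name
mann_whitney_u is hypothetical, and the p-value below uses the normal
approximation without tie or continuity correction, which is a simplification
I am assuming for brevity; an exact p-value would be preferable for very
small samples.

```python
import math
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the normal approximation.

    Assumption: no tie correction and no continuity correction are applied.
    """
    n1, n2 = len(a), len(b)
    # Count pairs (x, y) with x > y; a tie counts as half a pair.
    u1 = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    # Use the smaller of the two possible U values.
    u = min(u1, n1 * n2 - u1)
    # Mean and standard deviation of U under the null hypothesis (no ties).
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # z <= 0 because u <= mean_u
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p


if __name__ == "__main__":
    # Made-up run times: every "after" run is slower than every "before" run.
    before = [10.0 + 0.1 * i for i in range(10)]
    after = [12.0 + 0.1 * i for i in range(10)]
    print(mann_whitney_u(before, after))
```

The test only uses the ordering of the measurements, so it does not rely on
the run times being normally distributed.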

But in any case, never trust anyone who claims to reliably estimate
the variance from 3 data points.
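
To put a number on that, here is a small sketch of the usual chi-square
confidence interval for the variance, assuming the data are normal. With 3
points there are 2 degrees of freedom, and the chi-square distribution with
2 degrees of freedom is exponential with mean 2, so its quantiles have a
closed form. The function name variance_ci_3_points and the sample values are
made up for illustration.

```python
import math
from statistics import variance


def variance_ci_3_points(samples, level=0.95):
    """Confidence interval for the variance of exactly 3 normal samples."""
    if len(samples) != 3:
        raise ValueError("this sketch only handles exactly 3 samples")
    s2 = variance(samples)  # unbiased sample variance, 2 degrees of freedom
    alpha = 1 - level
    # Chi-square(2) quantile at probability p is -2 * ln(1 - p).
    q_low = -2 * math.log(1 - alpha / 2)  # quantile at p = alpha / 2
    q_high = -2 * math.log(alpha / 2)     # quantile at p = 1 - alpha / 2
    # Interval for sigma^2: ((n - 1) * s2 / q_high, (n - 1) * s2 / q_low).
    return 2 * s2 / q_high, 2 * s2 / q_low


if __name__ == "__main__":
    low, high = variance_ci_3_points([10.0, 10.2, 9.9])
    print(low, high, high / low)
```

At the 95% level the upper bound is more than a hundred times the lower
bound, whatever the three values are.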

> Is there anything stopping us from implementing such a test and exposing its
> results in the UI?
I do not think so...

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University