[LLVMdev] [LNT] Question about results reliability in LNT infrastructure

Sun Jun 30 02:14:50 PDT 2013

Hi Tobi,

First of all, all of this is already http://llvm.org/bugs/show_bug.cgi?id=1367 :)

> The statistical test ministat is performing seems simple and pretty
> standard. Is there any reason we could not do something similar? Or are we
> doing it already and it just does not work as expected?
The main problem with this sort of test is that we cannot trust it unless:
1. The data are normally distributed
2. The sample size is large (say, > 50)

Here we have only 3 points and, no, I won't trust ministat's
t-test and normal-approximation-based confidence bounds. They are *too
narrow* (i.e. the real confidence level is not 99.5% but actually 40-50%,
for example).
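
To make the "too narrow" point concrete, here is a rough simulation
sketch (Python, standard library only). It assumes perfectly normal
data, which is the friendliest case for the method; the helper name
`coverage` and the trial count are made up for illustration. It builds
the usual mean +/- z * s / sqrt(n) interval with the 99.5% normal
quantile and counts how often the true mean falls inside. With n = 3
and exactly normal data the theoretical coverage is only about 89%,
and skewed or heavy-tailed timing data can do worse.

```python
import random
import statistics

def coverage(n, z, trials=10000, seed=0):
    """Fraction of normal-approximation intervals that contain the true mean (0)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(xs) / n
        # Sample standard deviation with the n - 1 denominator.
        sd = (sum((x - mean) ** 2 for x in xs) / (n - 1)) ** 0.5
        half_width = z * sd / n ** 0.5
        if abs(mean) <= half_width:
            hits += 1
    return hits / trials

# Two-sided 99.5% quantile of the standard normal distribution.
Z_995 = statistics.NormalDist().inv_cdf(0.9975)

cov3 = coverage(3, Z_995)    # three runs per configuration, as in the thread
cov50 = coverage(50, Z_995)  # the "large sample" case

print("n=3  coverage:", cov3)
print("n=50 coverage:", cov50)
```

The n = 50 figure should come out close to the nominal level, while
the n = 3 figure falls well short of it.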

I'd ask for:

1. Increasing the sample size to at least 5-10
2. Using the Wilcoxon/Mann-Whitney test
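
A minimal sketch of what the two requests above amount to, in Python
with the standard library only. This is not what LNT or ministat does;
the function names and the timings are hypothetical. It computes the
Mann-Whitney U statistic and an exact two-sided p-value by enumerating
every way of splitting the pooled runs into two groups, which is cheap
at these sample sizes.

```python
from itertools import combinations

def u_statistic(a, b):
    # Number of (x, y) pairs with x < y; a tie counts as one half.
    return sum((x < y) + 0.5 * (x == y) for x in a for y in b)

def mann_whitney_exact(a, b):
    """Return (U, exact two-sided p-value) by full enumeration of relabelings."""
    pooled = list(a) + list(b)
    n, m = len(a), len(b)
    centre = n * m / 2.0  # expected U when both groups share one distribution
    u_obs = u_statistic(a, b)
    observed = abs(u_obs - centre)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        if abs(u_statistic(first, second) - centre) >= observed:
            hits += 1
    return u_obs, hits / total

# Hypothetical run times in seconds; the two groups do not overlap at all.
base3 = [1.02, 1.05, 1.03]
new3 = [0.91, 0.93, 0.92]
base5 = [1.02, 1.05, 1.03, 1.04, 1.06]
new5 = [0.91, 0.93, 0.92, 0.94, 0.90]

u3, p3 = mann_whitney_exact(base3, new3)
u5, p5 = mann_whitney_exact(base5, new5)
print("3 vs 3 runs: U =", u3, "p =", p3)
print("5 vs 5 runs: U =", u5, "p =", p5)
```

With 3 runs per side there are only 20 possible splits, so even fully
separated samples cannot get below p = 2/20 = 0.1. With 5 runs per
side there are 252 splits and the smallest possible p is 2/252, which
is why the larger sample size matters for this test as well.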

What do you think?

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University