[LLVMdev] [LNT] Question about results reliability in LNT infrastructure

Renato Golin renato.golin at linaro.org
Sun Jun 30 11:30:13 PDT 2013


On 30 June 2013 10:14, Anton Korobeynikov <anton at korobeynikov.info> wrote:

> 1. Increasing sample size to at least 5-10
>

That's not feasible on slower systems. A single data point takes 1 hour on
the fastest ARM board I can get (a Chromebook). Taking one sample at each of
10 different commits gives you similar accuracy, provided behaviour doesn't
change across them, and you can rely on the 10-sample blocks before and
after each change showing the same result.

What won't happen is one commit making it truly faster and the very next one
slow again (or the other way around). So all we need to determine, for each
commit, is whether it was the one that made all subsequent runs slower or
faster, and we can determine that by looking at several commits after the
culprit, since the probability that another (unrelated) commit also changes
the behaviour is small. A sketch of that before/after comparison follows
below.
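
Roughly, and only as an illustration (this isn't LNT code; the data layout,
window size, and threshold are all assumptions), the before/after check over
one sample per commit could look like this in Python:

    # Hypothetical sketch: `times` holds one benchmark time per commit, in
    # commit order. Flag commits where the mean of the following `window`
    # runs differs from the mean of the preceding `window` runs by more
    # than `threshold` (relative).
    def find_culprits(times, window=10, threshold=0.05):
        culprits = []
        for i in range(window, len(times) - window + 1):
            before = sum(times[i - window:i]) / window
            after = sum(times[i:i + window]) / window
            if abs(after - before) / before > threshold:
                culprits.append(i)
        return culprits

A real shift will typically flag a small run of indices around one commit;
the point where the before/after difference is largest is the best
candidate. The point is just that one sample per commit is enough, as long
as you compare blocks of runs rather than individual runs.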

This is why I proposed something like moving averages. Not because it's the
best statistical model, but because it works around a concrete problem we
have. I don't care which model/tool you use, as long as it doesn't mean
I'll have to wait 10 hours for a result, or sift through hundreds of
commits every time I see a regression in performance. What that will do,
for sure, is make me ignore small regressions, since they won't be worth
the massive work to find the real culprit.
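
For completeness, a moving average over the same one-sample-per-commit
series would be something like the following (again only a sketch, with an
assumed window of 5):

    # Hypothetical sketch: smooth the raw per-commit times so that a single
    # noisy run doesn't look like a regression.
    def moving_average(times, window=5):
        smoothed = []
        for i in range(len(times)):
            chunk = times[max(0, i - window + 1):i + 1]
            smoothed.append(sum(chunk) / len(chunk))
        return smoothed

A real regression then shows up as a sustained step in the smoothed series,
rather than a one-off spike in the raw numbers.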

If I had a team of 10 people just to look at regressions all day long, I'd
ask them to make a proper statistical model and go do more interesting
things...

cheers,
--renato

