[www] r176209 - Add LNT statistics project

David Blaikie dblaikie at gmail.com
Thu Feb 28 12:25:14 PST 2013


On Thu, Feb 28, 2013 at 12:09 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 28 February 2013 19:28, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> I'm still confused as to which things you're talking about. My
>> suggestion is that we can get higher confidence in the performance of
>> tests by running them multiple times.
>
>
> Ok, now I know why I was confused... and ended up confusing you...
>
> My idea was not to run the whole test multiple times to achieve a nice
> mean+sigma, but to increase the number of iterations inside one single run
> of the test.
>
> The big problem I find in benchmarks is that the prologue/epilogue is
> normally more non-deterministic than the test itself, because it involves
> I/O and setting up memory (thus populating caches, etc.). If you run
> multiple times, you'll end up with a small-enough sigma over the whole test,
> but not small enough over the prologue/epilogue.
>
> So, going back all the way to the beginning: I propose to spot the core of
> each benchmark and force all of them to print times of just that core.

Ah, OK - that's more likely what we'll get from using a microbenchmark
suite - one designed for doing timing/etc. in-process over very strictly
scoped sections of code. I think, yes, many of our current test-suite
benchmarks could/should be adapted to such a system. Whether or not every
test would use such infrastructure, I'm not sure - it does require more
maintenance, since you have to modify the code of every benchmark rather
than just integrate its build system.

> If you need to run a few times to normalize memory/caches, do it before you
> start timing. If the benchmark is a set of many calls, wrap the working part
> of main into a separate function and call it many times. In essence, do
> everything to make the preparation and shutdown phases not relevant.
>
> It's not the same as what you're proposing at all, and I see value in both
> propositions. Yours is a bit easier to implement, and may have good enough
> results for the time being. But I think we should design a library to
> measure time on all platforms (easier said than done), so we can have
> multiple timed sections in the same benchmark, and possibly simplify the
> whole infrastructure in the process (by moving that complexity into the
> benchmarks themselves).
>
> Now that I got what you mean, I think it's a simple idea that should be done
> before my proposition, possibly together with my GSoC project.
>
> Sorry for the confusion, I really got it upside-down. ;)

No worries - got there in the end.

- David
