Benchmarking some lld revisions

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Wed May 31 16:45:55 PDT 2017


On Sun, May 28, 2017 at 11:38 PM, George Rimar via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> >Now that I have a workstation setup again for doing really stable
> >benchmarks I decided to try a quick experiment and benchmark linking
> >firefox with the last 1000 revisions or so.
> >
> >I used the last 1000 revision, regardless of what they changed. That
> >way we can get an idea of the fluctuation to expect from measurement
> >bias when unrelated things change.
> >
> >The results are attached. Data has 3 columns. The first one is the
> >revision, the second one is the time and the third is the percentage
> >std dev reported by perf over 10 runs.
>
> Looks great!
>
> >The plot was created with gnuplot:
> >
> >plot "data" using 1:2:($3 * $2 /100) title "time" with errorlines
> >
> >It is interesting to see that the variation from commit to commit is
> >much bigger than the variation in a single commit, but the graph
> >allows us to find some significant commits.
>
> The text representation of the data might have one more column: the change
> from the previous revision in percent. That way it would probably be easy
> to find significant commits even in the text representation.
>

Having a human-readable percentage would be useful when back-referencing
from the visualization to the revision number.
In general though, unless you are looking for a very specific thing (like
the exact revision number of a commit you found from a visualization),
scanning lists of numbers for outliers (or any insight whatsoever) is
extremely unreliable. It is also very time consuming because you are never
sure that you "caught everything" and so will keep scanning the data
looking for things you missed (and will still miss things).
At the very least, if you need to scan a list of numbers like percentage
increase/decrease, I would highly recommend importing it into a spreadsheet
and applying conditional formatting to color large values. Then it can be
scanned relatively quickly and reliably. Even better, just sort based on
the percentage increase/decrease.
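The percent-change column George suggests, combined with the sort-by-magnitude
approach above, can be sketched in a few lines of Python. This is only an
illustration: the sample revisions and timings below are made up, but the
three-column "revision time stddev%" layout matches the format described
earlier in the thread.

```python
# Minimal sketch of computing percent change between consecutive revisions
# and sorting by magnitude. The numbers here are invented for illustration;
# the real data would come from the attached three-column file.
rows = [
    # (revision, link_time_seconds, stddev_percent)
    (303680, 10.02, 0.4),
    (303689, 10.61, 0.3),
    (303700, 10.58, 0.5),
    (303925, 10.11, 0.4),
]

# Percent change of each revision's time relative to the previous one.
changes = []
for prev, cur in zip(rows, rows[1:]):
    pct = (cur[1] - prev[1]) / prev[1] * 100.0
    changes.append((cur[0], pct))

# Sorting by absolute change surfaces the significant commits immediately,
# instead of forcing a manual scan of the whole list.
for rev, pct in sorted(changes, key=lambda c: abs(c[1]), reverse=True):
    print("r%d: %+.2f%%" % (rev, pct))
```

With real data this puts the handful of interesting revisions at the top,
which is exactly the "sort based on the percentage increase/decrease" idea.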

For this reason, I would always recommend printing raw data in the least
human-readable format possible (while still being machine readable), e.g.
for printf I would recommend something horrible like `%.18e` to mitigate
the temptation to look at the raw data without some sort of analysis or
visualization.
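To make the `%.18e` point concrete, here is a small Python sketch (Python's
printf-style formatting uses the same conversion specifiers as C's printf):
full-precision scientific notation is faithful to the measurement and
machine-readable, but deliberately unpleasant to eyeball, while a rounded
format invites exactly the kind of manual scanning argued against above.

```python
# The same measurement printed two ways. The %.18e form preserves the value
# for later analysis tooling but discourages reading the raw dump by eye.
t = 10.613274  # an invented link time, seconds

machine_friendly = "%.18e" % t  # e.g. 1.06...e+01 with 18 fractional digits
human_friendly = "%.2f" % t     # tempting to eyeball, loses precision

print(machine_friendly)
print(human_friendly)
```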

With a good visualization (in this case, a simple scatter plot; for extra
credit, adding error bars like Rafael did), you can instantly see that
there are 3 major revisions to look at and no larger-scale "death by 1000
cuts" progressive slowdown.

-- Sean Silva


>
> > In this case, they were:
> >
> >r303689: The test has --build-id ... --build-id=none, and now the
> >second one is used.
> >
> >r303925: Uses CachedHashStringRef again for comdat signatures.
> >
> >Cheers,
> >Rafael
> >
> >P.S.: It would be truly awesome if someone could setup a bot that does
> >this over various tests.
>
> George.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>