Using lnt to track lld performance

Rafael Avila de Espindola via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 31 11:08:18 PDT 2017


Right now performance tracking in lld is a manual and very laborious
process.

I decided to take some time off main development to automate it a
bit.

The current state is that we have a bunch of tests in
https://s3-us-west-2.amazonaws.com/linker-tests/lld-speed-test.tar.xz. Each
test is a program that lld can link (clang, chrome, firefox, etc.).

Right now there is just a hackish run.sh that links every program with
two versions of lld to compare the performance. Some programs are also
linked in different modes. For example, we compare chrome with and
without icf.

I want to improve this in two ways:

* Make the directory structure uniform. What I am currently prototyping
  is that each program gets its own directory, which can contain multiple
  response*.txt files. Each response file is an independent test. The
  reason for allowing multiple response files is to allow for variants
  (see the runner sketch after this list):
  * response.txt
  * response-icf.txt
  * response-gc.txt

* Instead of just comparing lld's run time, parse and save various
  metrics to a database.
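
To make the first point concrete, here is a minimal sketch of a runner
walking such a layout. The test root, the lld path and the names below are
assumptions for illustration; only the one-directory-per-program and
response*.txt conventions come from the proposal itself:

    # Minimal sketch of a runner walking the proposed layout. The test root,
    # the lld path and the naming are assumptions for illustration only.
    import glob
    import os
    import subprocess

    def run_benchmarks(test_root, lld_binary):
        for program_dir in sorted(glob.glob(os.path.join(test_root, "*/"))):
            program = os.path.basename(os.path.dirname(program_dir))
            for response in sorted(glob.glob(os.path.join(program_dir, "response*.txt"))):
                # Each response file is an independent test: response-icf.txt in
                # the chrome directory becomes the "chrome-icf" benchmark.
                variant = os.path.splitext(os.path.basename(response))[0]
                name = program + variant.replace("response", "", 1)
                # lld, like other linkers, reads extra arguments from @file.
                subprocess.run([lld_binary, "@" + os.path.basename(response)],
                               cwd=program_dir, check=True)
                print("linked", name)

    # Example (paths are hypothetical):
    # run_benchmarks("lld-speed-test", "/path/to/ld.lld")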

The database hierarchy would be:

* For each llvm revision there will be multiple benchmarks (chrome,
  chrome-icf, clang).
* For each benchmark, there will be multiple metrics (lld run time,
  branches, output size, etc.).
* Some metrics will include multiple measurements. The output size
  should always be the same, but we can have multiple runs with slightly
  different times, for example.
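
As an illustration (all names and numbers below are made up, and this is
not lnt's actual schema), the data for one revision would nest roughly
like this:

    # Hypothetical nesting: revision -> benchmark -> metric -> measurements.
    # Made-up names and numbers, purely to illustrate the hierarchy above.
    results = {
        "llvm_revision": "r316000",
        "benchmarks": {
            "chrome": {
                "link_time_seconds": [98.1, 97.8, 98.4],  # several runs, slightly different
                "branches": [412345678],
                "output_size_bytes": [1234567890],        # should be identical across runs
            },
            "chrome-icf": {
                "link_time_seconds": [105.2, 104.9, 105.6],
                "output_size_bytes": [1200000000],
            },
            "clang": {
                "link_time_seconds": [12.3, 12.4, 12.2],
                "output_size_bytes": [98765432],
            },
        },
    }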

Not too surprisingly, the above structure is remarkably similar to what lnt
uses:
http://llvm.org/docs/lnt/importing_data.html#importing-data-in-a-text-file

So my idea is to first write a python script that runs all the
benchmarks in the above structure and submits the results to an lnt
server. This would replace the existing run.sh.
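
A rough sketch of that script, under some loud assumptions: the timing is
plain wall clock (counters such as branches would come from something like
perf instead), the response files are assumed not to pass -o themselves,
and the "benchmark.metric value" output is only a stand-in for whatever
format lnt's import path actually expects (see the link above):

    # Rough sketch of the proposed runner, not a working implementation.
    # Wall-clock timing only; counters such as branches would need perf or similar.
    import os
    import subprocess
    import time

    def measure(lld_binary, response_file, out_file, repetitions=3):
        """Link `repetitions` times; return the wall-clock times and output size."""
        times = []
        for _ in range(repetitions):
            start = time.monotonic()
            # Assumes the response file does not pass -o itself.
            subprocess.run([lld_binary, "@" + response_file, "-o", out_file], check=True)
            times.append(time.monotonic() - start)
        return times, os.path.getsize(out_file)

    def write_report(results, path):
        """Write one 'benchmark.metric value' line per measurement.

        This is only a stand-in; the real script would emit whatever lnt's
        import tooling expects, or talk to the server directly.
        """
        with open(path, "w") as f:
            for benchmark, (times, size) in sorted(results.items()):
                for t in times:
                    f.write("%s.link_time %f\n" % (benchmark, t))
                f.write("%s.size %d\n" % (benchmark, size))

    # Hypothetical usage:
    # results = {"chrome": measure("/path/to/ld.lld", "chrome/response.txt", "chrome.out")}
    # write_report(results, "report.txt")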

For comparing just two revisions locally, this will be a bit more work.

But it should allow us to set up a bot that continually submits lld
performance results.

For those working on lld, does the above sound like a good idea?

For those working on lnt, is it a good match for the above? The main use
case I would like to support is building graphs of various metrics over
llvm revisions. For example:

* chrome's output size over the last 1000 llvm revisions.
* firefox link time (with error bars) over the last 1000 llvm revisions.

Cheers,
Rafael

