[llvm-dev] [cfe-dev] RFC: End-to-end testing

Thu Oct 10 15:29:22 PDT 2019

On Thu, 10 Oct 2019 at 22:26, David Greene <dag at cray.com> wrote:
> That would be a shame.  Where is test-suite run right now?  Are there
> bots?  How are regressions reported?

There is no shame in making the test-suite better.

We do have bots running them in full CI for multiple targets, yes.
Regressions are reported and fixed. The benchmarks are also followed
by a smaller crowd, and regressions on those are also fixed (but more
slowly).

I'm not proposing to move e2e off to a dark corner, I'm proposing to
have a scaled testing strategy that can ramp up and down as needed,
without upsetting the delicate CI and developer balance.

Sure, e2e tests are important, but they need to catch bugs that the
other tests don't catch, not be our front-line safety net.

We have planned to do incremental testing with buildbots for years,
and Apple has done something like that in their GreenBots. We have
talked about moving that upstream, but time spent on testing is really
scant.

A few years back there was a big effort to clean up the LIT tests,
removing duplicates and speeding up inefficient code, and a lot of
tests were removed. If we just add the e2e tests today and they never
catch anything relevant, they'll be the next candidates to go.

The delta that e2e can test is really important, but really small and
fairly rare. So running it less frequently (every few dozen commits)
will most likely be enough for anything we can possibly respond to
upstream.

My main point is that we need to be realistic about what we can do
upstream, which is very different from what a big company can do
downstream.

Past experiences have, over and over, shown us that new shiny CI toys
get rusty, noisy, and dumped.

We want to have the tests in a place where anyone can run them, where
the bots *will* run them periodically, and where they don't annoy
developers often enough to become a target.

In a nutshell:
 * We still need src2src tests, to ensure connection points (mainly
IR) are canonical and generic, avoiding hidden contracts
 * We want the end2end tests to *add* coverage, not overlap with or
replace existing tests
 * We don't want those tests to become a burden to developers by
breaking on unrelated changes and making bots red for obscure reasons
 * We don't want them to be a burden to our CI efforts, slowing down
regular LIT testing and becoming a target for removal

The orders of magnitude for how many commits we want between test runs are:
 * LIT base, linker, compiler-RT, etc: ~1
 * Test-suite correctness, end-2-end: ~10
 * Multi-stage build, benchmarks: ~100

We already have that ratio (somewhat) with buildbots, so it should be
simple to add e2e to the test suite at the right scale.
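
To make the ratio concrete, here is a minimal sketch of that cadence
in Python. The tier names, the exact intervals (1, 10, 100) and the
helper functions are hypothetical, picked only to match the rough
orders of magnitude above; this is not how the buildbots are actually
configured.

```python
# Hypothetical tiers and intervals, taken from the rough orders of
# magnitude listed above. This is an illustration, not a bot config.
TIER_INTERVALS = {
    "lit-base": 1,                 # LIT base, linker, compiler-RT, etc.
    "test-suite-e2e": 10,          # test-suite correctness, end-to-end
    "multistage-benchmarks": 100,  # multi-stage build, benchmarks
}


def tiers_due(commit_index, intervals=TIER_INTERVALS):
    """Return the tiers that should run at this (1-based) commit count."""
    return [name for name, every in intervals.items()
            if commit_index % every == 0]


def runs_per_tier(num_commits, intervals=TIER_INTERVALS):
    """Return how many times each tier runs over num_commits commits."""
    return {name: num_commits // every
            for name, every in intervals.items()}
```

Under these assumed intervals, 1000 commits would trigger the e2e tier
100 times and the benchmark tier 10 times, while regular LIT testing
still runs on every commit.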

> > The last thing we want is to create direct paths from front-ends to
> > back-ends and make LLVM IR transformation less flexible.
>
> I'm not sure I follow.  Can you explain this a bit?

Right, I had written a long paragraph about it but deleted it in the
final version of my email. :)

The main point is that we want to avoid hidden contracts between the
front-end and the back-end.

We want to make sure all front-ends can produce canonical IR, and that
the middle-end can optimise the IR and that the back-end can lower
that to asm in a way that runs correctly on the target. As we have
multiple back-ends and are soon to have a second official front-end,
we want to make sure we have good coverage on the multi-step tests
(AST to IR, IR to asm, etc).

If we add e2e tests that are not covered by piece-wise tests, we risk
losing that clarity.

I think e2e tests have to expose more complex issues, like front-end
changes, pass manager order, optimisation levels, linking issues, etc.
They can check for asm, run on the target, or both. In the test-suite
we have more budget to do a more complete job of it than in LIT
check-all.

Hope this helps.

cheers,
--renato