[llvm-dev] Timeout tests timing out

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 21 11:33:17 PDT 2020


On Fri, Sep 18, 2020 at 4:31 PM Dan Liew <dan at su-root.co.uk> wrote:
>
> On Wed, 16 Sep 2020 at 22:24, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > I appreciate the value of the feature - but it's possible the test
> > doesn't pull its weight. Is the code that implements the feature
> > liable to failure/often touched? If it's pretty static/failure is
> > unlikely, possibly the time and flaky failures aren't worth the value
> > of possibly catching a low-chance bug.
>
> I don't have many data points here. I haven't needed to touch the code
> or tests since 2018. I haven't been asked to review any patches
> related to the timeout feature since then so I presume nobody has
> touched it since.
>
>
> > Another option might be to reduce how often/in which configurations
> > the test is run - LLVM_ENABLE_EXPENSIVE_CHECKS presumably only works
> > for code within LLVM itself, and not test cases - but maybe I'm wrong
> > there & this parameter could be used (& then the timing bumped up
> > quite a bit to try to make it much more reliable), or something
> > similar could be implemented at the lit check level?
> >
> > Ah, compiler-rt tests use EXPENSIVE_CHECKS to disable certain tests:
> >
> > ./compiler-rt/test/lit.common.configured.in:set_default("expensive_checks",
> > @LLVM_ENABLE_EXPENSIVE_CHECKS_PYBOOL@)
> > ./compiler-rt/test/fuzzer/large.test:UNSUPPORTED: expensive_checks
> >
> > Could you bump the timeouts a fair bit and disable the tests except
> > under expensive checks?
>
> Only running that test in the "expensive_checks" configuration kind of
> abuses that configuration option because the test is actually not an
> expensive check.

Ah, sorry - my point was to make it expensive: bump the timeout enough
that the test stays reliable even when the machine is under load.
Making the timeout that long makes the test expensive, so classify it
as such.
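
To make that concrete, here's a minimal sketch of the kind of change I
have in mind, assuming the lit test suite grew an "expensive_checks"
feature the same way compiler-rt defines one (the config attribute and
CMake plumbing below are illustrative assumptions, not existing code):

  # Hypothetical addition to llvm/utils/lit/tests/lit.cfg, assuming the
  # site config is taught to export LLVM_ENABLE_EXPENSIVE_CHECKS the way
  # compiler-rt's lit.common.configured.in already does:
  if getattr(config, "llvm_enable_expensive_checks", False):
      config.available_features.add("expensive_checks")

  # And at the top of utils/lit/tests/shtest-timeout.py, alongside a
  # much larger --timeout value in its RUN lines:
  # REQUIRES: expensive_checks

That keeps the coverage for bots that opt in to expensive checks while
taking it off the hot path for everyone else.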

> It's just that the test is sensitive to the
> performance of the running system. If the system is under heavy load
> when the test is running it's more likely to fail.
>
> One thing we could do to remove fragility in the test is to remove the
> running of `short.py` in the test. This is only invoked to check that
> it's possible for a command to run to completion in the presence of a
> fixed timeout. If we can live without testing that part (i.e. we only
> test that a timeout can be reached) then the test should be much more
> robust.

If you're on board with that, I think it's probably a reasonable
tradeoff between test coverage, reliability, and development time.
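
If it helps, here's a rough sketch of the coverage that would remain,
loosely modeled on the existing RUN lines in
utils/lit/tests/shtest-timeout.py (the paths, prefixes and CHECK lines
here are illustrative, not the real file's contents; --timeout is lit's
actual option):

  # RUN: not %{lit} -j 1 -v --timeout 1 \
  # RUN:   %{inputs}/shtest-timeout/infinite_loop.py \
  # RUN:   | FileCheck --check-prefix=CHECK-INF %s
  #
  # CHECK-INF: TIMEOUT: {{.*}}infinite_loop.py

i.e. we'd still verify that a runaway command gets killed and reported
as TIMEOUT, and only drop the load-sensitive check that short.py
finishes within the limit.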

> If for some reason that's not enough another thing that might help
> with this problem is to not run the lit tests at the same time as all
> the other tests. The hope here is that the lit tests would run while
> the system load is lower. We could do this by changing the build
> system to
>
> * Make all `check-*` (except `check-lit`) targets depend on the
> `check-lit` target. This would ensure that lit gets tested before we
> test everything else in LLVM.
> * Remove the lit test directory from inclusion in all `check-*`
> targets (except `check-lit`) so we don't test lit twice.
>
> This also has the nice benefit of testing lit **before** we use it to
> test everything else in LLVM. Today we don't do that; we just run all
> the tests in one big lit invocation (i.e. we test lit at the same time
> that we are using it). It does have some downsides, though:
>
> * If any of the lit tests fail we won't run the rest of the tests so
> you won't get the results of running the other tests until the lit
> tests are fixed.
> * When running any of the `check-*` targets (apart from `check-lit`)
> you have to wait for the lit tests to run before any other tests run.

I reckon that's probably not the best way to go - it wouldn't totally
fix this bug (the buildbot might be busy for other reasons, e.g.
running some regular maintenance, or the lit tests might still saturate
the cores and slow the test down), and it would slow down test
iteration by adding another "pinch point" in the build graph: waiting
for the last of the lit tests to finish (a period that reduces core
utilization, as the last few tests wind down and no new work is
scheduled) before starting all the cores up again to run the next batch
of tests.

- Dave

>
> What do you think?
>
> > On Wed, Sep 16, 2020 at 9:31 PM Dan Liew <dan at su-root.co.uk> wrote:
> > >
> > > Hi David,
> > >
> > > Unfortunately writing a reliable test is tricky given that the
> > > functionality we're trying to test involves timing. I would advise
> > > against disabling the test entirely because it actually tests
> > > functionality that people use. I'd suggest bumping up the time limits.
> > > This is what I've done in the past. See
> > >
> > > commit 6dfcc78364fa3e8104d6e6634733863eb0bf4be8
> > > Author: Dan Liew <dan at su-root.co.uk>
> > > Date:   Tue May 22 15:06:29 2018 +0000
> > >
> > >     [lit] Try to make `shtest-timeout.py` test more reliable by using a
> > >     larger timeout value. This really isn't very good because it will
> > >     still be susceptible to machine performance.
> > >
> > >     While we are here also fix a bug in validation of
> > >     `maxIndividualTestTime` where previously it wasn't checked if the
> > >     type was an int.
> > >
> > >     rdar://problem/40221572
> > >
> > >     llvm-svn: 332987
> > >
> > > HTH,
> > > Dan.
> > >
> > > On Wed, 16 Sep 2020 at 09:37, David Blaikie <dblaikie at gmail.com> wrote:
> > > >
> > > > Ping on this
> > > >
> > > > On Wed, Sep 9, 2020 at 8:27 PM David Blaikie <dblaikie at gmail.com> wrote:
> > > > >
> > > > > The clang-cmake-armv8-lld (linaro-toolchain owners) buildbot is timing out trying to run some timeout tests (Dan Liew author):
> > > > >
> > > > > Pass: http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/5672
> > > > > Fail: http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/5673
> > > > >
> > > > > Is there anything we can do to the buildbot? Or the tests? (bump up the time limits or maybe remove the tests as unreliable?)

