[libcxx-commits] [libcxx] [llvm] [openmp] [lit] Add a flag to disable lit time tests (PR #98270)

Fri Jul 12 00:23:23 PDT 2024

jh7370 wrote:

> > Orthogonal to the real issue here, but have you considered the additional overhead of spinning up lit per test? I would expect (but it's always possible I'm wrong) that spinning up lit 10 times to run 10 tests individually would be slower than spinning it up once to run them all.
> 
> It's possible that the lit test, itself, could be slower by having to spin it up N times for N tests, but the net result for us by using a distributed build system like buck should still be faster since we would've distributed these tests across multiple machines and not be limited by the cores on a single machine. If we had modeled our testing approach similar to lit's model of taking in a test suite, then we would be sacrificing flexibility of being able to run a single lit test (for those who may only want to debug the single test during development)

Thanks for the explanation; it makes sense. I hadn't realised you were distributing the lit tests across multiple machines.

> > One random idea that I haven't really given much thought to how effective it would be (or even whether it would actually work) would be to avoid a single centralised file and instead have a file per test, that are read in following test discovery to form a combined database of sorts for determining order within that set of tests.
> 
> Yep this would be a potential solution. The problem I see here with this approach is the lit overhead for doing I/O just to write a timestamp for each file. To better illustrate, LLVM has ~60000 lit tests. Having to do I/O for ~60000 files might not scale and could end up further slowing down the current behavior (which wouldn't be ideal); whereas, disabling it via a flag would maintain the status quo.
> 
> I'm open to other potential forms of solution. And currently, disabling via a flag is the best compromise (even though it may seem like a workaround).

Ultimately, this is a form of database, so we could look into things like SQL, but I suspect the overhead isn't worth it in this context. There will also always be a risk of contention when accessing the database, which would be especially bad in your use case, since you'd be accessing it once per test rather than once per lit test suite. Given the right concurrent database solution (i.e. something that can read and write in parallel as long as different threads don't access the same database entry), it might work reasonably well. However, I'm not a database expert, so I can't say whether such a system would be fast or easy to integrate into lit.
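For illustration only, here is a rough sketch of what a database-backed timing store might look like using Python's standard `sqlite3` module. The table layout, the `record_time` and `slowest_first` helpers, and the file name are all hypothetical; none of this is part of lit. Note that SQLite allows only one writer at a time for the whole database, so it is not the per-entry-parallel system described above, and the `timeout` argument only controls how long a writer waits for the lock.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

def record_time(db_path, test, seconds):
    """Insert or update one test's time (hypothetical helper, not a lit API)."""
    # timeout: seconds to wait for another writer's lock before raising.
    with closing(sqlite3.connect(db_path, timeout=5.0)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS times "
                "(test TEXT PRIMARY KEY, seconds REAL)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO times VALUES (?, ?)", (test, seconds)
            )

def slowest_first(db_path):
    """Return the recorded test names ordered from slowest to fastest."""
    with closing(sqlite3.connect(db_path, timeout=5.0)) as conn:
        rows = conn.execute(
            "SELECT test FROM times ORDER BY seconds DESC"
        ).fetchall()
    return [row[0] for row in rows]

with tempfile.TemporaryDirectory() as root:
    db = os.path.join(root, "times.db")
    # One call per test, mirroring the one-lit-invocation-per-test model.
    record_time(db, "a.test", 0.5)
    record_time(db, "b.test", 12.0)
    record_time(db, "a.test", 0.7)  # a later run overwrites the earlier entry
    order = slowest_first(db)

print(order)  # ['b.test', 'a.test']
```

Every invocation opens the database, takes the write lock, and commits, which is exactly the once-per-test access pattern where contention would show up.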

Side note: unless you have at least as many build threads/processes across all the machines as you have tests, simply disabling the test times file means losing its benefit, i.e. you could end up with a long tail in which a small number of very slow tests are executed right at the end of the test set.
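To make the long-tail effect concrete, here is a toy simulation. It assumes a simple greedy model in which each test goes to whichever worker frees up first; the durations and worker count are made up, and this is not a model of lit's or buck's actual scheduler.

```python
import heapq

def makespan(durations, workers):
    """Total wall time when each test is handed to the first free worker."""
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for duration in durations:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)

# Hypothetical durations in seconds: twenty fast tests and two slow ones.
durations = [1.0] * 20 + [10.0, 10.0]
workers = 4

# Without timing data, the slow tests may happen to be scheduled last.
slow_last = makespan(durations, workers)
# With timing data, the slowest tests can be started first.
slow_first = makespan(sorted(durations, reverse=True), workers)

print(slow_last, slow_first)  # 15.0 10.0
```

In this made-up case the total work is 40 seconds over 4 workers, so 10 seconds is the best possible; starting the slow tests last leaves two workers idle while the other two finish the tail.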

https://github.com/llvm/llvm-project/pull/98270