[llvm-dev] distributed lit testing

Fri Mar 12 13:31:29 PST 2021

I also have
https://github.com/nico/llvm-project/commit/7246393c6bbc270044641415ffb0db93ffee3e29
in a local branch, which makes it possible and easy to zip up all build
artifacts and test inputs needed to run tests on a remote machine.

With this, you can run check-llvm, check-clang, check-lld etc in parallel
(sharded per test suite too) -- but you're limited by your uplink speed.

(Also, assumes GN build, but the idea should transfer to cmake fine.)
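
A rough Python sketch of the zip-up idea, for illustration only. It is not
what the commit above does; `bundle_for_remote` and the paths in the usage
note are made-up names, and it assumes you already know which files a test
suite needs.

```python
import zipfile
from pathlib import Path


def bundle_for_remote(root, relative_paths, archive_path):
    """Zip the listed files and directories (relative to root).

    Returns the sorted list of member names stored in the archive.
    """
    root = Path(root)
    members = {}
    for rel in relative_paths:
        source = root / rel
        if source.is_dir():
            # Directories contribute every regular file below them.
            files = [p for p in source.rglob("*") if p.is_file()]
        else:
            files = [source]
        for path in files:
            # Keying by archive name drops duplicates from overlapping inputs.
            members[path.relative_to(root).as_posix()] = path
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(members):
            archive.write(members[name], name)
    return sorted(members)
```

Something like `bundle_for_remote("out/build", ["bin", "test"], "bundle.zip")`
(placeholder paths) would then produce one archive to upload per shard.
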

On Fri, Mar 12, 2021 at 2:41 PM Petr Hosek <phosek at google.com> wrote:

> Thank you for sharing your experience Sam! I'd be interested in taking a
> look at your test runner if it's something you could publish.
>
> I started looking into this topic recently since we're now looking into a
> way to run lit tests on Fuchsia. I started experimenting with the remote
> execution support in libc++ but using SCP and SSH for each test doesn't
> really scale.
>
> In Fuchsia, the unit of distribution is a package that's completely
> hermetic. We then run these packages as components, where each component
> has its own filesystem and doesn't have any unnecessary privileges. It's
> similar to containers in many ways.
>
> The idea I got was to extend lit to separate configuration from execution,
> which would allow us to package up all tests on the host, push them to the
> target and run each of them as a separate component using our test runner
> (we already have a Fuchsia test runner that runs tests as components).
>
> It sounds very similar to what you already did and I'd be interested in
> seeing if we could reuse some of your tooling. Furthermore, it'd be great
> if we could come up with a way to support this workflow directly in lit and
> LLVM.
>
> Nico also looked into this area in the past, experimenting with a custom
> test runner written in Go (github.com/nico/glitch) and using Ninja as a
> test runner (reviews.llvm.org/D47506) which may be worth checking.
>
> On Fri, Mar 12, 2021 at 6:56 AM Sam McCall via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi James,
>>
>> We run lit tests at Google using a custom runner on a distributed build
>> system similar to Bazel.
>> In particular we run most of the llvm-project tests both when pulling in
>> upstream revisions, and for any change to our internal repository that
>> touches nearby files.
>>
>> I wanted to share some of our experiences in case they're useful, and in
>> the hope that this project may result in something we can use too :-)
>> I'm being brief here, but happy to provide more details.
>>
>> Our build system wants to run each test in isolation (separate process,
>> sandboxed).
>> Making each test hermetic separates concerns nicely (the same distributed
>> runner is used for all kinds of testing, not just lit).
>> This model is also easier to fit into other containers (e.g. I imagine
>> Ninja could make a good local test driver).
>> Compared to e.g. a custom driver that talks to a custom worker server
>> that runs many tests per subprocess... there's not very much of that we
>> would be able to reuse.
>> I know there are OSS Bazel projects that want to run lit tests that would
>> struggle with this model too.
>>
>> The biggest problem with using the standard lit tool for hermetic tests
>> is that it starts up too slowly when it only has to run a single test.
>> Fundamentally, the slow parts are the config system and the startup of
>> Python programs.
>>
>> We had a much easier time with the config system because we test
>> (mostly) in a single config, so we could flatten it out into a list of
>> features and substitutions.
>> But in a more general system, if we can produce the config data from
>> config logic as a *build* step, then it can be cached in the usual way and
>> simply fed into each test.
>> You'll need to untangle config specific to the machine running the test
>> from config specific to the machine driving the tests.
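
A minimal Python sketch of the "config as a build step" idea, under stated
assumptions: the function names and the JSON layout are invented for
illustration, and substitutions are applied as plain string replacement,
which is simpler than what lit itself does.

```python
import json


def freeze_config(features, substitutions):
    """Serialize already-computed config data to a deterministic JSON string.

    Running this once as a build step lets the result be cached and reused.
    """
    return json.dumps(
        {
            "features": sorted(features),
            "substitutions": [list(pair) for pair in substitutions],
        },
        sort_keys=True,
    )


def load_config(text):
    """Read the frozen data back without executing any config logic."""
    data = json.loads(text)
    return set(data["features"]), [tuple(pair) for pair in data["substitutions"]]


def expand(line, substitutions):
    """Apply (pattern, replacement) pairs in order using plain replacement."""
    for pattern, replacement in substitutions:
        line = line.replace(pattern, replacement)
    return line
```

Each test process then only pays for `load_config` and `expand`. Values that
depend on the machine running the test would still have to be filled in on
that machine rather than frozen on the driver.
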
>>
>> I wrote a hermetic test runner in Go - not my favorite language but it
>> starts up fast and has good subprocess support.
>> It's greatly simplifying to be able to assume that you can fork a real
>> shell and that only limited state (CWD, exported vars) can leak from one
>> RUN line to the next. This works fine for us in practice (but we don't
>> test on Windows).
>> It has some nice features like printing a transcript of the test run,
>> highlighting directives and stderr output, showing pre/post expansion
>> lines, annotating each line with the result.
>> I should be able to share the code for this; it's nothing terribly
>> surprising.
>> It's less than 1000 LOC and runs almost all LLVM tests. IMO it would be
>> worthwhile to keep the lit spec very simple and to remove some of the
>> marginal features that have crept in over the years. We chose to simply
>> drop some tests rather than deal with all the corners.
>> (Before this existed, we ran sed over the lit tests to turn them into
>> shell scripts, which worked but was hard to maintain and to read the output
>> on failure... actually the upstream lit runner has the latter problem too!)
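
To make the "fork a real shell" model concrete, here is a small Python
sketch. It is not the Go runner described above; the names are hypothetical,
it assumes a POSIX `sh` is available, and it skips substitutions, pipefail,
REQUIRES/XFAIL and everything else a real runner needs.

```python
import re
import subprocess

RUN_LINE = re.compile(r"RUN:\s*(.*)$")


def collect_run_lines(test_text):
    """Return the RUN commands, joining lines that end with a backslash."""
    commands, pending = [], ""
    for line in test_text.splitlines():
        match = RUN_LINE.search(line)
        if not match:
            continue
        text = pending + match.group(1).strip()
        if text.endswith("\\"):
            # Continuation: keep accumulating into the same command.
            pending = text[:-1].rstrip() + " "
        else:
            commands.append(text)
            pending = ""
    if pending:
        commands.append(pending.strip())
    return commands


def run_test(test_text, shell="sh"):
    """Run every RUN line in one shell process and stop at the first failure.

    A single shell means only the CWD and exported variables carry over
    from one RUN line to the next.
    """
    script = "".join(
        "{ %s\n} || exit $?\n" % command for command in collect_run_lines(test_text)
    )
    result = subprocess.run([shell, "-c", script], capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

A transcript printer like the one described above could wrap `run_test` and
annotate each command with its result.
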
>>
>> I'm sure I've forgotten things, but I think those were my biggest
>> takeaways. Needing to solve the config problem + the Go dependency were the
>> main reasons I didn't push to make these changes upstream :-(
>> Hope this is useful or maybe at least interesting :-)
>>
>> Cheers, Sam
>>
>> On Wed, Feb 24, 2021 at 9:54 AM James Henderson via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi Victor,
>>>
>>> The lit test framework is the main testing framework used by LLVM. You
>>> can find the source code for it in the LLVM github repository (see in
>>> particular https://github.com/llvm/llvm-project/tree/main/llvm/utils/lit),
>>> and there is documentation available for it on the LLVM website -
>>> https://llvm.org/docs/TestingGuide.html gives the high-level picture of
>>> how LLVM is tested, whilst https://llvm.org/docs/CommandGuide/lit.html is
>>> more focused on lit specifically.
>>>
>>> Examples of where lit is used include the individual test files located
>>> in places like llvm/test, clang/test and lld/test within the github tree.
>>> These test directories include additional configuration files, some of
>>> which are configured when CMake is used to generate the build files for the
>>> LLVM project. If you aren't already familiar with LLVM, I highly recommend
>>> reading up on https://llvm.org/docs/GettingStarted.html, and following
>>> the steps to make sure you can build and run LLVM components locally.
>>>
>>> Lit works as a python process which spawns many child processes, each of
>>> which runs one or more of the tests located in the directory under test.
>>> These tests typically are a sequence of commands that use components of
>>> LLVM that have already been built. You can build the test dependencies and
>>> run the tests by building one of the CMake-generated targets called check-*
>>> (where * might be llvm, lld, clang, etc. to run a test subset, or "all" to
>>> run all known tests). Currently, the tests run in parallel on the user's
>>> machine, using the Python multiprocessing library to do this. There also
>>> exist the --num-shards and related options, which allow multiple computers
>>> to each run a subset of the tests. I am not too familiar with how this
>>> option is used in practice, but I believe it requires the computers to all have
>>> access to some shared filesystem which contains the tests and build
>>> artifacts, or to each have the same version checked out and to have been
>>> sent the full set of build artifacts to use. Others on this list might be
>>> able to clarify further.
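
For illustration, sharding can be pictured as a round-robin split of the
discovered test list. The Python sketch below is an assumption about the
general technique, not a copy of lit's code; `select_shard` is a made-up
name, and sorting the list first is my own choice so that every machine
sees the same order.

```python
def select_shard(tests, num_shards, run_shard):
    """Return the tests for 1-based shard run_shard out of num_shards.

    Every machine must start from the same test list for the shards to
    cover all tests exactly once.
    """
    if not 1 <= run_shard <= num_shards:
        raise ValueError("run_shard must be in [1, num_shards]")
    ordered = sorted(tests)
    # Shard k takes elements k-1, k-1+num_shards, k-1+2*num_shards, ...
    return ordered[run_shard - 1::num_shards]
```

With five tests and three shards, shard 1 gets the 1st and 4th test, shard 2
the 2nd and 5th, and shard 3 the 3rd. Nothing here moves files around, which
is why each machine still needs the tests and build artifacts locally.
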
>>>
>>> The project goal is to provide a framework for distributing these tests
>>> across multiple computers in a more flexible manner than the existing
>>> sharding mechanism. I can think of two different high-level options -
>>> either a layer on top of lit which uses the existing sharding mechanism
>>> somehow, or something built into the existing lit code that goes wide with
>>> the tests across the machines. It would be up to you to identify and
>>> implement a way forward doing this. The hope would be that this framework
>>> could be used for multiple different distributed systems, as described in
>>> the original project description on the Open Projects page.
>>>
>>> This project is intended to be a possible Google Summer of Code project.
>>> As such, to participate in it, you'd need to sign up on the GSOC website,
>>> and provide a project proposal there which details how you plan to solve
>>> the challenge. It would help your proposal get accepted if you can show
>>> some understanding of the lit testsuite, and some evidence of contributions
>>> to LLVM (perhaps in the form of additional testing you might identify that
>>> is missing in some tests, or by fixing one or more bugs from the LLVM
>>> bugzilla page, perhaps labelled with the "beginner" keyword). I am happy to
>>> work with you on your proposal if you are uncertain about anything, but the
>>> core of the proposal needs to come from you.
>>>
>>> I hope that gives you the information you are looking for. Please feel
>>> free to ask any further questions that you may have.
>>>
>>> James
>>>
>>> On Tue, 23 Feb 2021 at 17:28, Victor Kukshiev via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hello, I am Victor Kukshiev (cetjs2 on IRC), a 2nd-year student at
>>>> PetrSU university.
>>>> The distributed lit testing idea is interesting and feasible for me, I think.
>>>> Could you tell us more about this project?
>>>> What is the lit test suite?
>>>> I know the Python language.
>>>> How do I participate in this project?
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>