[PATCH] D154987: [lit] Implement PYTHON directive and config.prologue

Tue Jul 25 06:33:16 PDT 2023

sammccall added a comment.

In D154987#4530187 <https://reviews.llvm.org/D154987#4530187>, @jdenny wrote:

> In D154987#4529228 <https://reviews.llvm.org/D154987#4529228>, @sammccall wrote:
>
>> Almost all lit features have a python equivalent: CHECK vs check(),
>
> How would a `check()` work?

Sorry, this was an incomplete thought I should have revised.
I'm expecting we'll end up with some way of writing assertions directly within python, rather than just calling out to shell commands that verify themselves.
(If this isn't added deliberately, I'd expect it to turn up anyway via `raise`, `lit.run("false")`, or something similar.)
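
As a rough illustration of what "assertions directly within python" could look like, here is a minimal sketch. The helper names `check` and `run` are hypothetical: they are not part of lit or of this patch, and a real design might look quite different.

```python
import subprocess
import sys


def check(condition, message="check failed"):
    """Hypothetical assertion helper: fail the test by raising."""
    if not condition:
        raise AssertionError(message)


def run(args):
    """Hypothetical helper: run a command, raise on non-zero exit, return stdout."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# Usage: the assertion is made in python rather than by a self-verifying shell command.
output = run([sys.executable, "-c", "print('hello')"])
check("hello" in output, "expected greeting in output")
```

The point is only that a failing condition surfaces as an exception, which is also what a bare `raise` would achieve.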

>> DEFINEs vs variables, `%if` vs `if` etc.
>
> My hope was that people would gradually migrate away from lit constructs like `%if` and REDEFINE, which some people find ugly to use and complex in lit's implementation.

I think gradual migration is a recipe for getting stuck in a halfway state. I agree about the ugliness and complexity, and I wish those features had not been added either. But a mixture of both is more complicated still.

> Maybe, but most tools/languages eventually grow that issue.  An intended advantage is that people would be reading python instead of new add-ons for lit's own scripting language, which LLVM developers (understandably) keep wanting to propose.  That is, a lit with PYTHON should make tests easier to read and should make lit easier to maintain than a lit with its own evolving scripting language.

The difficulty in extending lit cleanly is a good reason for an alternative, and I think python is a good choice. (just not embedded in lit)

>> It's possible to process lit shell tests with alternate implementations (without python!), I know of at least two...
>
> I don't think I've heard about this.  Can you explain more?

Google uses a custom lit test runner to run LLVM tests as part of CI.
This has different performance characteristics and fits better into the infrastructure (each test is run as a separate sandboxed action in a distributed build/test environment).
This is possible because lit is a spec as much as an implementation (though recent feature additions have strained this).
The main limitation is that the config definition & discovery scheme is unsuitable for multiple reasons, so configuration has to be provided separately.

The current iteration of this system implements a shell test interpreter as a Go binary (gtests are run in a different way).
The previous iteration mangled tests into shell scripts through regexps.
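
To give a flavour of the regexp approach, here is a deliberately simplified sketch. It is not the downstream implementation: the function name `to_shell_script` is made up for illustration, it handles only the `%s` and `%t` substitutions (by naive text replacement), and it ignores everything else lit supports.

```python
import re

# Matches a lit-style "RUN:" line after any comment leader, e.g. "// RUN: cmd".
RUN_LINE = re.compile(r"RUN:\s*(.*)$")


def to_shell_script(test_text, source_path, tmp_path):
    """Turn the RUN: lines of a test into a shell script (simplified sketch)."""
    commands = []
    pending = ""
    for line in test_text.splitlines():
        match = RUN_LINE.search(line)
        if not match:
            continue
        text = match.group(1).rstrip()
        if text.endswith("\\"):
            # A trailing backslash continues the command on the next RUN: line.
            pending += text[:-1].rstrip() + " "
            continue
        commands.append(pending + text)
        pending = ""
    script = "\n".join(["set -e"] + commands) + "\n"
    # Only two substitutions are handled: %s (the test file) and %t (a temp path).
    return script.replace("%s", source_path).replace("%t", tmp_path)
```

Something this small works only because the RUN-line format is simple and stable, which is the sense in which lit acts as a spec.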

This is downstream and LLVM doesn't have any obligation to support any of it, but the ability to be integrated into different environments is a useful LLVM feature, and having clear interfaces for infra allows it to evolve in ways other than bolting on more features.

>> Python is a serious general-purpose programming language with tools available, but none of these tools work with python embedded in lit tests (not even syntax highlighting!).
>
> That sounds like a nice integration to work on... instead of continuing to grow lit's own scripting language.

Agreed. Realistically, none of the standard tools for editing python code will ever work if the code is prefixed with `PYTHON: ` lines and lives in `*.cpp` files, though.

>> The lit test runner supports having multiple types of tests (this is how we run gtest + shell tests).
>> Could the tests you're targeting be written as actual python files, with an appropriate library of operations? (Some of this library could of course be shared with the shell test implementation).
>
> Are you suggesting that, instead of embedding python inside C/Fortran/IR/whatever, people should embed C/Fortran/IR/whatever inside python?

I do like the trick where a single lit test file provides multiple kinds of inputs (e.g. to lit + clang + FileCheck), but it is hard to reason about, and I would not like to see it used when the test contains control flow.
I'm not sure whether it's best to have C etc inputs as string literals, or as separate files.

String literals are workable, because python has multiline raw strings (vs `PYTHON: ` prefix), and the embedded C programs tend to be self-contained (vs interacting with e.g. `lit`).
So readability & tools are a bit better than with python-in-lit, and the need for tools isn't so strong.
This is the approach we mostly used in clangd (e.g. llvm-project/clang-tools-extra/clangd/unittests/FindTargetTests.cpp).
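
A minimal sketch of the string-literal style in a python test file follows. The helper `write_input` is hypothetical, and the step that would invoke the tool under test is left as a comment because it depends on the local setup.

```python
import pathlib
import tempfile

# The C input lives in a raw multiline string: no per-line prefix is needed,
# and the backslash in "\n" reaches the C file unchanged.
SOURCE = r"""
#include <stdio.h>
int main(void) {
    printf("hello\n");
    return 0;
}
"""


def write_input(directory, name, text):
    """Write an embedded source string to a file and return its path."""
    path = pathlib.Path(directory) / name
    path.write_text(text.lstrip("\n"))
    return path


with tempfile.TemporaryDirectory() as tmp:
    c_file = write_input(tmp, "hello.c", SOURCE)
    # A real test would now pass c_file to the tool under test
    # (for example via subprocess.run) and inspect its output.
    first_line = c_file.read_text().splitlines()[0]
```

The embedded program is self-contained, so the surrounding file stays ordinary python that standard editors and linters can handle.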

Separate files for the C++ inputs are in some ways a more principled option, and tools work well with them, but they are more hassle to organize and navigate.

> Either way, libraries can be developed to minimize the complexity of python in the individual tests.  However, I think the latter would create greater inconsistency among lit tests than the former: just to add a loop to a shell command in an existing lit test, you would have to reorganize the entire test.

FWIW, this seems like a feature to me: you have to choose between terse/cryptic/self-referential lit test syntax, and powerful logic. (Hiding the logic in libraries makes it both better and worse!)
Having to rewrite the test is work, but starting with spaghetti and adding structure *without* rewriting may be worse...

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154987/new/

https://reviews.llvm.org/D154987