[PATCH] D76178: [lit] Recursively apply substitutions

Tue Mar 17 19:28:00 PDT 2020

delcypher added a comment.

@ldionne

> In D76178#1922990 <https://reviews.llvm.org/D76178#1922990>, @delcypher wrote:
> 
>> While I can see the appeal of this I need convincing that this actually needed. I see too many downsides
>>
>> - Possibility of non-termination being introduced that we'll have to check for which is a cost that will have to be paid every time we execute a test.
> 
> 
> I agree we should check for this, however it's not clear to me this is going to make a noticeable perf difference in practice. The tests we run involve executing processes and doing things that are a lot more expensive than the simple lit substitution, even if we check for  non-terminating substitutions. I think it should be possible to make it fairly quick in the usual case where full substitution happens after exactly one substitution.

Testing **only** for the "usual case" isn't sufficient to cover all situations where infinite recursion might be introduced. It looks lit actually supports regex substitutions (which I didn't even know it supported). That makes checking for cycles even harder. Maybe a low default recursion depth would be the solution here?

>> - Debugging lit substitutions becomes really difficult. Right now I can add a few print statements in the lit.site.cfg files to print out the current substitutions (or ones that interest me) to see what they expand to. With this change that doesn't work anymore because the full expansion happens much later.
> 
> I'm not sure I understand. Full expansion after this patch happens at the same point where it does before the patch. The only difference is that when you print substitutions in the `lit.site.cfg`, you must mentally substitute anything that would normally get re-substituted with this patch. However, if you don't have substitutions that expand into other substitutions, this is moot -- the behavior is the same as today.

Let me give you a concrete example to illustrate the point. You can add a  line like this to a lit config

  import pprint
  lit_config.fatal('Substitutions: {}'.format(pprint.pformat(config.substitutions)))

For ASan's x86_64 test suite for `%clang` you'll see something like this printed out.

  ('%clang ',
    '   /Volumes/data/dev/llvm/monorepo_upstream/master/builds/RelWithDebInfo/./bin/clang   -arch x86_64 -stdlib=libc++ -mmacosx-version-min=10.10 -isysroot /Volumes/data/xc/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk  ')

Doing this is a very useful way to inspect what `%clang` will expand to.

Your change will break this because now people can do something like this in the their lit config.

  config.substitutions.add( ('%clang', '%cc %arch %warnings %sysroot'))

and so when you try to inspect the substitution in the `lit.cfg`  using the technique I described above we would see.

  ('%clang', '%cc %arch %warnings %sysroot')

Printing the above out isn't anywhere near as useful as the full expanded version that we get today. One way I see around that is providing a convenience API that can be called in lit.cfg files that can expand an arbitrary string based on the current set of substitutions.

>> The way most test suites handle achieving your end goal is to just have variables that represent the different pieces of a command line and then define the substitution appropriately. E.g.
>> 
>> [...]
>> 
>> While not quite as elegant as what this patch proposes it is so much easier to debug. Are you trying to do something that is difficult to express in the `lit.site.cfg` files?
> 
> Yes, that's indeed what libc++ does. It actually ends up being quite contrived in libc++, and we have a couple substitutions that would benefit from this.

And the full power of python isn't enough to come up with some nice syntactic sugar?

> The larger picture of why I got to making this change is that I'm trying to rewrite the libc++/lit integration entirely to make it simpler and more flexible. The way I've done that so far is by defining the libc++ test format right on top of `executeShTest`, where my test format only expects a few substitutions to be provided in the config object (`%cxx`, `%compile_flags`, and a few others), and everything else is defined on top of those substitutions. For example, a `.pass.cpp` test is one with a script of two steps:
> 
> - `%cxx %compile_flags %link_flags -o %t.exe %s`
> - `%exec %t.exe`

Side note: If you're rewriting the test suite now would be a good opportunity to avoid a mistake made in the past. Lit substitutions don't handle the case where a substitution is a prefix of another (e.g. `%clang` and `%clang_asan`, the resolution is dependent on the order the substitutions are defined). You could avoid this by writing your substitutions with an end delimiter  (e.g. `%clang%` and  `%clang_asan%`).

> The benefit of approaching it this way is that it's as flexible as what we have today, however it's incredibly easy to extend by tweaking what substitutions you provide in the `lit.cfg`. For example, if you want a custom executor, just define the `%exec` substitution to run your custom script, and you don't need to write a hundred lines of Python code that calls `subprocess(...)`. Where recursively expanding substitutions come into the picture is that I define `%build` on top of `%cxx` (and others), and I don't have access to the actual thing that `%cxx` will be substituted for at the point where I define `%build`. In other words, I'm parameterizing substitutions on top of a set of base substitutions. I've found this to be very powerful and flexible so far.

Why do you need to create your own test format to achieve your goals? Everything you've described so far is achievable with lit and a suitable `lit.cfg` file today.

One use case I could imagine is the case where you want to run similar variants of the same test by expanding a substitution differently (e.g. same test script except do one at `-O0` and one at `-O3`). This is something lit is not good at and I imagine your change could be a useful stepping stone to fix this.

> @delcypher 
>  Is there an interest for moving forward with this change if we check for non-termination without incurring unreasonable cost, or are you not convinced by my use case?

I'm not entirely convinced by your use case but I do see the appeal of the feature. What I want is to make sure this feature is not added in a form where

- It's easy to shoot yourself in the foot (with infinite recursion).
- It's very difficult to debug when things go wrong.
- The feature isn't documented.

If you can fix those problems then I'm not opposed to this. My suggestions would be.

1. Have a recursion limit for substitution expansion. This could be a property of the test suite and could have a low default. If we hit the recursion limit and a further expansion would return a different string ("one more expansion check") then we should fail the test. You might need a new test failure type here. If we made the default recursion limit `0` then we could special case this by not doing the "one more expansion check". This would actually mean that only test suites that opt into recursive lit substitution expansion have to pay the cost to perform checks.
2. Introduce an api usable from `lit.cfg` files that can expand substitutions recursively. Include a test that actually uses it.
3. If time permits some documentation would be useful. Failing that proper tests that exercise everything you're adding will have to do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76178/new/

https://reviews.llvm.org/D76178