[PATCH] D133172: [lit] Extend DEFINE/REDEFINE with function substitutions

Fri Sep 23 01:40:09 PDT 2022

jhenderson added a comment.

In D133172#3809343 <https://reviews.llvm.org/D133172#3809343>, @jdenny wrote:

> In D133172#3807800 <https://reviews.llvm.org/D133172#3807800>, @jhenderson wrote:
>
>> 
>
> Thanks for taking a look.
>
>> I'll be honest, I don't like the syntax of this,
>
> I get it.  It's not pretty.  However, it was the simplest syntax I came up with to address the use cases I had in mind.  I figured there would be a syntax debate, and I am open to suggestions.  (For reference, @MaskRay and I also discussed the syntax some starting at https://discourse.llvm.org/t/lit-run-a-run-line-multiple-times-with-different-replacements/64932/14.)
>
>> so much so that I think I'd prefer people use the existing method of providing function arguments for these substitutions (i.e. DEFINE then REDEFINE them as needed) versus this.
>
> For some uses cases, the existing DEFINE/REDEFINE functionality is sufficient and preferable.  However, as the new `TestingGuide.rst` entry in this patch discusses, that syntax is not sufficient for some other uses cases.

I see it. I didn't register the need on my first skim through this, but I think I get it on second look.

>> The complex regex required in the lit scripts to make this format work also doesn't help this solution's case.
>
> I'm not sure what you mean by "the lit scripts".  Are you referring to `TestRunner.py`?  Or are you thinking it will show up again somewhere else?

No, TestRunner.py was what I was referring to.

>> Is the following format at all viable?
>>
>>   %{func(arg1, arg2)}
>
> I tried something similar originally:
>
>   %{func}(arg1, arg2)
>
> However, I quickly ran into many cases where I needed `,` and `)` in my actual arguments.  For example, an argument might be a comma-delimited list of FileCheck/-verify prefixes.  An argument might contain expected output strings that contained parenthesized elements, such as `(null)`.  The above suggestion has the same problems.
>
> Ultimately, I settled on `%,` and `%)` as delimiters.  That way, there’s no limit on what can go in an actual argument.  Moreover, they stick with `%` for introducing new special sequences. The `%if` syntax ultimately chose to stick with `%` as well.

The %if syntax is slightly different, because there's no way of "grouping" the whole thing into one "name", I think? (I haven't looked back at the syntax for it, so might be misremembering it). For the DEFINE/REDEFINE case, a format with `%{foo(arg1, arg2)}` has the closing `}` as a convenient terminator for the "pattern", so anything (e.g. `,` and `)`) in between could be considered special (just don't try to use a `}` in the arg list!).

One of my main concerns is that the argument portion of the proposed syntax doesn't necessarily look like it really is a set of arguments to the untrained eye. The `%)` maybe just about binds itself to the `(`, but none of it really binds itself to the `%{name}` part, so it just doesn't feel like the two halves of the whole expression have anything to do with each other.

>> This would require lit to treat these kinds of substitutions specially,
>
> I'm not sure I know what you mean.
>
> Do you mean that, where we currently use python's `re.sub` and a regex to apply substitutions, we would, for the case of function substitutions, extend python with a hand-written parser for function argument lists?  Given the previous negative reaction to making lit more complex for `%if`, I figured that expressing the argument list syntax in a regex that can be used within lit's existing implementation for applying substitutions would be more acceptable.  I'm not sure that regex is actually more complicated than a hand-written parser would be.  I'm pretty sure the regex is at least more concise.  In any case, I'm open to rewriting if that's the direction the majority want.  Eventually, we'll reinvent m4.

I think I might have misremembered how these substitutions are handled in practice, but in essence, I imagined doing (effectively) `re.search` to find a substitution in the input string, and swapping it in, repeating the process for each substitution. As the `%{name(arg1, arg2)}` syntax wouldn't work for that, you'd have to do something different or extra for these cases. Perhaps it would be `re.search` for the `%{name` part and then find the next `}` manually to form the "matching" block. It may not be less complex in terms of overall code, but it seems like it would require a much simpler regex, therefore making the code perhaps easier to understand?

> Also, I don't think the possibility of implementing with a hand-written parser vs. a regex is specific to your proposed syntax.  Either is also possible for the syntax I proposed in this review.
>
> But maybe I misunderstood your point.
>
>> but the format including `(` and `)` in the name means this would at least be fairly easy to detect.
>
> I don't know what you mean.  Can you explain more?

I was thinking that you'd be able to distinguish the arg-based substitutions from regular ones, by identifying the presence of `(` in the pattern, and then could decide to do a different thing for that substitution.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133172/new/

https://reviews.llvm.org/D133172