[PATCH] D133172: [lit] Extend DEFINE/REDEFINE with function substitutions

Mon Sep 26 00:01:32 PDT 2022

jhenderson added a comment.

In D133172#3811800 <https://reviews.llvm.org/D133172#3811800>, @jdenny wrote:

> OK, I see the idea now.  However, that adds a third character that cannot be used in arguments.

This is true. However, it seems like any solution has a restriction along these lines, unless you provide some form of escaping. In the current solution, you've effectively done this escaping via the use of `%)`. I don't know whether this is a good idea, but in my suggestion, you could potentially use `%` to escape the special characters to actually appear as literals in the input arguments. However, I'm also inclined to say we don't need to try to solve this problem up-front necessarily.

> To help `(` and `%)` seem to bind better, we could change them to `%(` and `%)`.  To rationalize how the argument list then binds to `%{name}`, compare `name(args)` in a C-like programming language to `%{name}%(args%)`.  That is, `name` -> `%{name}`, `(` -> `%(`, and `)` -> `%)`.  Notice that `%` is used to introduce every special sequence.
>
> I considered `%(` and `%)` early on, but it seemed more cluttered than technically necessary, so I decided to think of `%{name}(` as a single token instead.  I don't care that much though, so I'm fine to switch to `%(` if people like it better.

I'm not sure that would help the situation. The full syntax would look like this: `%{name}%(arg1, arg2%)`. By squinting, I can get the `%(` and `%)` to marry up, but I don't see the name half pairing up with the arg part intuitively. I don't know if this is necessarily a good idea overall, but I'd almost find it easier if the syntax were `%{name}(arg1, arg2)%` with the two `%` characters providing a "paired grouping" much like parentheses already do in typical computer programming. I think my problem is that the `%` characters in the current syntax aren't symmetrical about the midpoint of the syntax.

> It sound like you're trying to handle the call-in-args part of this, which requires a parser stack or `recursiveExpansionLimit`, as discussed above.  In that regard, I see no difference between the syntax proposed in the current patch and the `%{name(arg1, arg2)}` syntax.

Actually, I wasn't looking for nested functions: I think they are likely to be far less useful than function arguments, which I think are already likely to be relatively rare. I also think there's an argument for not adding too much complexity to the lit command language, because if you make the test script too complex, then it becomes impossible to figure out what the test is testing. I don't think regular function-like subsitutions fall into this situation, but if we start doing function calls in arg lists, then I think it might start becoming rather unintuitive.

>> It may not be less complex in terms of overall code, but it seems like it would require a much simpler regex, therefore making the code perhaps easier to understand?
>
> I guess that's in the eye of the beholder.  Where possible, I tend to prefer formal definitions of syntax (like regexes) over ad-hoc implementations (like recursive-descent parsing), and I'm looking forward to a more powerful regex library in python.  However, I'm open to reimplementing this as a recursive-descent parser now if that's more acceptable to everyone.  Or maybe I should spend more time researching that new regex library to see if it's usable now somehow.

I'm just one opinion, but I believe that one of the problems of regexes is that they require a decent knowledge of that particular language's regex language, since every variant is somewhat different, especially when getting into the more complicated functionality. Contrast that to a (hopefully simple) parser, if someone is reading this code, it's likely they already know python, so it should be fairly simple to follow.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133172/new/

https://reviews.llvm.org/D133172