[PATCH] D133172: [lit] Extend DEFINE/REDEFINE with function substitutions

Fri Sep 23 11:24:06 PDT 2022

jdenny added a comment.

In D133172#3811800 <https://reviews.llvm.org/D133172#3811800>, @jdenny wrote:

> I guess that's in the eye of the beholder.  Where possible, I tend to prefer formal definitions of syntax (like regexes) over ad-hoc implementations (like recursive-descent parsing), and I'm looking forward to a more powerful regex library in python.  However, I'm open to reimplementing this as a recursive-descent parser now if that's more acceptable to everyone.  Or maybe I should spend more time researching that new regex library to see if it's usable now somehow.

Arguing the other side, recursive-descent parsing can help to improve the diagnostics.  That's relevant here: where a function substitution is used with unbalanced parentheses or the wrong number of arguments, the current patch just doesn't perform the substitution.  The lit user experience would be better if lit instead terminated with a diagnostic.

In D133172#3811800 <https://reviews.llvm.org/D133172#3811800>, @jdenny wrote:

> The general substitution algorithm now is to iterate the list of substitutions and apply one at a time to the current RUN line.  When arriving at a function substitution `%{name}` in the list, if we were to use recursive-descent parsing, I imagine it would then search for `%{name}(` in the RUN line.  Upon finding it, it would then look for the closing `%)` while recursively balancing anything like `%{looks-like-another-func}(args%)` in the argument list.  Regardless of whether `%{looks-like-another-func}(` is actually a function substitution, the matching `%)` binds to it for the sake of processing the arguments of `%{name}`.  With your proposed syntax, I think the behavior would be analogous: regardless of whether `%{looks-like-another-func(` is actually a function substitution, the matching `)}` binds to it.

I just thought of a potentially problematic case, which is perhaps what you were after:

  %{real-func}( %{not-a-func}(expr) %)

`%{not-a-func}` might be a substitution but not a function substitution, and there happens to be a following parenthesized expression that would remain after expansion.   `%{not-a-func}(` can prevent `%{real-func}(...%)` from expanding because the one `%)` matches `%{not-a-func}(`.

That's not a big deal with the current patch because you just increase `recursiveExpansionLimit`, and `%{real-func)(...%)` will be sure to expand after `%{not-a-func}`.  In general, the current patch can require increasing `recursiveExpansionLimit` when something that looks like a call appears in an argument.

However, if we move to recursive-descent parsing and diagnose unbalanced parentheses, then this case can fail on the first attempt at expanding `%{real-func}(...%)`.  I noticed your proposed syntax suffers from the same problem, but I think it's only for cases that are probably less apt to occur in practice:

  %{real-func( %{not-a-func(foo)-bar} )}`

With either syntax, one way to solve this problem is to, upon each potential function use like `%{not-a-func}(`, do a substitution lookup to see if it's a function substitution.

A simpler way is to move to the other syntax I mentioned in my last comment:

  %{real-func}%( %{not-a-func}(expr) %)

That is, `%(...%)` would always indicate a function substitution use.  If you don't want it to, escape it using `%%`.  Lit could even diagnose leftover `%(...%)` that never matched a function substitution.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133172/new/

https://reviews.llvm.org/D133172