[PATCH] D122569: [lit] Support %if ... %else syntax for RUN lines

Joel E. Denny via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 12 10:47:46 PDT 2022


jdenny added a comment.

@asavonic: Thanks for continuing to work on this feature!

In D122569#3442909 <https://reviews.llvm.org/D122569#3442909>, @asavonic wrote:

> In D122569#3441717 <https://reviews.llvm.org/D122569#3441717>, @lebedev.ri wrote:
>
>> Please don't forget to also update the `llvm/utils/update_*_checks.py`.
>
> It seems that update_*_checks.py scripts are completely independent of lit.py, so there is no way to easily reuse code between them. Parsing of RUN lines and standard substitutions is duplicated there. We have to either integrate these scripts, or copy paste the code.

Has anyone said they need `%if` in those scripts?  Upon seeing `%if` in a `RUN` line, those scripts could probably be adjusted to simply complain that `%if` is not yet supported there.



================
Comment at: llvm/utils/lit/lit/TestRunner.py:1213
+    # substitutions.
     def escape(ln):
+        return _caching_re_compile('%%').sub('#_MARKER0_#', ln)
----------------
Can we make the name more specific now that there's also an `escapeBraces`?  Maybe `escapePercents`?  That might be better even if `escapeBraces` goes away while addressing other issues.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1274
+            result = ''
+            # expr is either or a plain string or an '%if..%else'
+            # expression (a list); evaluate and concatenate all items
----------------



================
Comment at: llvm/utils/lit/lit/TestRunner.py:1306
+
+        while len(ln):
+            # %if starts a new scope ('if' branch). Push '%if cond' to
----------------
Instead of a loop with a custom stack data structure, a recursive descent parser might be easier to understand and extend, especially if we add more constructs similar to `%if` later.  That is, the python call stack would be used as the parser stack, and stack entries would be simple local variables, function parameters, and function returns.

This is just a suggestion.  I'm not insisting.  However, I do invite other reviewers to offer their opinion.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1346
+        if len(stack) != 1:
+            raise ValueError("unballanced '%if' expession")
+        ln = evalExprList(stack.pop())
----------------



================
Comment at: llvm/utils/lit/lit/TestRunner.py:1363
             # seems reasonable.
             ln = _caching_re_compile(a).sub(str(b), escape(ln))
 
----------------
Why does `%%` need to be escaped again here?


================
Comment at: llvm/utils/lit/tests/Inputs/shtest-if-else/test.txt:75
+#
+# RUN: %if feature {echo %{%}}
+# CHECK-NEXT: {{^.*'RUN}}: at line [[#@LINE-1]]'; echo {}
----------------
If we're going to have these `%{` and `%}` escapes, they need to be documented somewhere.  However, I think they're going to be confusing:

- First, `%{foobar}` is already a common syntax for lit substitutions.  The if-then-else syntax's `%{` and `}` collide with it.  For example, how should I write `%{pathsep}` in a then or else block?  I have to write it as `%%{pathsep%}`, right?
- More generally, whether `{` and `}` need to be escaped and whether `%{` and `%}` are defined depend on the context.  That seems error-prone.

What if we go the other way?  That is, we could change the `{` and `}` around a then or else block to `%{` and `%}`.  That way, the entire if-then-else syntax is clearly at the preprocessor/substitution level: `%if`, `%else`, `%{`, and `%}`.  On the other hand, `{` and `}` behave as usual in the shell code, and the only character in shell code that ever needs to be escaped continues to be `%`.

But does that syntax have collisions?

- I just tried `git grep 'RUN.*%}'` and found no matches in llvm's repo.  It seems `%}` is unambiguous.  If a use arises, it can be escaped as `%%}`.  Again, only `%` ever has to be escaped.
- The `%{` is really just for symmetry with `%}`.  The `%{` doesn't actually need to be recognized in arbitrary shell code anywhere.  That is, the if-then-else parser would only need to recognize it after an `%if cond` or `%else`.  There doesn't appear to be a way for it to collide with, for example, `%{pathsep}`.

Did I miss anything?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122569/new/

https://reviews.llvm.org/D122569



More information about the llvm-commits mailing list