[PATCH] D122569: [lit] Support %if ... %else syntax for RUN lines

Andrew Savonichev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 18 07:13:03 PDT 2022


asavonic added a comment.

In D122569#3446867 <https://reviews.llvm.org/D122569#3446867>, @arichardson wrote:

> In D122569#3442909 <https://reviews.llvm.org/D122569#3442909>, @asavonic wrote:
>
>> In D122569#3441717 <https://reviews.llvm.org/D122569#3441717>, @lebedev.ri wrote:
>>
>>> Please don't forget to also update the `llvm/utils/update_*_checks.py`.
>>
>> It seems that update_*_checks.py scripts are completely independent of lit.py, so there is no way to easily reuse code between them. Parsing of RUN lines and standard substitutions is duplicated there. We have to either integrate these scripts, or copy paste the code.
>
> I had some code in the downstream CHERI LLVM to expand substitutions using lit. The downside is that the update scripts need to know the path to the build dir to get all the lit config files, but if there is interest I'd be happy to upstream that change.

@arichardson, I think it makes sense as an optional feature. Right now there is no way to handle substitutions defined in lit.cfg or share the code with lit.py.



================
Comment at: llvm/utils/lit/lit/TestRunner.py:1213
+    # substitutions.
     def escape(ln):
+        return _caching_re_compile('%%').sub('#_MARKER0_#', ln)
----------------
jdenny wrote:
> Can we make the name more specific now that there's also an `escapeBraces`?  Maybe `escapePercents`?  That might be better even if `escapeBraces` goes away while addressing other issues.
Done. `escape` is now renamed to `escapePercents`.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1306
+
+        while len(ln):
+            # %if starts a new scope ('if' branch). Push '%if cond' to
----------------
jdenny wrote:
> Instead of a loop with a custom stack data structure, a recursive descent parser might be easier to understand and extend, especially if we add more constructs similar to `%if` later.  That is, the python call stack would be used as the parser stack, and stack entries would be simple local variables, function parameters, and function returns.
> 
> This is just a suggestion.  I'm not insisting.  However, I do invite other reviewers to offer their opinion.
Done. It does look better as a recursive function.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1363
             # seems reasonable.
             ln = _caching_re_compile(a).sub(str(b), escape(ln))
 
----------------
jdenny wrote:
> Why does `%%` need to be escaped again here?
This is because of recursive substitution:
```
substitutions = [
  ("%rec1, "%%s"),
  ("%s", "/path")
]
```
If we expand `%rec1` to `%%s` in the first iteration of the loop, we have to escape `%%` before the next iteration. Otherwise instead of of `%s` we'll get `%/path`. This is tested in `shtest-recursive-substitution/escaping` test.


================
Comment at: llvm/utils/lit/tests/Inputs/shtest-if-else/test.txt:75
+#
+# RUN: %if feature {echo %{%}}
+# CHECK-NEXT: {{^.*'RUN}}: at line [[#@LINE-1]]'; echo {}
----------------
jdenny wrote:
> If we're going to have these `%{` and `%}` escapes, they need to be documented somewhere.  However, I think they're going to be confusing:
> 
> - First, `%{foobar}` is already a common syntax for lit substitutions.  The if-then-else syntax's `%{` and `}` collide with it.  For example, how should I write `%{pathsep}` in a then or else block?  I have to write it as `%%{pathsep%}`, right?
> - More generally, whether `{` and `}` need to be escaped and whether `%{` and `%}` are defined depend on the context.  That seems error-prone.
> 
> What if we go the other way?  That is, we could change the `{` and `}` around a then or else block to `%{` and `%}`.  That way, the entire if-then-else syntax is clearly at the preprocessor/substitution level: `%if`, `%else`, `%{`, and `%}`.  On the other hand, `{` and `}` behave as usual in the shell code, and the only character in shell code that ever needs to be escaped continues to be `%`.
> 
> But does that syntax have collisions?
> 
> - I just tried `git grep 'RUN.*%}'` and found no matches in llvm's repo.  It seems `%}` is unambiguous.  If a use arises, it can be escaped as `%%}`.  Again, only `%` ever has to be escaped.
> - The `%{` is really just for symmetry with `%}`.  The `%{` doesn't actually need to be recognized in arbitrary shell code anywhere.  That is, the if-then-else parser would only need to recognize it after an `%if cond` or `%else`.  There doesn't appear to be a way for it to collide with, for example, `%{pathsep}`.
> 
> Did I miss anything?
Done. The patch now uses `%{` and `%}` as begin/end tokens. It also helps to distinguish them from boolean expressions with a regex-match (`{{regex}}`).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122569/new/

https://reviews.llvm.org/D122569



More information about the llvm-commits mailing list