[PATCH] D150856: [lit] Add %{for-each-file} substitution

Thu Jun 22 10:47:08 PDT 2023

Endill added a comment.

In D150856#4436879 <https://reviews.llvm.org/D150856#4436879>, @jdenny wrote:

> What about the following instead?
>
>   // RUN: split-file %s %t
>   // PYTHON: for file in os.listdir(lit.substs['%t']):
>   // PYTHON:   lit.run('%clang_cc1 -std=c++98 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>   // PYTHON:   lit.run('%clang_cc1 -std=c++11 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>   // PYTHON:   lit.run('%clang_cc1 -std=c++14 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>   // PYTHON:   lit.run('%clang_cc1 -std=c++17 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>   // PYTHON:   lit.run('%clang_cc1 -std=c++20 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>   // PYTHON:   lit.run('%clang_cc1 -std=c++23 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
>
> In other words, instead of incrementally designing, debating, and maintaining more and more lit features to try to address every use case in the most ideal manner, eventually ending up with a full blown scripting language in lit, why don't we invest that effort in connecting to an existing scripting language? Python seems like the obvious choice.  This would open so many more doors than one more lit control structure.

That's an excellent suggestion! Is any aspect of it a real lit API?

In D150856#4436942 <https://reviews.llvm.org/D150856#4436942>, @jhenderson wrote:

> I've been watching the discussion a bit over the last few days, and something along these lines is what I'm inclined to think would make sense. As this is a somewhat specialised case, I don't think having a substitution that could cause potential confusion (it expands the whole line and repeats it, rather than replacing a thing in situ only) is the way to go, but we can achieve the same thing via python. You don't even need @jdenny's addition to lit either: `%python` is an existing substitution to the python executable, allowing you to just execute any old script, which could either be a separate file, something inline in the RUN command, or a file "appended" to the test file and split off via `split-file`. Untested example:
>
>   // RUN: split-file %s %t
>   // RUN: %python doTest.py %t %clang_cc1
>   
>   //--- doTest.py
>   import os
>   import subprocess
>   import sys
>   
>   for f in os.listdir(sys.argv[1]):
>     if f.endswith('.py'):
>       continue
>     for std in ['c++98', 'c++11', 'c++14', 'c++17', 'c++20', 'c++23']:
>       # shell=True because the command is passed as a single string
>       subprocess.run(sys.argv[2] + ' -std=' + std + ' -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + os.path.join(sys.argv[1], f), shell=True)
>   
>   // ... all the files for expansion by split-file ...

I did some real testing. Here's what it looks like: https://gist.github.com/Endilll/e2e4ace431489aa019682b96c331d928
Note that it requires D151320 <https://reviews.llvm.org/D151320> to work.

> If you made the python script a separate file listed alongside the tests, you could relatively easily adapt it to work with all the different test cases you need. The basic rule would be that a thing that would be substituted by lit instead becomes an input argument into the python script. If you want additional diagnostics about which case failed, you just need to print some markers in the script at the appropriate points (e.g. at the start of the loop, indicating the file being used etc).

We would also have to handle return codes from subprocesses properly, and forward stderr to stdout, otherwise we could miss some output. All in all, I don't think tests should duplicate test-runner functionality that lit already provides and that has proven to work for us.
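
To illustrate the bookkeeping I mean, here is a minimal sketch of what each such script would have to carry around. It is not part of lit or of the gist above; the `run_and_report` helper and its argument format are hypothetical names chosen for this example.

```python
import subprocess
import sys


def run_and_report(commands):
    """Run each command, merge its stderr into stdout, and collect failures.

    Hypothetical helper: `commands` is a list of argument lists.
    Returns a list of (argv, returncode) pairs for commands that failed.
    """
    failures = []
    for argv in commands:
        # Print a marker so the log shows which invocation produced what.
        print("RUN:", " ".join(argv), flush=True)
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # forward stderr to stdout
            text=True,
        )
        sys.stdout.write(proc.stdout)
        if proc.returncode != 0:
            failures.append((argv, proc.returncode))
    return failures
```

The script would then need to exit with a non-zero status whenever the returned list is non-empty, since that is how lit learns that the RUN line failed.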

> A minor potential benefit of this beyond @jdenny's suggestion is that you could execute the python script directly without using lit to aid with debugging, once you have a test that failed. This would avoid things like needing to re-split the file etc/you could delete all the uninteresting inputs to focus on one specific one etc.

I understand where you're coming from, but it's hard to beat the debugging experience Compiler Explorer provides for DR testing. A compiler invocation is sufficient in common cases, and when crazy stuff <https://godbolt.org/z/M9ocnha48> happens, something this simple is not expected to be too helpful.

Thank you for pointing out how well lit and split-file can play together! I'm in favor of both Joel's PYTHON directive, and a form along the following lines:

  // RUN: rm -rf %t
  // RUN: split-file --leading-lines %s %t
  // RUN: %lit-generate-directives %t/run_dr6xx.py

  //--- run_dr6xx.py
  for file in os.listdir(lit.substs['%t']):
    lit.run('%clang_cc1 -std=c++98 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
    lit.run('%clang_cc1 -std=c++11 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
    lit.run('%clang_cc1 -std=c++14 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
    lit.run('%clang_cc1 -std=c++17 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
    lit.run('%clang_cc1 -std=c++20 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)
    lit.run('%clang_cc1 -std=c++23 -verify -fexceptions -fcxx-exceptions -pedantic-errors ' + file)

  //--- dr600.cpp
  ...

`%lit-generate-directives` is useful for "ordinary" DR tests that share the same compiler flags. Eliminating differences in flags is on my list for DR tests. But there are also, at the very least, CodeGen tests, which definitely can't use the ordinary set of flags. An inline PYTHON directive could make more sense for them.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150856/new/

https://reviews.llvm.org/D150856