<div dir="ltr"><div>Hi Michael,</div><div><br></div><div></div>You propose using the preprocessor for mixing C++ test code with its input code, such as LLVM IR.  Of course, LIT, FileCheck, `clang -verify`, and unit tests all enable mixing their various forms of test code with input code.  To my eyes, the difference between unit tests on the one hand and LIT, FileCheck, and `clang -verify` tests on the other is that the latter tend to make the input code more prominent (I prefer that) and make it easier to see which test code is associated with which input code.  So far, I think the preprocessor approach is better in this regard than unit tests, but what about a more familiar syntax, like the following?<br><div><br></div><div>```<br>// RUN-CXX: #include "compiledtestboilerplate.h"<br>// RUN-CXX: unique_ptr<Module> Output = run_opt("%s", "-passes=loop-vectorize");<br><br>// RUN-CXX: /* Check func1 Output */<br>define void @func1() {<br>entry:<br>  ret void<br>}<br><br>// RUN-CXX: /* Check func2 Output */<br>define void @func2() {<br>entry:<br>  ret void<br>}<br>```</div><div></div><div><div><br></div><div>It seems it should be feasible to automate extraction of the `RUN-CXX:` code for compilation and for analysis by clang-format and clang-tidy.  Perhaps a new script would extract the code at build time for all such tests.  But there are other possibilities that could be considered: a LIT extension for `RUN-CXX:`, C++ JIT compilation, a clang-format-diff.py extension that greps modified files for `RUN-CXX:`, etc.</div></div><div><br></div><div>LIT, FileCheck, and `RUN-CXX:` directives should then be able to co-exist in a single test file.  Thus, you might incrementally add `RUN-CXX:` directives to test files that already contain LIT and FileCheck directives to handle cases where FileCheck directives are difficult to use.  You could keep the FileCheck directives where they are more reasonable, or you could eventually replace them.  
You might run `opt` once with a `RUN:` directive and then check its output `.ll` file with both `RUN-CXX:` and FileCheck directives (maybe all `RUN:` directives execute before all `RUN-CXX:` directives, or maybe C++ JIT compilation would permit them to execute in the order specified).<br></div><div><br></div><div>It's not clear to me whether the above idea is worth the trouble, but I think I'd at least prefer the syntax to the preprocessor approach.</div><div><br></div><div>I also have a vague feeling that something like this has been discussed before.  If so, please just point me to the discussion.</div><div><br></div><div>In any case, thanks for working to improve LLVM's testing infrastructure.<br></div><div><br></div><div>Joel<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 23, 2020 at 9:33 PM Michael Kruse via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello LLVM community,<br>

<br>

For testing IR passes, LLVM currently has two kinds of tests:<br>

 1. regression tests (in llvm/test); .ll files invoking opt, and<br>

matching its text output using FileCheck.<br>

 2. unittests (in llvm/unittests); Google tests containing the IR as a<br>

string, constructing a pass pipeline, and inspecting the output using<br>

code.<br>

<br>

I propose to add an additional kind of test, which I call "compiled<br>

regression test", combining the advantages of the two. A test is a<br>

single .cxx file of the general structure below that can be dumped<br>

into the llvm/test directory. I am not proposing to replace FileCheck,<br>

but in a lot of cases, domain-specific verifiers can be more powerful<br>

(e.g. verify-uselistorder or `clang -verify`).<br>

<br>

    #ifdef IR<br>

      define void @func() {<br>

      entry:<br>

        ret void<br>

      }<br>

    #else /* IR */<br>

      #include "compiledtestboilerplate.h"<br>

      TEST(TestSuiteName, TestName) {<br>

        unique_ptr<Module> Output = run_opt(__FILE__, "IR",<br>

                                            "-passes=loop-vectorize");<br>

        /* Check Output */<br>

      }<br>

    #endif /* IR */<br>

<br>

That is, input IR and check code are in the same file. The run_opt<br>

command is a replica of main() from the opt tool, so any command line<br>

arguments (passes with legacy or new passmanager, cl::opt options,<br>

etc.) can be passed. It also makes converting existing tests simpler.<br>

<br>

The top-level structure is C++ (i.e. the LLVM-IR is removed by the<br>

preprocessor) and compiled with cmake. This allows a<br>

compile_commands.json to be created such that refactoring tools,<br>

clang-tidy, and clang-format can easily be applied on the code. The<br>

second argument to run_opt is the preprocessor directive for the IR<br>

such that multiple IR modules can be embedded into the file.<br>
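
<br>

A minimal sketch of how a helper behind run_opt might locate the IR for a given directive name, by scanning the file text for the matching #ifdef block. The name extractSection and its behavior are assumptions for illustration, not part of the patch; it only recognizes a plain "#ifdef NAME" opener and stops at the matching #else or #endif.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): returns the lines between
// "#ifdef <Macro>" and the matching "#else" or "#endif".
std::string extractSection(const std::string &Source,
                           const std::string &Macro) {
  std::istringstream In(Source);
  std::string Line, Result;
  bool InSection = false;
  int Depth = 0; // nesting depth of conditionals inside the section

  while (std::getline(In, Line)) {
    // Strip surrounding whitespace so indented directives are recognized.
    std::size_t First = Line.find_first_not_of(" \t\r");
    std::size_t Last = Line.find_last_not_of(" \t\r");
    std::string Trimmed = First == std::string::npos
                              ? std::string()
                              : Line.substr(First, Last - First + 1);

    if (!InSection) {
      if (Trimmed == "#ifdef " + Macro)
        InSection = true;
      continue;
    }

    if (Trimmed.rfind("#if", 0) == 0) { // #if, #ifdef, #ifndef
      ++Depth;
    } else if (Trimmed.rfind("#endif", 0) == 0) {
      if (Depth == 0)
        break; // end of the requested section
      --Depth;
    } else if (Trimmed.rfind("#else", 0) == 0 && Depth == 0) {
      break; // the C++ half of the file starts here
    }

    Result += Line;
    Result += '\n';
  }
  return Result;
}
```

Under these assumptions, extractSection(FileText, "IR") would yield the text of the function definition in the example above, which could then be handed to the IR parser; a second module would simply use a different directive name.<br>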

<br>

Such tests can be compiled in two modes: Either within the LLVM<br>

project, or as an external subproject using llvm_ExternalProject_Add.<br>

The former has the disadvantage that new .cxx files dumped into the<br>

test folder are not recognized until the next cmake run, unless the<br>

CONFIGURE_DEPENDS option is used. I found this adds seconds to each<br>

invocation of ninja which I considered a dealbreaker. The external<br>

project searches for tests every time, but is only invoked in a<br>

check-llvm run, no different than llvm-lit. It uses CMake's<br>

find_package to build against the main project's results (which<br>

currently we do not have tests for) and could also be compiled in<br>

debug mode while LLVM itself is compiled in release mode.<br>

<br>

The checks themselves can be any of gtest's ASSERT/EXPECT macros, but<br>

for common test idioms I suggest adding custom macros, such as<br>

<br>

    ASSERT_ALL_OF(InstList, !isa<VectorType>(I->getType()));<br>

<br>

which on failure prints the offending instruction (here, the one that does return a vector).<br>

Try that with FileCheck. PatternMatch.h from InstCombine can be used as<br>

well. Structural comparison with a reference output could also be<br>

possible (like clang-diff,<br>

[llvm-canon](<a href="http://llvm.org/devmtg/2019-10/talk-abstracts.html#tech12" rel="noreferrer" target="_blank">http://llvm.org/devmtg/2019-10/talk-abstracts.html#tech12</a>),<br>

<a href="https://reviews.llvm.org/D80916" rel="noreferrer" target="_blank">https://reviews.llvm.org/D80916</a>).<br>
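
<br>

To illustrate the idea behind such a macro without depending on LLVM or gtest, here is a self-contained sketch. FakeInst, firstViolation, and CHECK_ALL_OF are hypothetical stand-ins, not the macros from the patch; the point is only that the check can name the element that violated the condition.<br>

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Stand-in for llvm::Instruction, just enough for this sketch.
struct FakeInst {
  std::string Text; // printed form of the instruction
  bool IsVector;    // whether its result type is a vector
};

// Returns the first element for which P is false, or nullptr if all pass.
template <typename Range, typename Pred>
const FakeInst *firstViolation(const Range &R, Pred P) {
  for (const FakeInst &I : R)
    if (!P(&I))
      return &I;
  return nullptr;
}

// Hypothetical macro in the spirit of ASSERT_ALL_OF. Cond may refer to the
// current element as `I`. Evaluates to true if every element satisfies Cond;
// otherwise prints the first offending element and evaluates to false.
#define CHECK_ALL_OF(Range, Cond)                                          \
  [&]() -> bool {                                                          \
    const FakeInst *Bad =                                                  \
        firstViolation(Range, [&](const FakeInst *I) { return (Cond); });  \
    if (Bad)                                                               \
      std::cerr << "'" #Cond "' failed for: " << Bad->Text << '\n';        \
    return Bad == nullptr;                                                 \
  }()
```

With a list containing a vector add, CHECK_ALL_OF(List, !I->IsVector) would evaluate to false and print that instruction; a gtest-based version could forward the same message through EXPECT_TRUE instead of writing to std::cerr.<br>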

<br>

Some additional tooling could be helpful:<br>

<br>

 * A test file creator, taking some IR, wrapping it into the above<br>

structure, and writing it into the test directory.<br>

 * A tool for extracting and updating (running opt) the IR inside the<br>

#ifdef, or even adding this functionality to opt itself. This is the<br>

main reason not to just put the IR inside a string.<br>

<br>

A Work-in-Progress differential and what it improves over FileCheck<br>

and unittests is available here: <a href="https://reviews.llvm.org/D82426" rel="noreferrer" target="_blank">https://reviews.llvm.org/D82426</a><br>

<br>

Any kind of feedback is welcome.<br>

</blockquote></div>