[llvm-dev] [RFC] Compiled regression tests.

Michael Kruse via llvm-dev llvm-dev at lists.llvm.org
Sun Jun 28 18:20:44 PDT 2020


On Sun, Jun 28, 2020 at 13:42, Joel E. Denny
<jdenny.ornl at gmail.com> wrote:
> You propose using the preprocessor for mixing C++ test code with its input code, such as LLVM IR.  Of course, LIT, FileCheck, `clang -verify`, and unit tests all enable mixing their various forms of test code with input code.  To my eyes, a difference between unit tests vs. LIT, FileCheck, and `clang -verify` tests is that the latter tend to make the input code more prominent (I prefer that) and make it easier to clarify which test code is associated with which input code.  So far, I think the preprocessor approach is better in this regard than unit tests, but what about a more familiar syntax, like the following?
>
> ```
> // RUN-CXX: #include "compiledtestboilerplate.h"
> // RUN-CXX: unique_ptr<Module> Output = run_opt("%s", "-passes=loop-vectorize");
>
> // RUN-CXX: /* Check func1 Output */
> define void @func1() {
> entry:
>   ret void
> }
>
> // RUN-CXX: /* Check func2 Output */
> define void @func2() {
> entry:
>   ret void
> }
> ```
>
> It seems it should be feasible to automate extraction of the `RUN-CXX:` code for compilation and for analysis by clang-format and clang-tidy.  Perhaps there would be a new script that extracts at build time for all such uses.  But there are other possibilities that could be considered: a LIT extension for `RUN-CXX:`, C++ JIT compilation, a clang-format-diff.py extension that greps modified files for `RUN-CXX:`, etc.

I think there is a significant gain in having the C++ code at the
top level rather than the IR. There are more tools for C++, and some
of them (IDEs, include-what-you-use, YouCompleteMe, etc.) are
maintained outside the project, whereas LLVM IR is project-specific.

If func1 is unrelated to func2, I really think these should not be in
the same module. Consider this:

static const char *Func1IR = R"IR(
define void @func1() {
entry:
  ret void
}
)IR";
TEST(MyTests, TestFunc1) {
  unique_ptr<Module> Output = run_opt(Func1IR, "-passes=loop-vectorize");
  /* Check func1 Output */
}


static const char *Func2IR = R"IR(
define void @func2() {
entry:
  ret void
}
)IR";
TEST(MyTests, TestFunc2) {
  unique_ptr<Module> Output = run_opt(Func2IR, "-passes=loop-vectorize");
  /* Check func2 Output */
}

Part of the reason is that not everything that makes up a function is
syntactically contained within its definition: forward declarations,
types, function attributes, metadata, etc. live at module level.
When using update_test_checks.py, the checks are per-function anyway.
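
To make the point about module-level context concrete, here is a minimal
sketch. The IR string and the helper moduleLevelLines() are hypothetical
and only for illustration. The helper is a naive line scanner, not an IR
parser, and it assumes each function body ends with a line containing
only "}".

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical test input: @func1 depends on a type, a declaration, an
// attribute group and a metadata node that all live outside its body.
static const char *Func1WithContextIR = R"IR(
%pair = type { i32, i32 }

declare void @callee(%pair)

define void @func1(%pair %p) #0 {
entry:
  call void @callee(%pair %p), !tag !0
  ret void
}

attributes #0 = { nounwind }
!0 = !{!"example"}
)IR";

// Returns the non-empty lines that are outside of any function body.
// Assumption: a body starts at a line beginning with "define" and ends
// at a line that is exactly "}".
std::vector<std::string> moduleLevelLines(const std::string &IR) {
  std::vector<std::string> Result;
  std::istringstream In(IR);
  std::string Line;
  bool InFunction = false;
  while (std::getline(In, Line)) {
    if (InFunction) {
      if (Line == "}")
        InFunction = false;
      continue;
    }
    if (Line.rfind("define", 0) == 0) { // line starts with "define"
      InFunction = true;
      continue;
    }
    if (!Line.empty())
      Result.push_back(Line);
  }
  return Result;
}
```

Calling moduleLevelLines(Func1WithContextIR) returns the type, the
declaration, the attribute group and the metadata node. With a second,
unrelated function in the same string, a reader could no longer tell
which of these lines belongs to which test.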

However, for cases where it is indeed beneficial, there could be some
preprocessing of the embedded string taking place. Maybe:

static const char *Func1IR = R"IR(
define i32 @func1() {
entry:
  %two = add i32 1, 1  ; ASSERT_TRUE(isa<AddInst>(F["two"]))
  ret i32 %two
}
)IR";

// All collected ASSERT_TRUE statements are emitted into another file;
// #line directives could point back to the original line.
#include "inlineasserts.inc"
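
A rough sketch of what such a preprocessing step could look like is
below. The function name collectInlineAsserts and its parameters are made
up for this example; nothing like it exists in LLVM today. It assumes the
annotation is a trailing IR comment of the form "; ASSERT_TRUE(...)" and
that the file name needs no escaping inside a string literal.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: scans embedded IR for trailing
// "; ASSERT_TRUE(...)" comments and returns the text of the generated
// include file. FirstLine is the line number, in the original source
// file, of the first line of IR text.
std::string collectInlineAsserts(const std::string &IR,
                                 const std::string &FileName,
                                 unsigned FirstLine) {
  static const std::string Marker = "; ASSERT_TRUE(";
  std::istringstream In(IR);
  std::ostringstream Out;
  std::string Line;
  for (unsigned LineNo = FirstLine; std::getline(In, Line); ++LineNo) {
    std::string::size_type Pos = Line.find(Marker);
    if (Pos == std::string::npos)
      continue;
    // Drop the leading "; " so only the assertion itself remains.
    std::string Assertion = Line.substr(Pos + 2);
    // Strip trailing whitespace.
    Assertion.erase(Assertion.find_last_not_of(" \t\r") + 1);
    // The #line directive makes diagnostics for the next line refer to
    // the line in the original file that carried the annotation.
    Out << "#line " << LineNo << " \"" << FileName << "\"\n"
        << Assertion << ";\n";
  }
  return Out.str();
}
```

The returned text would be written to the .inc file at build time and
included from inside the test body, so a failing assertion is reported
at the IR line that carries it.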


> I also have a vague feeling that something like this has been discussed before.  If so, please just point me to the discussion.

I am not aware of any such previous discussion.


Michael

