[llvm-dev] [RFC] Compiled regression tests.

Michael Kruse via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 24 09:43:25 PDT 2020


On Wed, Jun 24, 2020 at 10:12 AM David Blaikie <dblaikie at gmail.com> wrote:
> > As mentioned in the Differential, generating the tests automatically
> > will lose information about what actually is intended to be tested,
>
> Agreed - and I didn't mean to suggest tests should be automatically
> generated. I work pretty hard in code reviews to encourage tests to be
> written as stable-y as possible so they are resilient to unrelated
> changes. The python scripts I wrote to update tests didn't require
> tests to be written in an automated fashion but were still able to
> migrate the significant majority of hand-written FileCheck test cases.

My argument is that it is hard, if not impossible, to test only the
relevant bits using FileCheck. CHECK-DAG, named regexes, etc. are
mechanisms that make this possible, but at the same time they make
the verification more complicated.
I don't think it is feasible to write single-purpose update scripts
for most changes. Even if there is one, it is even less feasible to
ensure for all tests that they still test what was originally
intended, especially with CHECK-NOT.

I had to put a lot of effort into updating loop metadata tests.
Because metadata nodes are numbered sequentially in the order the IR
emitter decides to put them, it is tempting to use the form
![[NODENAME:.*]] for each occurrence so you can reorder the lines in
the order they occur, as indeed I found in many regression tests.
Three problems with this:
1. When the regex is specified, it will overwrite the content of
previous placeholders, i.e. if used everywhere, it is equivalent to
{{.*}}
2. Using a backtracking regex engine, there are inputs with
exponential time behaviour.
3. It will match more than needed, e.g. a list of nodes
{![[NODENAME:.*]]} will also match !{!0, !1} and FileCheck will
complain somewhere else that

!0 = {!0, !1}

does not match

![[NODENAME]] = {![[NODENAME]], ![[LOOP1:.*]]}

(unless the 'regex everywhere' mistake was made)
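
Problems 1 and 3 can be reproduced outside FileCheck. The following
is only a rough sketch: it uses Python's `re` module as a stand-in
for FileCheck placeholders (a named group for a definition, a
backreference for a use), and the IR lines are made up for
illustration. It is not how FileCheck is implemented.

```python
import re

# Hypothetical stand-ins for FileCheck patterns; FileCheck is not invoked.
LOOSE = re.compile(r"!\{!(?P<NODENAME>.*)\}")        # like !{![[NODENAME:.*]]}
STRICT = re.compile(r"!\{!(?P<NODENAME>[0-9]+)\}")   # like !{![[NODENAME:[0-9]+]]}


def capture(pattern, line):
    """Return what the NODENAME group captured, or None if no match."""
    m = pattern.search(line)
    return m.group("NODENAME") if m else None


# Problem 3: '.*' swallows a whole operand list instead of one node number.
two_operands = "!0 = !{!0, !1}"   # made-up metadata line
print(repr(capture(LOOSE, two_operands)))    # captures '0, !1'
print(repr(capture(STRICT, two_operands)))   # no match: None

# Problem 1: writing the regex at every occurrence redefines the
# placeholder, so nothing ties the two occurrences together.
REDEFINE = re.compile(r"!llvm\.loop !(.*)\n!(.*) = distinct")
REUSE = re.compile(r"!llvm\.loop !(?P<LOOP>[0-9]+)\n!(?P=LOOP) = distinct")

# Made-up input where the branch refers to !5 but the node shown is !7.
inconsistent = "br label %for.body, !llvm.loop !5\n!7 = distinct !{!7}"
consistent = "br label %for.body, !llvm.loop !5\n!5 = distinct !{!5}"

print(REDEFINE.search(inconsistent) is not None)  # True: mismatch goes unnoticed
print(REUSE.search(inconsistent) is not None)     # False: mismatch is caught
print(REUSE.search(consistent) is not None)       # True
```

Defining the placeholder once with [0-9]+ and only referencing it
afterwards is what makes the second pair of patterns reject the
inconsistent input.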

A "robust" test would use [[NODENAME:[0-9]+]] and CHECK-DAG, as some
tests do, but this makes the CHECK lines even longer and more
cluttered. In contrast to instructions, not all metadata lines have
recognizable keywords that could indicate what they are intended to
match. CHECK-DAG will happily match any metadata node that has the
same number of operands, deferring the mismatch report to some later
CHECK line, while the DAG part keeps matching against previous
lines.

Unfortunately, most tests we have do not even use placeholders for
metadata nodes, including those generated by update_test_checks.py. I
looked into improving the script in this regard, but its
implementation is function-centric, making it difficult to add
module-level placeholders.

As a result, I estimate that I had to spend about twice as much time
updating/fixing tests as writing the code changes themselves. Based
on this experience, I find updating FileCheck tests very frustrating.
I'd prefer not to test whether specific metadata nodes are present,
but whether the LLVM API to query them (such as `isAnnotatedParallel`)
returns the expected result.

Michael

