[llvm-dev] [RFC] Compiled regression tests.

Michael Kruse via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 1 09:33:05 PDT 2020


Am Mi., 1. Juli 2020 um 10:18 Uhr schrieb Robinson, Paul <
paul.robinson at sony.com>:

> The test as written is fragile because it requires a certain ordering.  If
> the output order is not important, use CHECK-DAG rather than CHECK.  This
> would be a failure to understand the testing tool.
>

CHECK-DAG does not help here, since what changes is within a list on the
same line, and there is no CHECK-SAME-DAG or CHECK-DAG-SAME. Even if there
were, the changed line is textually identical either way, so FileCheck
would have to backtrack deep into the following lines to try alternative
placeholder substitutions. It would look like

CHECK-SAME-DAG: ![[ACCESS_GROUP_INNER:[0-9]+]]
CHECK-SAME-DAG: ,
CHECK-SAME-DAG: ![[ACCESS_GROUP_OUTER:[0-9]+]]

which would allow the comma to appear anywhere, and which I do not find
readable.

My (perhaps naive) conclusion is that textual checking is not the right
tool here.


> My experience, over a 40-year career, is that good software developers are
> generally not very good test-writers.  These are different skills and good
> testing is frequently not taught.  It’s easy to write fragile tests; you
> make your patch, you see what comes out, and you write the test to expect
> exactly that output, using the minimum possible features of the testing
> tool.  This is poor technique all around.  We even have scripts that
> automatically generate such tests, used primarily in codegen tests.  I
> devoutly hope that the people who produce those tests responsibly eyeball
> all those cases.
>
>
>
> The proposal appears to be to migrate output-based tests (using
> ever-more-complicated FileCheck features) to executable tests, which makes
> it more like the software development people are used to instead of
> test-writing.  But I don’t see that necessarily solving the problem; seems
> like it would be just as easy to write a program that doesn’t do the right
> thing as to write a FileCheck test that doesn’t do the right thing.
>

IMHO, having a tool that lets us express more directly what a test is
intended to verify is already worth a lot. For instance, we usually don't
care about SSA value names or MDNode numbers, but it takes extra work to
regex them away in FileCheck tests; as a result, most of our tests still
expect exact metadata node numbers. This becomes a problem whenever we
want to emit a new metadata node, because all of those tests then need to
be updated.
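For example (the IR line and node numbers here are invented for
illustration), a check that hard-codes a node number breaks as soon as any
new metadata shifts the numbering, while the regexed form survives but
takes more effort to write:

```llvm
; Fragile: expects the exact node number.
; CHECK: store i32 %v, i32* %p, !llvm.access.group !7

; Robust, but extra work: only requires that some access group is attached.
; CHECK: store i32 %v, i32* %p, !llvm.access.group ![[AG:[0-9]+]]
```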

This problem would go away if the test method ignored value names and
MDNode numbers by default, so that developers had to put in extra work
only when they actually want to verify them.



>
> Hal’s suggestion is more to the point:  If the output we’re generating is
> not appropriate to the kinds of tests we want to perform, it can be
> worthwhile to generate different kinds of output.  MIR is a case in point;
> for a long time it was hard to introspect into the interval between IR and
> final machine code, but now it’s a lot easier.
>

Can you elaborate on what makes it easier?


Michael

