[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string

Joel E. Denny via llvm-dev llvm-dev at lists.llvm.org
Fri May 11 15:55:46 PDT 2018


On Mon, May 7, 2018 at 12:56 PM, <paul.robinson at sony.com> wrote:

> > 1. In a CHECK-DAG group, don't let the matches for patterns overlap.
> > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express
> > that a pattern must have N non-overlapping matches.
>
> I think #1 is much more intuitive and easy to describe/document than #2.
> Changing the meaning of DAG in that way is highly unlikely to affect any
> existing test, IMO.  And if it does, my first expectation is that the test
> wasn't doing what the author intended anyway.
>

I implemented #1, and I now have 105 new unexpected tests failures from
llvm core (58), clang (45), openmp runtime (1), and compiler-rt (1) test
suites.

I've only looked at a few failures so far.  Of those, the problem is the
expected one: in a CHECK-DAG group, multiple patterns match the same line,
so not permitting overlapping matches causes some patterns never to match
successfully.  However, all the overlapping matches so far seem
unintentional.  That is, this new implementation seems to be catching test
errors.

Here are some scenarios I found:

S1. Patterns sometimes identical
--------------------------------

For example, tools/clang/test/OpenMP/for_codegen.cpp has:

```
// TERM_DEBUG-DAG: [[DBG_LOC_START]] = !DILocation(line: [[@LINE-15]],
// TERM_DEBUG-DAG: [[DBG_LOC_END]] = !DILocation(line: [[@LINE-16]],
// TERM_DEBUG-DAG: [[DBG_LOC_CANCEL]] = !DILocation(line: [[@LINE-17]],
```

Patterns are the same except for the names of regex variables referenced.
The variables happen to have the same value, so the patterns match the same
line when overlapping is permitted.

Perhaps the test was designed to handle the possibility that the variables
might have different values one day, and not permitting overlapping matches
would break that case.  My guess is that this was actually unintentional,
so my solution is to rename all the variables to DBG_LOC and remove two of
the above patterns.

S2. Superset before subset
--------------------------

For example, tools/clang/test/OpenMP/target_parallel_codegen.cpp has this:

```
// CHECK-DAG:   store i[[SZ]] {{4|8}}, i[[SZ]]* {{%[^,]+}}
...
// CHECK-DAG:   store i[[SZ]] 4, i[[SZ]]* {{%[^,]+}}
```

Some pattern X matches a superset of some pattern Y.  X appears before Y,
but Y's intended match appears before X's intended match in the output.
The result is that X matches Y's intended match.  If overlapping is not
permitted, Y has nothing to match, and FileCheck complains.  If overlapping
is permitted, Y also matches Y's intended match, and X's intended match is
quietly ignored.  This case is clearly unintentional.  One fix is to move X
after Y.  A cleaner fix is to somehow make X distinct.

Another example of this scenario:

* test/Bitcode/thinlto-function-summary-originalnames.ll

S3. Wrong number of identical patterns
--------------------------------------

This scenario is just an incorrect attempt at the original use case I
wanted to address.

For example,
projects/openmp/runtime/test/ompt/parallel/not_enough_threads.c repeats the
following three times:

```
// CHECK-DAG: {{^}}[[THREAD_ID:[0-9]+]]: ompt_event_implicit_task_begin:
parallel_id=[[PARALLEL_ID]], task_id=[[IMPLICIT_TASK_ID:[0-9]+]]
// CHECK-DAG: {{^}}[[THREAD_ID]]: ompt_event_barrier_begin:
parallel_id=[[PARALLEL_ID]], task_id=[[IMPLICIT_TASK_ID]]
```

The author was obviously expecting CHECK-DAG not to permit overlapping
matches and wanted to check that the above lines appeared three times.
Specifically, he's expecting 3 threads after the master thread, but the run
line sets OMP_THREAD_LIMIT=2.  If overlapping matches are permitted, this
will permit any non-zero number of threads after the master thread.  If
overlapping matches are not permitted, we must increase OMP_THREAD_LIMIT to
at least 4 or reduce the number of pattern repetitions.

S4. CHECK-DAG unneeded
----------------------

For example, test/CodeGen/AMDGPU/store-v3i64.ll has:

```
; GCN-DAG: buffer_store_byte v
; GCN-DAG: buffer_store_byte v
; GCN-DAG: buffer_store_byte v
```

The intention appears to be the same as S3: match a pattern N times.
However, this example doesn't actually need CHECK-DAG at all.  First, all
patterns in the group are the same, so they are not unordered, so CHECK
would have been sufficient.  Second, there's only one occurrence in the
actual output (which I'm assuming is correct), so there should be only one
directive, so it really doesn't matter which of these directives we use.

S5. Incorrect pattern with same match
-------------------------------------

For example, projects/compiler-rt/test/profile/instrprof-visibility.cpp has:

```
// COV-DAG: {{.*}}|{{ *}}1|#ifndef NO_WEAK
...
// COV-DAG: {{.*}}|{{ *}}1|#ifndef NO_WEAK
```

The "1" does not appear in first occurrence in the actual output.  Let's
assume the actual output is correct and the first pattern is incorrect.  If
overlapping is permitted, the first pattern matches the intended line for
the second pattern, which quietly matches that line too.  If overlapping is
not permitted, FileCheck complains that the second pattern has nothing left
to match.

An example where the second pattern is incorrect:

* test/Bitcode/thinlto-summary-local-5.0.ll

An example where the incorrect pattern is identical but it's not as obvious
because a regex variable by a different name (ATOMIC_REDUCE_BARRIER_LOC
instead of REDUCTION_LOC) is captured within it:

* tools/clang/test/OpenMP/for_reduction_codegen_UDR.cpp has:

Path Forward
------------------

These examples seem like strong evidence that permitting overlapping
matches is a significant source of test errors.  Moreover, we have my
original goal that some test authors were apparently assuming was already
supported: to be able to handle sets of strings that are both unordered and
non-unique.  Nevertheless, it's possible that there are use cases, such as
the one suggested by S1, that we lose.

Here are some potential paths forward:

P1. Give up on approach #1 (non-overlapping CHECK-DAG) and go back to #2
(CHECK-DAG-N).  I don't prefer this, but it would be less work.

P2. Make CHECK-DAG non-overlapping but fix all test cases before committing
that change.  Patch reviews will surely be slow.

P3. Make non-overlapping CHECK-DAG a special mode that's off by default
until all test suites have been fixed.

By the way, I think it would be generally useful for FileCheck to have a
debugging mode that prints every match.  It would be particularly useful
while migrating to a non-overlapping CHECK-DAG.

Thoughts?

Joel



> --paulr
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180511/479976df/attachment.html>


More information about the llvm-dev mailing list