[llvm-dev] [RFC] Formalizing FileCheck Features
via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 14 13:38:55 PDT 2018
Spec for the model, version 2. If this survives I'll start on
amendments to the FileCheck doc.
--paulr
Basic Conceptual Model
----------------------
FileCheck should operate on the basis of these three fundamental
concepts.
(1) Search range. This is some substring of the input text where one
or more directives will do their pattern-matching magic.
(2) Match range. This is a substring of a search range where a
directive (or in one case, a group of directives) has matched a
pattern.
(3) Directive groups. These are sequences of adjacent directives that
operate in a related way on a search range. Directives within a group
are processed in order, except as noted in the directive description.
Finally we add The Rule: No match ranges may overlap.
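As a minimal sketch of the three concepts, assuming ranges are modeled as half-open offset pairs into the input text (the names `Range`, `overlaps` and `obeys_the_rule` are hypothetical and not part of FileCheck):

```python
from itertools import combinations
from typing import NamedTuple

class Range(NamedTuple):
    """Half-open [start, end) span of offsets into the input text.
    Used for both search ranges and match ranges."""
    start: int
    end: int

def overlaps(a, b):
    # Two half-open ranges overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end

def obeys_the_rule(match_ranges):
    # The Rule: no two match ranges may overlap.
    return not any(overlaps(a, b) for a, b in combinations(match_ranges, 2))
```

Under this representation, adjacent ranges such as `Range(0, 3)` and `Range(3, 6)` touch but do not overlap.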
Directive Descriptions Based On Conceptual Model
------------------------------------------------
Given the conceptual model, all directives can be defined in terms of
it.
CHECK: Scans the search range for a pattern match. Fails if no match
is found. The end of the match range becomes the start of the search
range for subsequent directives.
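A rough sketch of CHECK under this model might look as follows. It assumes Python regexes as a stand-in for FileCheck patterns, and the function name `check` is hypothetical:

```python
import re

def check(text, pattern, start, end):
    """Model of CHECK: scan text[start:end] for pattern.
    Returns (match_range, next_search_start); raises if nothing matches."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK: no match for {pattern!r}")
    # The end of the match range starts the next directive's search range.
    return (m.start(), m.end()), m.end()

text = "define i32 @f()\n  ret i32 0\n"
match_range, next_start = check(text, r"@f", 0, len(text))
```

Here `next_start` equals the end of `match_range`, so a following directive never sees text that was already matched.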
CHECK-SAME: Like CHECK, plus there must be zero newlines within the
search range prior to the start of the match range.
CHECK-NEXT: Like CHECK, plus there must be exactly one newline within
the search range prior to the start of the match range.
Note: This definition means CHECK-NEXT will fail if the pattern
occurs both on the line where the search range starts, and on the
(expected) next line. This can be avoided by putting a
`CHECK-SAME: {{.*}}` before the CHECK-NEXT. We could also avoid
this by defining the CHECK-NEXT search range to be just the following
line of text. We define CHECK-NEXT the way we do because it seems
valuable to diagnose mismatches that are simply on the wrong line,
and the problematic case is rare.
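The newline constraint and the corner case from the note can be sketched as below. This assumes Python regexes in place of FileCheck patterns; `run_directive` and its `newlines` parameter are hypothetical names used only for illustration:

```python
import re

def run_directive(text, pattern, start, end, newlines=None):
    """Model of CHECK (newlines=None), CHECK-SAME (0) and CHECK-NEXT (1)."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"no match for {pattern!r}")
    if newlines is not None:
        # Count newlines between the search-range start and the match start.
        seen = text.count("\n", start, m.start())
        if seen != newlines:
            raise AssertionError(
                f"{pattern!r}: expected {newlines} newline(s), found {seen}")
    return (m.start(), m.end()), m.end()

text = "foo bar foo\nfoo\n"
end = len(text)
_, after_bar = run_directive(text, r"bar", 0, end)           # CHECK: bar

try:
    run_directive(text, r"foo", after_bar, end, newlines=1)  # CHECK-NEXT: foo
    next_alone_ok = True
except AssertionError:
    # The first match is the trailing foo on the same line as bar.
    next_alone_ok = False

# Workaround from the note: CHECK-SAME: {{.*}} consumes the rest of the line.
_, after_same = run_directive(text, r".*", after_bar, end, newlines=0)
next_match, _ = run_directive(text, r"foo", after_same, end, newlines=1)
```

With the CHECK-SAME in place, the CHECK-NEXT search range starts at the end of the first line, so the only candidate left is the `foo` on the second line.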
CHECK-LABEL: All LABEL directives are processed before any other
directives. These directives have three effects. First, they act like
CHECK directives. Second, they partition the input text into disjoint
search ranges, delimited by the match ranges of the LABEL directives.
Third, they partition the remaining directives into Label Groups,
each of which operates on the corresponding search range. For truly
pedantic formalism, we can say there are implicit LABEL directives
matching the start and end of the entire input text, thus all
non-LABEL directives are always in some Label Group and there is
really nothing special about the end of the input text.
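The partitioning effect can be sketched as follows, again assuming Python regexes in place of FileCheck patterns. The helper `label_search_ranges` is hypothetical, and the implicit LABELs at the start and end of the input are represented by offset 0 and `len(text)`:

```python
import re

def label_search_ranges(text, label_patterns):
    """Match LABEL patterns in order and return the disjoint search ranges
    they delimit, one per Label Group."""
    ranges = []
    pos = 0  # implicit LABEL matching the start of the input
    for pattern in label_patterns:
        m = re.compile(pattern).search(text, pos)
        if m is None:
            raise AssertionError(f"CHECK-LABEL: no match for {pattern!r}")
        ranges.append((pos, m.start()))  # range ends where this label matches
        pos = m.end()                    # next range starts after the label
    ranges.append((pos, len(text)))      # implicit LABEL at the end of the input
    return ranges

text = "pre\n@f:\n  a\n@g:\n  b\n"
ranges = label_search_ranges(text, [r"@f:", r"@g:"])
pieces = [text[s:e] for s, e in ranges]
```

Two LABEL directives yield three search ranges: the text before the first label, the text between the labels, and the text after the second.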
CHECK-NOT: A sequence of one or more consecutive NOT directives forms
a NOT Group. The group is not executed immediately; instead the next
non-NOT directive (or DAG Group, if the next directive is DAG) is
executed first, and the start of that directive's (or group's)
match range becomes the end of the NOT Group's search range. (If the
next directive is LABEL, it has already executed and has a match range,
which is already the end of the search range. If the NOT is the last
directive, the search range extends to the end of the input.) After
the NOT Group's search range is defined, each NOT directive in the
group scans the range for a match, and fails if a match is found.
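A small sketch of the deferred NOT Group check, assuming the caller has already run the following directive and passes the start of its match range as `end` (the name `run_not_group` is hypothetical, and Python regexes stand in for FileCheck patterns):

```python
import re

def run_not_group(text, not_patterns, start, end):
    """Model of a NOT Group: fail if any pattern matches inside text[start:end].
    `end` is the start of the next directive's match range, or len(text)
    when the NOT Group is last."""
    for pattern in not_patterns:
        m = re.compile(pattern).search(text, start, end)
        if m is not None:
            raise AssertionError(
                f"CHECK-NOT: {pattern!r} matched at {m.start()}")

text = "load\nstore\nret\nstore\n"
load_end = text.index("load") + len("load")  # end of a preceding CHECK: load
ret_start = text.index("ret")                # start of a following CHECK: ret

run_not_group(text, [r"call"], load_end, ret_start)       # passes: no call
try:
    run_not_group(text, [r"store"], load_end, ret_start)  # store sits in range
    store_allowed = True
except AssertionError:
    store_allowed = False
```

The second `store` after `ret` is irrelevant here, because the NOT Group's search range stops at the start of the `ret` match.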
CHECK-DAG: A sequence of one or more consecutive DAG directives forms
a DAG Group. The search range for the group extends from the end of
the previous match (or start of the input, if there is no previous
directive) to the start of the next LABEL match, or to the end of the
input if there is no later LABEL. Each directive in the DAG group
scans the search range of the group looking for a pattern match. A
directive fails if no match is found. Per The Rule, match ranges for
the individual DAG directives in a group may not overlap. After all
DAG directives run, the match range for the entire DAG Group extends
from the start of the earliest match to the end of the latest match.
The end of that match range becomes the start of the search range for
subsequent directives.
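One way to sketch a DAG Group is shown below. The greedy strategy (each directive takes its first match that does not overlap earlier matches in the group) is an assumption made for illustration, not something the model above prescribes; `run_dag_group` is a hypothetical name and Python regexes stand in for FileCheck patterns:

```python
import re

def run_dag_group(text, patterns, start, end):
    """Model of a DAG Group over text[start:end].
    Returns (individual match ranges, match range of the whole group)."""
    taken = []
    for pattern in patterns:
        rx = re.compile(pattern)
        pos = start
        while True:
            m = rx.search(text, pos, end)
            if m is None:
                raise AssertionError(
                    f"CHECK-DAG: no non-overlapping match for {pattern!r}")
            span = (m.start(), m.end())
            # The Rule: skip candidates that overlap an earlier DAG match.
            if all(span[1] <= s or e <= span[0] for s, e in taken):
                taken.append(span)
                break
            pos = m.start() + 1
    group = (min(s for s, _ in taken), max(e for _, e in taken))
    return taken, group

text = "add x\nmul y\nadd z\n"
matches, group = run_dag_group(text, [r"mul", r"add", r"add"], 0, len(text))
```

The second `add` directive cannot reuse the text claimed by the first, so it moves on to the later occurrence. The end of `group` is where the next directive's search range would start.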
Observations
------------
A CHECK-NOT surrounded by CHECK-DAG directives separates the DAGs into
disjoint groups, and does not permit matches from the two groups to
overlap. DAG was originally implemented to detect and diagnose an
overlap in this situation, but the implementation worked only for the
first DAG after a NOT. This can lead to counter-intuitive behavior and
potentially makes certain kinds of matches impossible.
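To illustrate how a NOT splits DAGs into disjoint groups under the model, here is a simplified sketch. It assumes the patterns within each DAG Group do not compete for the same text (so overlap handling inside a group is omitted), uses Python regexes in place of FileCheck patterns, and the names `first_match` and `dag_not_dag` are hypothetical:

```python
import re

def first_match(text, pattern, start, end):
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"no match for {pattern!r}")
    return m.start(), m.end()

def dag_not_dag(text, dags_before, nots, dags_after):
    """Model of: DAG Group, NOT Group, DAG Group over the whole input."""
    before = [first_match(text, p, 0, len(text)) for p in dags_before]
    first_end = max(e for _, e in before)
    # The second group may only search after the first group's match range.
    after = [first_match(text, p, first_end, len(text)) for p in dags_after]
    second_start = min(s for s, _ in after)
    # The NOT Group covers the gap between the two group match ranges.
    for p in nots:
        if re.compile(p).search(text, first_end, second_start):
            raise AssertionError(f"CHECK-NOT: {p!r} matched between groups")
    return before, after

ok_before, ok_after = dag_not_dag("a\nb\nc\nd\n", [r"a", r"b"], [r"x"], [r"d", r"c"])

try:
    # Here c occurs only before the first group ends, so the second group fails.
    dag_not_dag("c\na\nb\nd\n", [r"a", r"b"], [r"x"], [r"d", r"c"])
    interleaved_ok = True
except AssertionError:
    interleaved_ok = False
```

Without the NOT, the four DAG directives would form a single group and the second input would match.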
Technically, putting CHECK-SAME or CHECK-NEXT after CHECK-DAG has
defined behavior, but it's unlikely to be useful, so FileCheck rejects
that kind of sequence. Similarly, putting SAME or NEXT as the
first directive in a file has defined behavior (matching
precisely the first or second line respectively of the input text);
however, this is far more likely to be a mistake than to be useful, so
again FileCheck rejects it.
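The two rejected sequences could be detected with a check along these lines. This is a hypothetical sketch over a list of directive kinds, not FileCheck's actual diagnostic code, and `validate_sequence` is an invented name:

```python
def validate_sequence(directives):
    """Return a list of complaints about directive orderings that the
    model defines but that are rejected as likely mistakes."""
    errors = []
    for i, kind in enumerate(directives):
        if kind not in ("SAME", "NEXT"):
            continue
        if i == 0:
            errors.append(f"CHECK-{kind} is the first directive in the file")
        elif directives[i - 1] == "DAG":
            errors.append(f"CHECK-{kind} directly follows CHECK-DAG")
    return errors

problems = validate_sequence(["CHECK", "NEXT", "DAG", "SAME"])
```

In this example only the final SAME is flagged, because it directly follows a DAG.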