[llvm-dev] [RFC] Formalizing FileCheck Features

Joel E. Denny via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 19 13:51:07 PDT 2018


Hi Paul,

I've inlined some minor suggestions and questions.

On Thu, Jun 14, 2018 at 4:38 PM, <paul.robinson at sony.com> wrote:

> Spec for the model, version 2.  If this survives I'll start on
> amendments to the FileCheck doc.
> --paulr
>
>
> Basic Conceptual Model
> ----------------------
>
> FileCheck should operate on the basis of these three fundamental
> concepts.
>

"should operate" -> "operates"


>
> (1) Search range.  This is some substring of the input text where one
> or more directives will do their pattern-matching magic.
>
> (2) Match range.  This is a substring of a search range where a
> directive (or in one case, a group of directives) has matched a
> pattern.
>
> (3) Directive groups.  These are sequences of adjacent directives that
> operate in a related way on a search range.  Directives within a group
> are processed in order, except as noted in the directive description.
>

Is there an exception?


>
> Finally we add The Rule:  No match ranges may overlap.
>
>
> Directive Descriptions Based On Conceptual Model
> ------------------------------------------------
>
> Given the conceptual model, all directives can be defined in terms of
> it.
>
> CHECK: Scans the search range for a pattern match. Fails if no match
> is found.  The end of the match range becomes the start of the search
> range for subsequent directives.
>
> CHECK-SAME: Like CHECK, plus there must be zero newlines within the
> search range prior to the start of the match range.
>
> CHECK-NEXT: Like CHECK, plus there must be exactly one newline within
> the search range prior to the start of the match range.
>
> Note: This definition means CHECK-NEXT will fail if the pattern
> occurs both on the line where the search range starts, and on the
> (expected) next line.


The first occurrence is sufficient for a failure.  Perhaps: "and on the" ->
"even if it also occurs on the"


>   This can be avoided by putting a
> `CHECK-SAME: {{.*}}` before the CHECK-NEXT.  We could also avoid
>

To make it clearer to a naive user that you're not describing a second
option they can try themselves: "We could also avoid" -> "We could have
implemented FileCheck to avoid"


> this by defining the CHECK-NEXT search range to be just the following
> line of text.  We define CHECK-NEXT the way we do because it seems
> valuable to diagnose mismatches that are simply on the wrong line,
> and the problematic case is rare.
>

By the way, do you think it would be helpful for the diagnostic to suggest
the CHECK-SAME trick?
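
To make the problematic case concrete, here is a rough Python sketch of
CHECK, CHECK-SAME, and CHECK-NEXT as I read the definitions above.  The
function names are hypothetical (they are not FileCheck internals), and
Python's `re` module only stands in for FileCheck's regex engine.

```python
import re

def check(text, start, pattern):
    """CHECK: first match at or after `start`; returns the match range."""
    m = re.compile(pattern).search(text, start)
    if m is None:
        raise AssertionError(f"no match for {pattern!r}")
    return m.start(), m.end()

def _check_newlines(text, start, pattern, expected):
    # Like CHECK, plus a constraint on the number of newlines between the
    # start of the search range and the start of the match range.
    s, e = check(text, start, pattern)
    found = text.count("\n", start, s)
    if found != expected:
        raise AssertionError(
            f"{pattern!r}: {found} newline(s) before match, expected {expected}")
    return s, e

def check_same(text, start, pattern):
    return _check_newlines(text, start, pattern, 0)

def check_next(text, start, pattern):
    return _check_newlines(text, start, pattern, 1)

text = "foo bar foo\nfoo\n"
_, pos = check(text, 0, "bar")        # search range now starts at offset 7

try:
    check_next(text, pos, "foo")      # first "foo" found is still on line 1
    next_failed = False
except AssertionError:
    next_failed = True

_, pos = check_same(text, pos, ".*")  # the trick: consume the rest of line 1
next_range = check_next(text, pos, "foo")
```

In this sketch the first CHECK-NEXT attempt fails because "foo" also
appears later on the line where the search range starts, and the second
attempt succeeds once the stand-in for `CHECK-SAME: {{.*}}` has moved the
search range to the end of that line.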

> CHECK-LABEL: All LABEL directives are processed before any other
> directives.  These directives have three effects.  First, they act like
> CHECK directives. Second, they partition the input text into disjoint
> search ranges, delimited by the match ranges of the LABEL directives.
> Third, they partition the remaining directives into Label Groups,
> each of which operates on the corresponding search range.  For truly
> pedantic formalism, we can say there are implicit LABEL directives
> matching the start and end of the entire input text, thus all
> non-LABEL directives are always in some Label Group and there is
> really nothing special about the end of the input text.
>
> CHECK-NOT: A sequence of one or more consecutive NOT directives forms
> a NOT Group. The group is not executed immediately; instead the next
> non-NOT directive (or DAG Group, if the next directive is DAG) is
> executed first, and the start of that directive's (or group's)
> match range becomes the end of the NOT Group's search range.  (If the
> next directive is LABEL, it has already executed and has a match range,
> which is already the end of the search range.  If the NOT is the last
> directive, the search range extends to the end of the input.)  After
> the NOT Group's search range is defined, each NOT directive in the
> group scans the range for a match, and fails if a match is found.
>
> CHECK-DAG: A sequence of one or more consecutive DAG directives forms
> a DAG Group. The search range for the group extends from the end of
> the previous match (or start of the input, if there is no previous
> directive) to the start of the next LABEL match, or to the end of the
> input if there is no later LABEL.


It reads to me like LABEL is relevant to the end but not the start.  You
might replace "(or start of the input" with "(possibly a LABEL or start of
the input".

On the other hand, in most of your directive descriptions (see CHECK,
CHECK-NEXT, and CHECK-SAME), you don't define the directive's own search
range.  Instead, you define how that directive impacts the start of the
next search range.

The only difference here is that you have an entire group of directives
with the same search range.  As FileCheck grows new directives, perhaps a
more maintainable way to describe the search ranges for NOT groups and DAG
groups is as follows:

"The search range for every member of the group is the search range that
any single CHECK directive would have if it were to replace the entire
group."

>   Each directive in the DAG group
> scans the search range of the group looking for a pattern match. A
> directive fails if no match is found. Per The Rule, match ranges for
> the individual DAG directives in a group may not overlap.


The last sentence is ambiguous.  It could mean you'll get a diagnostic if
they do overlap.  Perhaps say "Per The Rule, each group member skips past
any match whose range overlaps the range of an earlier group member's
match."
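
A rough Python sketch of that reading, in case it helps.  The name
`dag_group` is hypothetical, Python's `re` stands in for FileCheck's regex
engine, and retrying one character past the start of a rejected match is
just the simplest choice for the sketch, not a claim about what FileCheck
does internally.

```python
import re

def dag_group(text, start, end, patterns):
    """Match each DAG pattern within text[start:end] without overlaps.

    Returns (per-directive match ranges, match range of the whole group).
    """
    matches = []
    for pattern in patterns:
        rx = re.compile(pattern)
        pos = start
        while True:
            m = rx.search(text, pos, end)
            if m is None:
                raise AssertionError(f"no match for {pattern!r}")
            # Skip any match that overlaps an earlier member's match range.
            if any(m.start() < e and s < m.end() for s, e in matches):
                pos = m.start() + 1
                continue
            matches.append((m.start(), m.end()))
            break
    # The group's range runs from the earliest start to the latest end.
    group = (min(s for s, _ in matches), max(e for _, e in matches))
    return matches, group

text = "a b a"
matches, group = dag_group(text, 0, len(text), ["a", "a"])

try:
    dag_group("a b", 0, 3, ["a", "a"])  # only one "a" available
    second_failed = False
except AssertionError:
    second_failed = True
```

Here the second `a` skips the occurrence already claimed by the first and
matches the later one.  With only one "a" in the input, the second
directive simply fails to match; there is no separate overlap diagnostic.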



>   After all
> DAG directives run, the match range for the entire DAG Group extends
> from the start of the earliest match to the end of the latest match.
> The end of that match range becomes the start of the search range for
> subsequent directives.
>
> Observations
> ------------
>
> A CHECK-NOT surrounded by CHECK-DAG directives separates the DAGs into
>

"A CHECK-NOT" -> "One or more CHECK-NOTs"

>
> disjoint groups, and does not permit matches from the two groups to
> overlap. DAG was originally implemented to detect and diagnose an
> overlap in this situation, but the implementation worked only for the
> first DAG after a NOT. This can lead to counter-intuitive behavior and
> potentially makes certain kinds of matches impossible.
>

By the way, I have a patch that fixes the search ranges for DAG-NOT-DAG to
match your formal description here.  I need to polish up the commit log,
and then I'll post it for review.  It applies after my other patches
because it was easier to implement that way.

Thanks.

Joel


>
>
> Technically, putting CHECK-SAME or CHECK-NEXT after CHECK-DAG has
> defined behavior, but it's unlikely to be useful, so FileCheck rejects
> that kind of sequence.  Similarly, putting SAME or NEXT as the
> first directive in a file has defined behavior (matching
> precisely the first or second line respectively of the input text);
> however this is far more likely to be a mistake than to be useful, so
> again FileCheck rejects this.
>
>