[llvm-dev] [RFC] Formalizing FileCheck Features
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Thu May 24 07:33:23 PDT 2018
On 05/24/2018 08:46 AM, via llvm-dev wrote:
> Background
> ----------
>
> FileCheck [0] is a cornerstone testing tool for the LLVM project. It
> has grown new features over the years to meet new needs, but these
> sometimes have surprising and counter-intuitive behavior [1]. This
> has become even more evident in Joel Denny's recent quest to repair
> what seemed like an obvious defect [2] but which led me to the
> conclusion [3] that FileCheck sorely needed a clear, intuitive
> conceptual model.
Thanks for writing this up. I definitely think that it will be good to
add this to FileCheck's documentation.
> And then someone to make it work that way (hi
> Joel!).
>
> Basic Conceptual Model
> ----------------------
>
> FileCheck should operate on the basis of these three fundamental
> concepts.
>
> (1) Search range. This is some substring of the input text where one
> or more directives will do their pattern-matching magic.
>
> (2) Match range. This is a substring of a search range where a
> directive (or in one case, a group of directives) has matched a
> pattern.
>
> (3) Directive groups. These are sequences of adjacent directives that
> operate in a related way on a search range. Directives within a group
> are processed in order, except as noted in the directive description.
>
> Finally we add The Rule: No match ranges may overlap.
>
> (This is largely formalizing what FileCheck already does, except that
> it didn't have The Rule with respect to DAG matches. That's the bug
> that Joel was originally trying to fix, until I stuck my nose into
> it.)
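
To make the three concepts and The Rule concrete, here is a minimal Python sketch. It assumes ranges are half-open (start, end) character offsets into the input text; the names Range, overlaps and obeys_the_rule are hypothetical and do not come from FileCheck's sources.

```python
from itertools import combinations
from typing import NamedTuple


class Range(NamedTuple):
    """Half-open span of character offsets (an assumed representation)."""
    start: int  # first offset inside the range
    end: int    # first offset past the range


def overlaps(a: Range, b: Range) -> bool:
    # Two half-open ranges overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def obeys_the_rule(match_ranges) -> bool:
    # The Rule: no two match ranges may overlap.
    return not any(overlaps(a, b) for a, b in combinations(match_ranges, 2))
```

Under this representation, ranges that merely touch, such as (0, 3) and (3, 5), do not count as overlapping.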
>
> Directive Descriptions Based On Conceptual Model
> ------------------------------------------------
>
> Given the conceptual model, all directives can be defined in terms of
> it. This is possibly going overboard with the formalism but hey, we're
> all compiler geeks here.
>
> CHECK: Scans the search range for a pattern match. Fails if no match
> is found. The end of the match range becomes the start of the search
> range for subsequent directives.
>
> CHECK-SAME: Like CHECK, plus there must be zero newlines prior to the
> start of the match range.
>
> CHECK-NEXT: Like CHECK, plus there must be exactly one newline prior
> to the start of the match range.
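
The CHECK, CHECK-SAME and CHECK-NEXT descriptions above can be sketched as one search function with an optional newline constraint. This is only an illustration of the model: it uses Python regular expressions in place of FileCheck's pattern syntax, and the function name and parameters are hypothetical.

```python
import re


def check(text, pattern, start, end=None, newlines=None):
    """Model of CHECK over text[start:end].

    newlines=None behaves like CHECK, 0 like CHECK-SAME, 1 like CHECK-NEXT.
    Returns the match range; its end is the start of the next search range.
    """
    end = len(text) if end is None else end
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"no match for {pattern!r}")
    # Count the newlines between the search-range start and the match start.
    seen = text.count("\n", start, m.start())
    if newlines is not None and seen != newlines:
        raise AssertionError(f"{pattern!r}: expected {newlines} newline(s), saw {seen}")
    return m.start(), m.end()
```

For example, with the input "define foo\n  add\n  ret\n", a CHECK for "foo" followed by a CHECK-NEXT for "add" succeeds, while a CHECK-SAME for "ret" after that fails because a newline precedes the match.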
>
> CHECK-LABEL: All LABEL directives are processed before any other
> directives. These directives have two effects. First, they act like
> CHECK directives, but also partition the input text into disjoint
> search ranges, delimited by the match ranges of the LABEL directives.
> Second, they partition the remaining directives into Label Groups,
> each of which operates on the corresponding search range. For truly
> pedantic formalism, we can say there are implicit LABEL directives
> matching the start and end of the entire input text, thus all
> non-LABEL directives are always in some Label Group.
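
As an illustration of the first effect of CHECK-LABEL, the sketch below partitions an input into disjoint search ranges delimited by the label matches. It assumes Python regular expressions as patterns and half-open offsets; label_search_ranges is a hypothetical name, and the implicit labels at the start and end of the input are modeled by the offsets 0 and len(text).

```python
import re


def label_search_ranges(text, label_patterns):
    """Return the search ranges delimited by successive LABEL matches."""
    ranges = []
    prev_end = 0  # implicit label at the start of the input
    for pat in label_patterns:
        # Each LABEL acts like a CHECK starting after the previous label.
        m = re.compile(pat).search(text, prev_end)
        if m is None:
            raise AssertionError(f"label {pat!r} not found")
        ranges.append((prev_end, m.start()))
        prev_end = m.end()
    ranges.append((prev_end, len(text)))  # implicit label at the end
    return ranges
```

With N LABEL directives this yields N + 1 search ranges, one per Label Group.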
>
> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> is not executed immediately; instead the next non-NOT directive is
> executed first, and the start of that directive's match range becomes
> the end of the NOT Group's search range.
Both here, and for CHECK-DAG, we should say something about reaching the
end of the input.
-Hal
> (If the next directive is
> LABEL, it has already executed and has a match range, which is already
> the end of the search range.) After the NOT Group's search range is
> defined, each NOT directive scans the range for a match, and fails if
> a match is found.
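
A rough Python sketch of the NOT Group behavior described above: the next non-NOT directive runs first, and its match start closes the group's search range. Treating "no following directive" as "search to the end of the input" is an assumption of this sketch (it follows from the implicit end-of-input LABEL mentioned earlier), and the function name is hypothetical.

```python
import re


def not_group_then_next(text, not_patterns, next_pattern, start):
    """Run a NOT Group followed by one CHECK-like directive.

    Returns the match range of the following directive, or an empty range
    at the end of the input when next_pattern is None (assumed behavior).
    """
    if next_pattern is None:
        nxt = (len(text), len(text))
    else:
        m = re.compile(next_pattern).search(text, start)
        if m is None:
            raise AssertionError(f"no match for {next_pattern!r}")
        nxt = (m.start(), m.end())
    # The NOT Group's search range ends where the next match starts.
    for pat in not_patterns:
        if re.compile(pat).search(text, start, nxt[0]) is not None:
            raise AssertionError(f"excluded pattern {pat!r} was found")
    return nxt
```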
>
> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
> is not executed immediately; instead the next non-DAG directive is
> executed first, and the start of that directive's match range becomes
> the end of the DAG Group's search range. If the next directive is
> CHECK-NOT, the end of the DAG Group's search range is
> unaffected. (This might or might not be FileCheck's historical
> behavior; I didn't check.) After the DAG Group's search range is
> defined, each DAG directive scans the range for a match, and fails if
> a match is not found. Per The Rule, match ranges for DAG directives
> may not overlap. (This is not historical FileCheck behavior; it is the
> bug Joel Denny wanted to fix.) After all DAG directives run, the
> match range for the entire DAG Group extends from the start of the
> earliest match to the end of the latest match. The end of that match
> range becomes the start of the search range for subsequent directives.
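
The DAG Group rules can be illustrated with a small greedy sketch in Python. Each directive takes the first match in the search range that does not overlap a match already claimed by an earlier directive in the group. The greedy strategy is an assumption of this sketch, not something the text above specifies, and it may reject inputs that a smarter assignment would accept; the function name is hypothetical and patterns are Python regular expressions.

```python
import re


def run_dag_group(text, patterns, start, end):
    """Match every DAG pattern in text[start:end] without overlaps.

    Returns the group's match range: earliest start to latest end.
    """
    taken = []  # match ranges already claimed within this group
    for pat in patterns:
        rx = re.compile(pat)
        pos = start
        while True:
            m = rx.search(text, pos, end)
            if m is None:
                raise AssertionError(f"no non-overlapping match for {pat!r}")
            cand = (m.start(), m.end())
            # The Rule: accept only if disjoint from every earlier match.
            if all(cand[1] <= s or e <= cand[0] for s, e in taken):
                taken.append(cand)
                break
            pos = m.start() + 1  # retry just past the rejected match
    if not taken:
        return start, start
    return min(s for s, _ in taken), max(e for _, e in taken)
```

For the input "abc abc", two DAG directives that both look for "abc" must claim different occurrences, so the group's match range covers the whole string; a third identical directive fails.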
>
> Observations
> ------------
>
> A CHECK-NOT still separates surrounding CHECK-DAG directives into
> disjoint groups, and does not permit matches from the two groups to
> overlap. DAG was originally implemented to detect and diagnose an
> overlap, but this worked only for the first DAG after a NOT. This can
> lead to counter-intuitive behavior and potentially makes certain kinds
> of matches impossible.
>
> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
> behavior, but it's unlikely to be useful. Putting SAME or NEXT as the
> first directive in a file likewise has defined behavior, matching
> precisely the first or second line (respectively) of the input text.
>
>
> References
> ----------
> [0] https://llvm.org/docs/CommandGuide/FileCheck.html
> [1] https://www.youtube.com/watch?v=4rhW8knj0L8
> [2] https://lists.llvm.org/pipermail/llvm-dev/2018-May/123010.html
> [3] https://reviews.llvm.org/D47106
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory