[llvm-dev] [RFC] Formalizing FileCheck Features

Joel E. Denny via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 19 13:57:04 PDT 2018


Hi Paul,

On Thu, Jun 14, 2018 at 4:29 PM, <paul.robinson at sony.com> wrote:
>
> Speaking of wish lists, I've been thinking it would be nice to have some
> way to apply a NOT pattern among a range of matches:
>
>
>
> CHECK-NOT-PUSH: pattern
>
>
>
> Well, there is the `--implicit-check-not` option, which applies to the
> entire input text; it looks like you want it just for a subrange, though?
>

Right.


>   If you aren't talking about DAGs, then repeating a CHECK-NOT between the
> other directives would work although it's pretty tedious (voice of
> experience) and easy to mess up (voice of experience).
>

Agreed.


> If you have an example where CHECK-DAG-NOT would actually be useful,
>

Yes, but I'd prefer a more general construct that also works without DAG.
That's why I suggested CHECK-NOT-PUSH and POP.  Jessica Paquette described
a use case that I thought suggested she could benefit from that too, but
it's possible I misunderstood her:

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123092.html


> the formalism I'm going for does seem like it would help.
>

Seems to help with either approach.

Thanks.

Joel


>
>
> --paulr
>
>
>
> *From:* Joel E. Denny [mailto:jdenny.ornl at gmail.com]
> *Sent:* Saturday, May 26, 2018 12:11 PM
> *To:* Robinson, Paul
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [RFC] Formalizing FileCheck Features
>
>
>
> Hi Paul,
>
>
>
> On Fri, May 25, 2018 at 10:40 AM, <paul.robinson at sony.com> wrote:
>
> > Should it be possible for CHECK-SAME match range to include newlines?
>
> It is possible to write a regex that matches newlines.  Doing that in
> CHECK-SAME seems a bit odd but I don't think it's worth trying to forbid
> it.
>
>
>
> OK, so SAME has the sense of matching *starting* on the same line rather
> than *within* the same line.  Seems fine.
>
>
>
> > I'd note that, in the case of CHECK-NEXT, that choice can restrict what
> > CHECK-NEXT can match.  That is, it will complain about a match on the
> > previous line rather than skip it and look on the next line.
>
> Ah, so we could define CHECK-NEXT as: move the start of the search
> range past the first newline, then behave as CHECK-SAME?
>
>
>
> Right.
>
>
>
> But, appending {{.*$}} to the previous pattern should have the same
> effect if you have a CHECK-NEXT that runs into that problem.
>
>
>
> So the current behavior is more flexible even if less intuitive at first
> glance (to me, at least).  It's also more consistent with the way search
> ranges work in general.
>
>
>
> I think this subtlety and this tip should be mentioned in the user
> documentation. Also, because sometimes the previous directive isn't nearby
> or could be one of many directives due to multiple check prefixes, the docs
> should also offer this formula:
>
>
>
> CHECK-SAME: {{.*}}
>
> CHECK-NEXT: your pattern
>
>
>
> And I do think it's valuable for SAME and NEXT to tell you they found
> matches but not on the line you asked for. So I'd prefer to leave these
> defined as they are.
>
>
>
> Agreed.
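
A minimal sketch of the CHECK-NEXT subtlety discussed above, under stated assumptions: this is a toy Python model and not FileCheck's implementation. Patterns are plain Python regexes rather than FileCheck pattern syntax, and the helper names `check` and `check_next` are hypothetical. It shows that the search range starts right after the previous match, so a match on the same line is reported as an error rather than skipped, and that extending the previous match to the end of its line (the {{.*$}} tip) avoids the problem.

```python
import re

def check(text, pattern, start):
    """Toy plain CHECK: first match at or after `start`, as (match_start, match_end)."""
    m = re.compile(pattern, re.MULTILINE).search(text, start)
    if m is None:
        raise AssertionError(f"no match for {pattern!r}")
    return m.span()

def check_next(text, pattern, prev_end):
    """Toy CHECK-NEXT: search from the end of the previous match, then
    complain unless exactly one newline separates the two matches."""
    start, end = check(text, pattern, prev_end)
    newlines = text.count("\n", prev_end, start)
    if newlines != 1:
        raise AssertionError(
            f"match found {newlines} newline(s) after the previous match, expected 1")
    return start, end

text = "foo bar\nbar\n"

# Previous directive matches only "foo", so "bar" is first found on the
# same line and CHECK-NEXT fails instead of moving on to the next line.
_, foo_end = check(text, r"foo", 0)
try:
    check_next(text, r"bar", foo_end)
    naive_ok = True
except AssertionError:
    naive_ok = False

# Tip from the thread: let the previous pattern consume the rest of its line.
_, line_end = check(text, r"foo.*$", 0)
tip_span = check_next(text, r"bar", line_end)
```

In this model the first attempt is rejected and the second succeeds on the second line, which is the behavior the tip relies on.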
>
>
>
> >> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> >> is not executed immediately; instead the next non-NOT directive is
> >> executed first, and the start of that directive's match range becomes
> >> the end of the NOT Group's search range.
> >
> > Based on the following, that wording is not quite right when a DAG
> > group follows, so there should probably be some note about that here.
>
> So, "the next non-NOT directive or DAG group is executed ... the start
> of that directive or group's match range ..." ?
>
>
>
> Sounds good.
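
The deferred execution of a NOT group described above can be sketched as follows. This is a toy model under assumptions, not FileCheck's code: patterns are Python regexes, the following directive is a plain CHECK, and `run_not_group` is a hypothetical helper name.

```python
import re

def run_not_group(text, not_patterns, next_pattern, start):
    """Toy NOT group: run the following directive first, then scan the
    text between `start` and the start of that directive's match."""
    nxt = re.compile(next_pattern).search(text, start)
    if nxt is None:
        raise AssertionError(f"no match for {next_pattern!r}")
    range_end = nxt.start()  # end of the NOT group's search range
    for pat in not_patterns:
        hit = re.compile(pat).search(text, start, range_end)
        if hit is not None:
            raise AssertionError(f"excluded pattern {pat!r} found at offset {hit.start()}")
    return nxt.end()  # start of the search range for later directives

text = "begin\nwarning: x\nend\nerror: y\n"

# "error" only occurs after "end", so it is outside the NOT range.
after_end = run_not_group(text, [r"error"], r"end", 0)

# "warning" occurs before "end", so the NOT directive fails.
try:
    run_not_group(text, [r"warning"], r"end", 0)
    warning_rejected = False
except AssertionError:
    warning_rejected = True
```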
>
>
>
> >>  (If the next directive is
> >> LABEL, it has already executed and has a match range, which is already
> >> the end of the search range.)  After the NOT Group's search range is
> >> defined, each NOT directive scans the range for a match, and fails if
> >> a match is found.
> >>
> >> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
> >> is not executed immediately; instead the next non-DAG directive is
> >> executed first, and the start of that directive's match range becomes
> >> the end of the DAG Group's search range.
> >
> > That's definitely a change from the current behavior.  Currently, the
> > DAG group finds its own end based on the farthest match.
>
> Oh good catch.  Copy-thinko from the NOT description.  NOT is the only
> kind of directive that has deferred execution.
>
> >>  If the next directive is
> >> CHECK-NOT, the end of the DAG Group's search range is
> >> unaffected.
> >
> > Unaffected means that it's as if there's no following directive?  So
> > next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's
> > a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?
>
> If DAG doesn't have deferred execution then the end of the search range
> is the next (explicit or implicit) CHECK-LABEL point, end of story.
>
>
>
> >>  After all DAG directives run, the
> >> match range for the entire DAG Group extends from the start of the
> >> earliest match to the end of the latest match.  The end of that match
> >> range becomes the start of the search range for subsequent directives.
> >
> > That last sentence contradicts the first few sentences: the subsequent
> > directive has already been matched.
>
> Right, fixing the previous bug means this sentence says the right thing.
>
>
>
> Yep, I agree it's fixed.
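
A small sketch of the corrected DAG description above, assuming a toy model: each DAG directive runs immediately over the same search range, and the group's match range spans the earliest match start to the latest match end. Patterns are Python regexes, `run_dag_group` is a hypothetical name, the `end` parameter stands in for the next CHECK-LABEL point, and overlap between DAG matches is not modeled.

```python
import re

def run_dag_group(text, patterns, start, end=None):
    """Toy DAG group: match every pattern anywhere in [start, end) and
    return the group's match range (earliest start, latest end)."""
    end = len(text) if end is None else end
    spans = []
    for pat in patterns:
        m = re.compile(pat).search(text, start, end)
        if m is None:
            raise AssertionError(f"no match for {pat!r}")
        spans.append(m.span())
    return min(s for s, _ in spans), max(e for _, e in spans)

text = "c=3 a=1 b=2 done"

# The directives are written in a different order than the text.
group_span = run_dag_group(text, [r"a=\d", r"b=\d", r"c=\d"], 0)

# The next directive searches from the end of the group's match range.
done = re.compile(r"done").search(text, group_span[1])
```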
>
>
>
>
> > One point not addressed here is the start of the DAG group's search
> > range.  Currently, if the DAG group is preceded by a NOT group
> > preceded by a DAG group, the last DAG group's search range starts at
> > the start of the first DAG group's match range.  Any match in the
> > first DAG group's match range produces a reordering error.  This is
> > somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned
> > earlier: the search ranges permit invalid match ranges and then
> > complain about them in an effort to diagnose mistakes.  However, that
> > restricts what can be matched.
> >
> > I'm not claiming that either behavior is best.  It's not clear to me.
> > The best use of DAG-NOT-DAG is very confusing to me.  An effort to
> > prescribe the right semantics to it needs to be informed by real use
> > cases, in my opinion.
>
> I did some email archaeology, and found this exchange on llvm-dev between
> myself and Michael Liao (original DAG implementor) 13 Mar 2016:
>
> pr> Commentary in FileCheck itself can easily be interpreted to mean the
> pr> intent was that -NOT would scan the region between the points defined
> pr> by the last match of the preceding DAG group (which the code gets
> pr> right) and the first match of the following DAG group (which the code
> pr> does not get right). But the commentary is not really that clear.
>
> ml> That's the intention of the original design. CHECK-NOT never occurs
> ml> before we find the start point (the start of file by default) and end
> ml> point (the end of file by default.) All other points are through other
> ml> CHECKs, including CHECK-DAG but excluding CHECK-NOT.  So that, if you
> ml> use CHECK-NOT, you need to be aware of how that range is defined. As
> ml> CHECK-DAG pattern matches a group of patterns in any order, the match
> ml> point of that group of CHECK-DAG (consecutive CHECK-DAGs without any
> ml> other CHECKs interleaved) is always the point where one of that group
> ml> is matched. If one CHECK-DAG is separated by any other CHECKs
> ml> (including CHECK-NOT) from preceding CHECK-DAGs, it is not in the
> ml> preceding group of CHECK-DAG. That's how we could check the order
> ml> where a group of patterns should never occur before another group of
> ml> patterns.
>
>
>
> Thanks for digging that up.
>
>
>
> So, I believe my specification for the interaction between DAG and NOT
> does match the original intent.
>
>
>
> I can't argue there.
>
>
>
>   Regarding the diagnostic aid, it does
> make some sequences really hard to match,
>
>
>
> Theoretically, I agree.  But do you know of a real use case where it's a
> problem?
>
>
>
> and I don't have a general idea how to fix that (versus {{.*$}} for the
> similar NEXT situation).
>
>
>
> Me neither.
>
>
>
> It's also a reasonable continuation of the behavior of plain CHECK, in
> that a second CHECK doesn't search the prior text to complain about
> ordering issues.
>
>
>
> Good point.
>
>
>
> The main difference I see is that DAG is specifically about unordered text
> (and it might vary from run to run in the parallel programs I'm thinking
> of), so the chances of accidental reordering might be higher than with
> plain CHECK.
>
>
>
>
> SAME and NEXT are, I think, a different category; that has to do with
> line-breaks that are not explicitly described by user-written patterns,
> and my own experience is that it's helpful to be told that something
> matches but isn't on the line I expected.
>
>
>
> Agreed.
>
>
>
>
> So, I don't have a definitive answer for changing DAG-NOT-DAG, but
> intuitively the spec makes sense to me and my inclination is to think
> the diagnostic isn't hugely valuable.
>
>
>
> You might be right. Again, I find it hard to think of solid arguments
> about DAG-NOT-DAG because it seems like such an unlikely use case.
>
>
>
> You mentioned Chris Lattner's point.  DAG-NOT-DAG was the first thing that
> came to my mind.
>
>
>
> DAG-NOT-DAG is a weird case where (1) you want two or more consecutive but
> non-overlapping DAG groups, and (2) you want to exclude certain patterns in
> between.  Strangely, with existing directives, you cannot accomplish #1
> without #2, right?  Why do those go together?  It feels like a use case
> that arose from an accident in a language specification and not from a real
> need.
>
>
>
> Well, maybe the best approach is just to go with a clear specification (as
> you have now) and hope for the best.
>
>
>
>
> >> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
> >> behavior, but it's unlikely to be useful.
> >
> > I believe they had predictable behavior before (their search ranges
> > started at the end of the match range for the entire CHECK-DAG), but
> > it's different with the above description (they define the end of the
> > search range for the preceding CHECK-DAG group).
>
> You're right, it was predictable before, and I am fixing the bug where
> the directive after DAG gets executed first so the range isn't affected.
>
>
>
> Makes sense, so your specification keeps the old behavior.
>
>
>
> Taking Chris Lattner's point into consideration, we might want to say
> SAME or NEXT after a DAG should be an error.  But we could also leave
> that for a later round.
>
>
>
> With your specification, I think the meaning of those cases is clear and
> potentially useful.  The only potential problem I see is that people who
> haven't studied your specification carefully might think SAME and NEXT
> constrain the end of the search range of the DAG group.  It might be
> worthwhile to emphasize in the docs that, no, really, DAG does not work
> that way.
>
>
>
> Actually, I wish there were a way to do that for the sake of matching
> unordered text on a single line.  SAME after DAGs is as close as I can get
> to that.  Maybe we need a CHECK-DAG-SAME.
>
>
>
> Speaking of wish lists, I've been thinking it would be nice to have some
> way to apply a NOT pattern among a range of matches:
>
>
>
> CHECK-NOT-PUSH: pattern
>
> ...
>
> CHECK-NOT-POP:
>
>
>
> For example, with a pattern of {{.}} and DAGs in between PUSH and POP, I
> can check for an unordered set of strings while rejecting any other text
> among them. (Now that's a use case for DAG plus NOT that seems very clear
> to me.)
>
>
>
> Like normal NOT, PUSH's action would be deferred until the next directive
> or group.  At that point, it would push the specified NOT pattern along
> with the next non-NOT directive's match range end as its search range
> start. POP would pop and apply those using the previous non-NOT directive's
> match range start as its search range end.  The Rule would apply to its
> matches.  PUSH and POP would be like normal NOT in terms of their effect on
> neighboring directives: each would terminate any preceding DAG group, and,
> because there's no match in a successful run, each would have no effect on
> any neighboring directive's search range.  PUSH and POP with no
> directives in between other than those in the NOT family would be an error.
>
>
>
> Your formal specification of FileCheck makes it straightforward to
> describe this behavior precisely.
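
One possible reading of the PUSH/POP proposal above, sketched as a toy model. The assumptions are loud ones: CHECK-NOT-PUSH and CHECK-NOT-POP are only proposed directives, `not_push_pop` is a hypothetical helper, patterns are Python regexes, the scanned range is taken to run from the end of the earliest enclosed match to the start of the latest one, and text covered by an enclosed match is treated as exempt.

```python
import re

def not_push_pop(text, not_pattern, spans):
    """Toy PUSH/POP: scan only the gaps between the enclosed directives'
    match ranges and fail if the pushed NOT pattern matches in a gap."""
    if not spans:
        raise ValueError("PUSH and POP with no directives in between")
    ordered = sorted(spans)
    regex = re.compile(not_pattern)
    for (_, gap_start), (gap_end, _) in zip(ordered, ordered[1:]):
        hit = regex.search(text, gap_start, gap_end)
        if hit is not None:
            raise AssertionError(f"excluded text at offset {hit.start()}")

def spans_of(text, patterns):
    """Match each enclosed pattern anywhere in the text (DAG-like)."""
    return [re.compile(p).search(text).span() for p in patterns]

# Only the expected strings appear, so "." finds nothing in the gaps
# (the gaps hold newlines, which "." does not match).
clean = "a=1\nb=2\nc=3\n"
not_push_pop(clean, r".", spans_of(clean, [r"b=\d", r"a=\d", r"c=\d"]))
clean_ok = True

# Extra text between two expected strings is rejected.
noisy = "a=1\nx=9\nb=2\n"
try:
    not_push_pop(noisy, r".", spans_of(noisy, [r"a=\d", r"b=\d"]))
    noisy_rejected = False
except AssertionError:
    noisy_rejected = True
```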
>
>
>
>
> --paulr
>
> P.S. I am away next week but expect to keep an eye on the lists.
>
>
>
> Sure.  Have fun.  No rush.
>
>
>
> Thanks.
>
>
>
> Joel
>
>
>