[llvm-dev] FileCheck: using numeric variable defined on same line with caveats

Joel E. Denny via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 18 09:34:34 PDT 2020


So [[#VAR+1]] has two possible behaviors:

1. It matches the next integer, and then FileCheck fails if that integer is
not VAR+1.  This is the behavior if VAR is defined in the same directive.
2. It matches the next integer that is VAR+1, but FileCheck fails if there
is no such integer.  This is the behavior if VAR is defined in a prior
directive.

Is all that correct?

Imagine if someone has learned or assumed behavior #1, perhaps because he's
so far only written numeric variable expressions where VAR is defined in
the same directive.  Then let's say he writes a directive where VAR is
defined in a prior directive:

CHECK: [[#VAR:]]
CHECK: [[#VAR+1]]
Input: 10 11

He's used to behavior #1, so he won't expect that there will be a false
pass when the input evolves to "10 9 11" because the behavior is actually
#2 in this case.
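
To make the difference concrete, here is a rough Python model of the two
behaviors applied to that input. It is only a sketch under my own
assumptions: the helper names (define_var, behavior_1, behavior_2) are made
up for illustration, and this is not how FileCheck is actually implemented.

```python
import re

INT = re.compile(r"[0-9]+")

def define_var(text):
    """Model `CHECK: [[#VAR:]]`: capture the first integer, return (value, rest)."""
    m = INT.search(text)
    if m is None:
        raise ValueError("no integer to define VAR from")
    return int(m.group()), text[m.end():]

def behavior_1(rest, var):
    """Match the next integer, then fail unless it equals var + 1."""
    m = INT.search(rest)
    return m is not None and int(m.group()) == var + 1

def behavior_2(rest, var):
    """Search for any integer equal to var + 1; fail only if none exists."""
    return any(int(tok) == var + 1 for tok in INT.findall(rest))

for text in ("10 11", "10 9 11"):
    var, rest = define_var(text)
    print(text, behavior_1(rest, var), behavior_2(rest, var))
# prints:
# 10 11 True True
# 10 9 11 False True
```

The second line of output is the false pass: behavior #2 skips over the 9
and still finds 11.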

Consider instead having consistent behavior for all occurrences of
[[#VAR+1]].  Perhaps the decision will be not to permit the variable to be
defined in the same directive because it's too hard to implement behavior
#2 in that case.  I believe it is still possible to achieve either behavior
with separate directives.  For example, to convert my directives above from
behavior #2 to behavior #1:

CHECK: [[#VAR1:]]
CHECK-NOT: [[#]]
CHECK: [[#VAR1+1]]
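
As a sanity check on that conversion, here is a small Python model of the
three directives. Again, this is a hypothetical sketch (check_sequence is
an invented name, and the real matcher works differently); it only shows
that forbidding any integer between the two matches rejects "10 9 11".

```python
import re

INT = re.compile(r"[0-9]+")

def check_sequence(text):
    """Model CHECK: [[#VAR1:]] / CHECK-NOT: [[#]] / CHECK: [[#VAR1+1]]."""
    first = INT.search(text)
    if first is None:
        return False          # nothing to define VAR1 from
    var1 = int(first.group())

    # Behavior #2: search forward for an integer equal to VAR1 + 1.
    target = None
    for m in INT.finditer(text, first.end()):
        if int(m.group()) == var1 + 1:
            target = m
            break
    if target is None:
        return False          # no VAR1 + 1 anywhere after the definition

    # CHECK-NOT: [[#]] must not match between the two positive matches.
    between = text[first.end():target.start()]
    return INT.search(between) is None

print(check_sequence("10 11"))    # True
print(check_sequence("10 9 11"))  # False
```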

Joel

On Tue, Jun 16, 2020 at 3:56 AM James Henderson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> What Paul said :)
>
> With a false negative, people will probably go, huh? But that's okay, as
> long as it is clear what to do. Documentation and good diagnostic messages
> should both help.
>
> We should avoid false positives at all costs in tests. I've tried thinking
> hard and I can't see any problem in this regard, but I can't say I'd trust
> my "thinking hard" all that much!
>
> James
>
> On Mon, 15 Jun 2020 at 16:52, Robinson, Paul <paul.robinson at sony.com>
> wrote:
>
>> Any kind of variable definition on a CHECK-NOT line would seem like it
>> would be asking for trouble.  Do we allow text variable definitions on a
>> NOT?
>>
>>
>>
>> False fails are better than false matches.  Given that it will fail on a
>> line where you’d expect a match, or possibly for the line to be skipped,
>> it’s a matter of refining the match expression, which is something that you
>> have to do sometimes anyway.  The two-level matching process (regex first,
>> evaluation later) might be surprising to people, and I’d hope the
>> diagnostic would give a hint in that direction.
>>
>> --paulr
>>
>>
>>
>> *From:* Thomas Preud'homme <thomasp at graphcore.ai>
>> *Sent:* Monday, June 15, 2020 10:59 AM
>> *To:* Robinson, Paul <paul.robinson at sony.com>;
>> jh7370.2008 at my.bristol.ac.uk; 'llvm-dev at lists.llvm.org' <
>> llvm-dev at lists.llvm.org>
>> *Subject:* Re: [llvm-dev] FileCheck: using numeric variable defined on
>> same line with caveats
>>
>>
>>
>> Hi Paul,
>>
>>
>>
>> Thanks for your question. For some reason I was thinking of CHECK-DAG
>> matching as trying line by line instead of looking for the first match from
>> the start of the block. To answer the first question, the first CHECK-DAG
>> would fail to match altogether, since the regex would match 10 12, as you
>> pointed out, which wouldn't satisfy the operation. I don't think we should
>> skip and try matching again, as it is difficult in the general case (think
>> about CHECK-DAG: [[#NUMVAR:]]{{.*}}[[#NUMVAR+1]] and how to deal with the
>> same input 10 12 13).
>>
>>
>>
>> So my point is completely moot, for a valid input either a DAG match is
>> found and it's a legitimate match, or a match is not found and the failure
>> will be on the line with the use of a variable defined on the same line
>> which would not be too surprising. My apologies for the confusion.
>>
>>
>>
>> So my questions should thus be:
>>
>>
>>
>>    - are we fine with false negatives (failing on valid input due to the
>>    regex engine not understanding numeric values)
>>    - can you think of any situation that would lead to a false positive
>>    (directive match on invalid input) besides CHECK-NOT?
>>
>>
>>
>> Best regards,
>>
>>
>>
>> Thomas
>> ------------------------------
>>
>> *From:* Robinson, Paul <paul.robinson at sony.com>
>> *Sent:* 15 June 2020 15:33
>> *To:* jh7370.2008 at my.bristol.ac.uk <jh7370.2008 at my.bristol.ac.uk>;
>> Thomas Preud'homme <thomasp at graphcore.ai>; 'llvm-dev at lists.llvm.org' <
>> llvm-dev at lists.llvm.org>
>> *Subject:* RE: [llvm-dev] FileCheck: using numeric variable defined on
>> same line with caveats
>>
>>
>>
>> Before addressing the CHECK-NOT case, I’m still unclear about the DAG
>> case.
>>
>>
>>
>> What should the first DAG line match?  The regex matching would first
>> attempt to match “10 12” but the expression evaluation would fail; so the
>> DAG candidate wouldn’t match; does this mean the DAG matching does not
>> continue searching, and the test fails?  Or would we restart the search….
>> where?  With “0 12” (skipping only one character from the previous fail)?
>> In that case it would ultimately match “12 13” from the first line.  Or
>> would it skip the entire previous candidate, and start searching at “ 13”?
>> In which case it would ultimately match “10 11” on the second line.
>>
>>
>>
>> In any case (if the first DAG ultimately matches something), the third
>> DAG line would match the first previously unmatched text in the DAG search
>> range, which would be either “10 “ or “10 12 13” from the first line,
>> depending on the answer to the previous paragraph.
>>
>> --paulr
>>
>>
>>
>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *James
>> Henderson via llvm-dev
>> *Sent:* Monday, June 15, 2020 4:08 AM
>> *To:* Thomas Preud'homme <thomasp at graphcore.ai>
>> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
>> *Subject:* Re: [llvm-dev] FileCheck: using numeric variable defined on
>> same line with caveats
>>
>>
>>
>> I think I already gave my opinion on one of the previous patches,
>> regarding CHECK-NOT, which approximately came to the same conclusion as
>> what you've got here, so +1 from me. I also think the CHECK-DAG example is
>> not one to care about. It seems to me that there's no guarantee what CHECK-DAG:
>> [[LINE_AFTER_FOO:.*]] would match, as, if I followed it correctly,
>> CHECK-DAGs don't have any guarantee of order within a group, so it could
>> match either the next line after BEGIN, the line after  [[#VAR1:]]
>> [[#VAR1+1]] or indeed any line before END.
>>
>>
>>
>> James
>>
>>
>>
>> On Thu, 11 Jun 2020 at 12:29, Thomas Preud'homme via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>>
>>
>> TL;DR: Is it OK to allow numeric variables to be used on the same line
>> they are defined on, except in CHECK-NOT, and at the cost of false negatives?
>>
>>
>>
>> FileCheck does not currently allow a numeric variable to be used on the
>> same line it is defined on. I have a tentative patch to add that support,
>> but it comes with caveats, so before going through review I'd like to get
>> consensus on whether those caveats are acceptable.
>>
>>
>>
>> == The problem ==
>>
>>
>>
>> The problem with matching variables defined on the same line is that the
>> matching is done separately from checking the numeric relation, because a
>> numeric relation cannot be expressed in a regex. That is, when matching
>> [[#VAR:]] [[#VAR+1]], FileCheck first matches the input against ([0-9]+)
>> ([0-9]+) and then checks the values of the two captured integers.
>>
>>
>>
>> This can lead to confusing or downright wrong outcomes. Consider the
>> following input with the CHECK pattern mentioned above:
>>
>>
>>
>> 10 12 13
>>
>>
>>
>> The regex would match numbers 10 and 12 and fail the CHECK directive
>> despite 12 and 13 satisfying the +1 relation. This could happen as a result
>> of a change in the input after a new commit has landed. In the case of a
>> CHECK directive, it would make the test regress and a developer would need
>> to tighten the pattern somehow, for instance by changing it to [[#VAR:]]
>> [[#VAR+1]]{{$}}. Now in the context of a CHECK-NOT, this could be a change
>> from input 10 12 14 to 10 12 13: the pattern would still fail to match,
>> and thus the test would still pass despite the compiler having regressed.
>>
>>
>>
>> == Proposed "solution" ==
>>
>>
>>
>> Given the above, we can summarize the risks of supporting numeric
>> expressions that use a variable defined on the same line as:
>>
>>
>>
>>    - test regression on positive matching directives (CHECK, CHECK-NEXT,
>>    ...)
>>    - silent compiler regression on negative matching directives
>>    (CHECK-NOT)
>>
>> I am therefore proposing to prevent using numeric variables defined on
>> the same line for negative matching directives but allow it for positive
>> matching directives with a note in the documentation to be careful to make
>> the pattern as tight as possible.
>>
>>
>>
>> == CHECK-DAG case ==
>>
>>
>>
>> CHECK-DAG is interesting because, despite it being a positive matching
>> directive, there's a risk if a test relies on the way CHECK-DAG is
>> implemented. Consider the following directives, which rely on each
>> directive being matched in order:
>>
>>
>>
>> CHECK: BEGIN
>>
>> CHECK-DAG: [[#VAR1:]] [[#VAR1+1]]
>>
>> CHECK-DAG: FOO
>>
>> CHECK-DAG: [[LINE_AFTER_FOO:.*]]
>>
>> CHECK: END
>>
>> CHECK-NOT: [[LINE_AFTER_FOO]] BAZ
>>
>>
>>
>> This could be written if the line checked by the first CHECK-DAG is
>> guaranteed to always be either before FOO or after the line after FOO. Now
>> consider the following input that verifies this invariant:
>>
>>
>>
>> BEGIN
>>
>> 10 12 13
>>
>> FOO 10 11
>>
>> FOOBAR
>>
>> END
>>
>> 10 12 13 FOOBAR BAZ
>>
>>
>>
>> The expectation of a test author relying on the CHECK-DAG behavior
>> would be for LINE_AFTER_FOO to have the value FOOBAR once the CHECK-DAG
>> block has matched. However, due to the caveats mentioned above, it would
>> end up being set to "10 12 13", and thus the CHECK-NOT would pass because
>> "10 12 13" is not followed by "BAZ". That's far-fetched though; I'm not
>> convinced we should worry about this beyond documenting CHECK-DAG as being
>> able to match in any order.
>>
>>
>>
>>
>>
>> Thoughts?
>>
>>
>>
>> Best regards,
>>
>>
>>
>> Thomas
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>>