<div dir="ltr"><div>What Paul said :)</div><div><br></div><div>With a false negative, people will probably go, huh? But that's okay, as long as it is clear what to do. Documentation and good diagnostic messages should both help.</div><div><br></div><div>We should avoid false positives at all costs in tests. I've tried thinking hard and I can't see any problem in this regard, but I can't say I'd trust my "thinking hard" all that much!</div><div><br></div><div>James<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 15 Jun 2020 at 16:52, Robinson, Paul <<a href="mailto:paul.robinson@sony.com">paul.robinson@sony.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_-3158158831764201663WordSection1">
<p class="MsoNormal">Any kind of variable definition on a CHECK-NOT line would seem like it would be asking for trouble. Do we allow text variable definitions on a NOT?<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">False fails are better than false matches. Given that it will fail on a line where you’d expect a match, or possibly for the line to be skipped, it’s a matter of refining the match expression, which is something that you have to do sometimes
anyway. The two-level matching process (regex first, evaluation later) might be surprising to people, and I’d hope the diagnostic would give a hint in that direction.<u></u><u></u></p>
<p class="MsoNormal">--paulr<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div style="border-color:currentcolor currentcolor currentcolor blue;border-style:none none none solid;border-width:medium medium medium 1.5pt;padding:0in 0in 0in 4pt">
<div>
<div style="border-color:rgb(225,225,225) currentcolor currentcolor;border-style:solid none none;border-width:1pt medium medium;padding:3pt 0in 0in">
<p class="MsoNormal"><b>From:</b> Thomas Preud'homme <<a href="mailto:thomasp@graphcore.ai" target="_blank">thomasp@graphcore.ai</a>> <br>
<b>Sent:</b> Monday, June 15, 2020 10:59 AM<br>
<b>To:</b> Robinson, Paul <<a href="mailto:paul.robinson@sony.com" target="_blank">paul.robinson@sony.com</a>>; <a href="mailto:jh7370.2008@my.bristol.ac.uk" target="_blank">jh7370.2008@my.bristol.ac.uk</a>; '<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>' <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] FileCheck: using numeric variable defined on same line with caveats<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">Hi Paul,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">Thanks for your question. For some reason I was thinking of CHECK-DAG matching as trying line by line instead of looking for the first match from the start of the block. To answer the first question,
the first CHECK-DAG would fail to match altogether, since the regex would match 10 12, as you pointed out, which wouldn't satisfy the operation. I don't think we should skip and try matching again, as it is difficult in the general case (think about CHECK-DAG:
[[#NUMVAR:]]{{.*}}[[#NUMVAR+1]] and how to deal with the same input 10 12 13).<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">So my point is completely moot: for a valid input, either a DAG match is found and it's a legitimate match, or a match is not found and the failure will be on the line with the use of a variable
defined on the same line, which would not be too surprising. My apologies for the confusion.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">So my questions should thus be:<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<ul type="disc">
<li class="MsoNormal" style="color:black">
<span style="font-size:12pt">are we fine with false negatives (failing on valid input due to the regex engine not understanding numeric values)?<u></u><u></u></span></li><li class="MsoNormal" style="color:black">
<span style="font-size:12pt">can you think of any situation that would lead to a false positive (directive match on invalid input) besides CHECK-NOT?<u></u><u></u></span></li></ul>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">Best regards,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;color:black">Thomas<u></u><u></u></span></p>
</div>
</div>
<div class="MsoNormal" style="text-align:center" align="center">
<hr width="98%" size="2" align="center">
</div>
<div id="gmail-m_-3158158831764201663divRplyFwdMsg">
<p class="MsoNormal"><b><span style="color:black">From:</span></b><span style="color:black"> Robinson, Paul <<a href="mailto:paul.robinson@sony.com" target="_blank">paul.robinson@sony.com</a>><br>
<b>Sent:</b> 15 June 2020 15:33<br>
<b>To:</b> <a href="mailto:jh7370.2008@my.bristol.ac.uk" target="_blank">jh7370.2008@my.bristol.ac.uk</a> <<a href="mailto:jh7370.2008@my.bristol.ac.uk" target="_blank">jh7370.2008@my.bristol.ac.uk</a>>; Thomas Preud'homme <<a href="mailto:thomasp@graphcore.ai" target="_blank">thomasp@graphcore.ai</a>>;
'<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>' <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> RE: [llvm-dev] FileCheck: using numeric variable defined on same line with caveats</span>
<u></u><u></u></p>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal">Before addressing the CHECK-NOT case, I’m still unclear about the DAG case.<u></u><u></u></p>
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
<p class="gmail-m_-3158158831764201663xmsonormal">What should the first DAG line match? The regex matching would first attempt to match “10 12” but the expression evaluation would fail; so the DAG candidate wouldn’t match; does this mean the DAG matching does not continue searching,
and the test fails? Or would we restart the search…. where? With “0 12” (skipping only one character from the previous fail)? In that case it would ultimately match “12 13” from the first line. Or would it skip the entire previous candidate, and start
searching at “ 13”? In which case it would ultimately match “10 11” on the second line.<u></u><u></u></p>
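<p class="gmail-m_-3158158831764201663xmsonormal">The restart options asked about above can be compared with a small Python sketch. This is only an illustration, not FileCheck's implementation: it assumes the pattern lowers to the regex ([0-9]+) ([0-9]+), and the function name search_with_restart and its skip_whole_candidate flag are hypothetical.</p>
<pre>
```python
import re

# Assumed lowering of [[#VAR1:]] [[#VAR1+1]] to a plain regex.
PATTERN = re.compile(r"([0-9]+) ([0-9]+)")


def search_with_restart(text, skip_whole_candidate):
    """Retry after a failed numeric check, using one of two restart rules.

    skip_whole_candidate=False: resume one character after the failed
    candidate's start. skip_whole_candidate=True: resume at its end.
    """
    pos = 0
    while True:
        m = PATTERN.search(text, pos)
        if m is None:
            return None
        # Numeric check: second capture must equal first capture plus one.
        if int(m.group(2)) == int(m.group(1)) + 1:
            return m.group(0)
        pos = m.end() if skip_whole_candidate else m.start() + 1


# The DAG search range from the example input (between BEGIN and END).
TEXT = "10 12 13\nFOO 10 11\nFOOBAR\n"

print(search_with_restart(TEXT, skip_whole_candidate=False))  # 12 13
print(search_with_restart(TEXT, skip_whole_candidate=True))   # 10 11
```
</pre>
<p class="gmail-m_-3158158831764201663xmsonormal">Under these assumptions the two restart rules pick different text from the same input, which is the ambiguity being raised.</p>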
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
<p class="gmail-m_-3158158831764201663xmsonormal">In any case (if the first DAG ultimately matches something), the third DAG line would match the first previously unmatched text in the DAG search range, which would be either “10 “ or “10 12 13” from the first line, depending on the answer
to the previous paragraph.<u></u><u></u></p>
<p class="gmail-m_-3158158831764201663xmsonormal">--paulr<u></u><u></u></p>
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
<div style="border-color:currentcolor currentcolor currentcolor blue;border-style:none none none solid;border-width:medium medium medium 1.5pt;padding:0in 0in 0in 4pt">
<div>
<div style="border-color:rgb(225,225,225) currentcolor currentcolor;border-style:solid none none;border-width:1pt medium medium;padding:3pt 0in 0in">
<p class="gmail-m_-3158158831764201663xmsonormal"><b>From:</b> llvm-dev <<a href="mailto:llvm-dev-bounces@lists.llvm.org" target="_blank">llvm-dev-bounces@lists.llvm.org</a>>
<b>On Behalf Of </b>James Henderson via llvm-dev<br>
<b>Sent:</b> Monday, June 15, 2020 4:08 AM<br>
<b>To:</b> Thomas Preud'homme <<a href="mailto:thomasp@graphcore.ai" target="_blank">thomasp@graphcore.ai</a>><br>
<b>Cc:</b> llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] FileCheck: using numeric variable defined on same line with caveats<u></u><u></u></p>
</div>
</div>
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
<div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal">I think I already gave my opinion on one of the previous patches, regarding CHECK-NOT, which approximately came to the same conclusion as what you've got here, so +1 from me. I also think the CHECK-DAG example is not one to care about.
It seems to me that there's no guarantee what <span style="font-family:Consolas">
CHECK-DAG: [[LINE_AFTER_FOO:.*]]</span> would match, as, if I followed it correctly, CHECK-DAGs don't have any guarantee of order within a group, so it could match either the next line after BEGIN, the line after
<span style="font-family:Consolas">[[#VAR1:]] [[#VAR1+1]]</span> or indeed any line before END.<u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal">James<u></u><u></u></p>
</div>
</div>
<p class="gmail-m_-3158158831764201663xmsonormal"> <u></u><u></u></p>
<div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal">On Thu, 11 Jun 2020 at 12:29, Thomas Preud'homme via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-color:currentcolor currentcolor currentcolor rgb(204,204,204);border-style:none none none solid;border-width:medium medium medium 1pt;padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">Hi,</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">TL;DR: Is it OK to allow numeric variables to be used on the same line as they are defined, except in CHECK-NOT, and with the possibility of false negatives?</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">FileCheck does not currently allow a numeric variable to be used on the same line on which it is defined. I have a tentative patch to add that support, but it comes with caveats, so before going
through review I'd like to get consensus on whether those caveats are acceptable.</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">== The problem ==</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">The problem with matching variables defined on the same line is that the matching is done separately from checking the numeric relation, because a numeric relation cannot be expressed in a regex.
That is, when matching [[#VAR:]] [[#VAR+1]], FileCheck first matches the input against ([0-9]+) ([0-9]+) and then checks the values of the two captured integers.</span><u></u><u></u></p>
</div>
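<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">A minimal Python sketch of the two-level matching described above, for illustration only. It assumes the lowering to ([0-9]+) ([0-9]+) stated in this email and that only the first regex candidate is evaluated; the name two_phase_match is hypothetical and this is not FileCheck's actual code.</span></p>
<pre>
```python
import re

# Phase 1 pattern: the regex that [[#VAR:]] [[#VAR+1]] is assumed to become.
PATTERN = re.compile(r"([0-9]+) ([0-9]+)")


def two_phase_match(text):
    """Return the matched text, or None if either phase fails.

    Only the first regex candidate is considered; there is no retry
    after a failed numeric check.
    """
    m = PATTERN.search(text)
    if m is None:
        return None
    var = int(m.group(1))
    # Phase 2: evaluate the numeric relation VAR+1 on the captured values.
    if int(m.group(2)) != var + 1:
        return None
    return m.group(0)


print(two_phase_match("10 11"))     # 10 11
print(two_phase_match("10 12 13"))  # None
```
</pre>
</div>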
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">This can lead to outcomes that are at times confusing or downright wrong. Consider the following input with the CHECK pattern mentioned above:</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">10 12 13</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">The regex would match numbers 10 and 12 and fail the CHECK directive despite 12 and 13 satisfying the +1 relation. This could happen as a result of a change in the input after a new commit has
landed. In the case of a CHECK directive, it would make the test regress and a developer would need to tighten the pattern somehow, for instance by changing it to [[#VAR:]] [[#VAR+1]]{{$}}. Now in the context of a CHECK-NOT, this could be a change from input
10 12 14 to 10 12 13: the pattern would still fail to match, and thus the test would still pass despite the compiler having regressed.</span><u></u><u></u></p>
</div>
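<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">The two outcomes above can be sketched in Python under the same assumptions (first regex candidate only, no retry). The helper names are hypothetical, and the trailing {{$}} is modelled here with a plain regex $ on a single input line.</span></p>
<pre>
```python
import re

LOOSE = r"([0-9]+) ([0-9]+)"    # models [[#VAR:]] [[#VAR+1]]
TIGHT = r"([0-9]+) ([0-9]+)$"   # models [[#VAR:]] [[#VAR+1]]{{$}}


def check_matches(pattern, line):
    """Regex match first, then the +1 check on the captured values."""
    m = re.search(pattern, line)
    return m is not None and int(m.group(2)) == int(m.group(1)) + 1


def check_not_passes(pattern, line):
    """A CHECK-NOT style directive passes when the pattern does not match."""
    return not check_matches(pattern, line)


# CHECK: the loose pattern is a false negative, the tightened one matches.
print(check_matches(LOOSE, "10 12 13"))     # False
print(check_matches(TIGHT, "10 12 13"))     # True

# CHECK-NOT: still passes after the input changes to contain 12 13.
print(check_not_passes(LOOSE, "10 12 14"))  # True
print(check_not_passes(LOOSE, "10 12 13"))  # True
```
</pre>
</div>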
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">== Proposed "solution" ==</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">Given the above, we can summarize the risks of supporting numeric expressions that use a variable defined on the same line as:</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<ul style="margin-top:0in" type="disc">
<li class="gmail-m_-3158158831764201663xmsonormal" style="color:black"><span style="font-size:12pt">test regression on positive matching directives (CHECK, CHECK-NEXT, ...)</span><u></u><u></u></li><li class="gmail-m_-3158158831764201663xmsonormal" style="color:black"><span style="font-size:12pt">silent compiler regression on negative matching directives (CHECK-NOT)</span><u></u><u></u></li></ul>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">I am therefore proposing to prevent using numeric variables defined on the same line for negative matching directives but allow it for positive matching directives with a note in the documentation
to be careful to make the pattern as tight as possible.</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">== CHECK-DAG case ==</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">CHECK-DAG is interesting because, despite being a positive matching directive, it carries a risk if a test relies on the way CHECK-DAG is implemented. Consider the following directives,
which rely on each directive being matched in order:</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK: BEGIN</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK-DAG: [[#VAR1:]] [[#VAR1+1]]</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK-DAG: FOO</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK-DAG: [[LINE_AFTER_FOO:.*]]</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK: END</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">CHECK-NOT: [[LINE_AFTER_FOO]] BAZ</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">This could be written if the line checked by the first CHECK-DAG is guaranteed to always be either before FOO or after the line after FOO. Now consider the following input that verifies this invariant:</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">BEGIN</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">10 12 13</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">FOO 10 11</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">FOOBAR</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">END</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;font-family:Consolas;color:black">10 12 13 FOOBAR BAZ</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">The expectation from the test author relying on the CHECK-DAG behavior would be for LINE_AFTER_FOO to have the value FOOBAR once the CHECK-DAG block has matched. However, due to the caveats mentioned
above, it would end up being set to "10 12 13" and thus the CHECK-NOT would pass because "10 12 13" is not followed by "BAZ". That's far-fetched though; I'm not convinced we should worry about this beyond documenting CHECK-DAG as being able to match in any
order.</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">Thoughts?</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">Best regards,</span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black"> </span><u></u><u></u></p>
</div>
<div>
<p class="gmail-m_-3158158831764201663xmsonormal"><span style="font-size:12pt;color:black">Thomas</span><u></u><u></u></p>
</div>
</div>
</div>
<p class="gmail-m_-3158158831764201663xmsonormal">_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="https://urldefense.com/v3/__https:/lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev__;!!JmoZiZGBv3RvKRSx!u8dXPx858KRkn3NJFFUKY46ZVBaBOz9jKGaTk7iC6v9IhpabzCjCnB1FRnf7DQ0Bbw$" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><u></u><u></u></p>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div>