[PATCH] D53710: [FileCheck] Warn if a prefix is only used in LABEL checks

Thu Nov 8 08:11:18 PST 2018

jdenny added a comment.

In https://reviews.llvm.org/D53710#1291486, @SjoerdMeijer wrote:

> Hi Paul and Joel, thanks for your help here, appreciate it! I thought this was
>  a nice little addition, low-hanging fruit, but it looks like it turns into
>  proper project! :-)

No problem.  However all this turns out, thank you for contributing!

> Also from our discussions, it looks like we have not just
>  one but a few problems. If I try to summarise them:

Thanks for that summary, which helps me to better see where we stand.

> 1. FileCheck: it is lacking diagnostics, not revealing all sorts of problems in tests. This is a very serious issue, because with this patch it took no time to find a long list of test issue, severe and minor, see next point.
> 2. Test issues: perhaps we can classify tests in this way, we have:
>   1. Broken tests: they don't do or test anything because the non-LABEL prefixes used in tests are "undefined": they do not occur on the FileCheck and --check-prefix command line. This can easily happen by making a typo, or refactoring of the tests, etc.

A nit on your wording: The "broken tests" do test something: the LABELs.  They just don't test everything they were intended to.

> 1. Dodgy tests: looks like we have "compile-only tests". They do check the given --check-prefix, but only the CHECK-LABEL variant and not the non-LABEL prefixes.

I'm not sure what you mean by "compile-only" here.  I think you're saying that "dodgy tests" are like "broken tests" with one exception: the author is //intentionally// testing only LABEL directives, and that's a dodgy use case because plain CHECK directives would work.

> 1. Reasonable tests: multiple RUN lines, each with multiple --check-prefixes: one prefix is used just for LABELs ("shared label prefix"), and other prefixes to test variations within each test functions.
> 
> I think the objective is to produce a diagnostic for the broken and dodgy
>  tests, but (obviously) don't produce false positives for the reasonable tests.

That was my interpretation as well.  But it's the "dodgy tests" where it seems we're still not quite connecting.  See below.

> 
> 
>>> In https://reviews.llvm.org/D53710#1290508, @probinson wrote:
>>>  This seems like a reasonable use-case to me.
>> 
>> Agreed, but this test case shouldn't produce the warning because there's always an active prefix that has a non-LABEL directive. Or did I miss something?
> 
> I think/hope we were talking about the same things. Paul's previous example is a
>  reasonable use-case, and we shouldn't produce a warning. A more minimal example, a
>  test that should be added to this patch, is this one:
> 
>   // RUN: FileCheck -vv -check-prefix=SHARED -check-prefix=FOO -input-file %s %s 2>&1 | FileCheck -check-prefix=DONTWARN %s
>   // RUN: FileCheck -vv -check-prefix=SHARED -check-prefix=BAR -input-file %s %s 2>&1 | FileCheck -check-prefix=DONTWARN %s
>   ; This is a very normal test, so we shouldn't complain about it:
>   ;
>   ; DONTWARN-NOT: warning: {{.*}} only occurs in a LABEL check, this is probably not what you want.
>   foo
>   bar
>   ; SHARED-LABEL: foo
>   ; FOO:          bar
>    
>    
> 
> But this test will fail.

Agreed.

> This is (another) example of a false positive, and that's
>  what we were talking about I think.

Why?  The second RUN command's first FileCheck call almost surely demonstrates a "broken test" because the BAR prefix is specified but never appears in the file, leaving only LABEL directives.  At best, if it was somehow intentional that BAR was specified but never used in a directive , this FileCheck call represents a "dodgy test".  You said above that "dodgy tests" and "broken tests" should have diagnostics and are not false positives.

Below is another potential use case I tried to describe many comments ago, but I never gave an example:

  // RUN: testprog 0 | FileCheck -check-prefix=SHARED %s
  // RUN: testprog 1 | FileCheck -check-prefix=SHARED -check-prefix=CODE1 %s
  // RUN: testprog 2 | FileCheck -check-prefix=SHARED -check-prefix=CODE2 %s

  ; SHARED-LABEL: fnA
  ; CODE1:  body1
  ; CODE2:  body2

  ; SHARED-LABEL: fnB
  ; CODE1:  body1
  ; CODE2:  body2

The program under test is testprog.  The second and third RUN commands call testprog with different arguments producing different code within each label.  These RUN commands represent "reasonable tests", right?

However, the first RUN command calls testprog with an argument that produces the labels, but it doesn't produce any test-worthy code within each label.  This RUN command represents a "dodgy test", right?  But, in the context of the full test program, it's perfectly reasonable.  That is, for the first RUN command only, we could have used "SHARED:" and not "SHARED-LABEL:", but we don't do that because we want to use "SHARED-LABEL:" for the other RUN commands.

So this is a hybrid of a "reasonable test" and a "dodgy test", and the result is something that seems reasonable.  To be clear, I have not yet seen an example of this hybrid in any real test (including the tests discussed in this thread), and its possibility is one of the reasons I requested that more test suites be run with this patch.

The more I think about all this, the more doubtful I am about the direction we're heading.  The "dodgy tests" are often not really wrong because they are testing what the author intended.  They just represent an odd use of CHECK-LABEL because CHECK would be sufficient... except in the hybrid scenario I described.  So the "dodgy tests" are potentially wrong only in certain cases... and possibly only if you're squinting your eyes really hard because you really want this warning not to have false positives.  :-)  I think that's what I've been doing until now.

The rename of LABEL to BOUND seems like a lot of work just to avoid these "dodgy tests" that might not be that dodgy after all.  Does it help to avoid some other case we haven't discussed yet?

I agree it's easy to make mistakes with FileCheck, and I too would like to find ways to avoid those.  I'm just not convinced this direction is going to prove fruitful, but please tell me if I'm not seeing this clearly.

https://reviews.llvm.org/D53710