[PATCH] D77227: [RFC][FileCheck] Require colon immediately after CHECK directives

Paul Robinson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 2 15:11:46 PDT 2020


probinson added a comment.

In D77227#1957603 <https://reviews.llvm.org/D77227#1957603>, @jdenny wrote:

> In D77227#1957332 <https://reviews.llvm.org/D77227#1957332>, @probinson wrote:
>
> > I believe where we're headed is that the (whitespace or ending with `[#/;*!]`) rule is how we'll identify directives that we expect the user *intended* to be directives for FileCheck to act on.  If we then find syntactic errors (multiple suffixes, missing colon) we'll report those as errors.
>
>
> I'm not sure if I'm reading this correctly.  Which of the following do you intend?
>
> A. This new rule constrains where FileCheck recognizes malformed directives for diagnostics.
>
> B. This new rule constrains where FileCheck recognizes both (1) malformed directives for diagnostics and (2) well formed directives for execution.
>
> I thought we were planning A.
>
> I think the only change from A to B is to (at least theoretically) create a new class of malformed directives that won't be executed, and they happen to also be ones we cannot diagnose under this new rule.  We would need the mode you describe below to diagnose these, but those diagnostics would apparently be plagued by false positives.


Ah.  My idea was to constrain executable directives, on the theory that the existing population of actual directives that DON'T follow the (whitespace or comment character) rule is minuscule, and that it would be user-hostile to execute a directive in a context where we can't diagnose the same typo errors that we DO diagnose elsewhere.

Regarding the population of directives that don't follow the rule, I did some grepping:
`find llvm/test -type f -exec grep '\bCHECK:' \{} \; | grep -v '[#/;*!_ at -]\s*CHECK:' | grep -v '^\s*CHECK:' `
(I added `_ at -` because FOO-CHECK: and FOO_CHECK: show up a fair amount; and it seems `@` is sometimes an assembler comment character.)

This produced 37 lines of output.  Two are from binary files and one from a Python file.  27 are from FileCheck's own tests, and it looks like only one of those tests (line-count.txt) would have to be modified, because it uses lines such as
`17 CHECK: [[@LINE-3]] {{a}}aa`

That leaves 7 more lines, 
`llvm/test/tools/llvm-objdump/COFF/large-bss.test` has a leading colon.
`llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll` has 4 lines that use CHECK as the name of a FileCheck variable; while this blows my mind, it wouldn't actually be treated as a real directive or caught by the diagnostic.
`llvm/test/Transforms/[New]GVN/pr25440.ll` each have one line with `;%CHECK:`.

Repeating the exercise in `clang/test`, there are 10 matches; 6 are intended to "comment out" a CHECK: directive, two are another use of CHECK as a FileCheck variable, and the last two are of the form `// line #4: CHECK: yadayada`.

So, the population of genuine directives that don't follow the (whitespace or comment character) rule is extremely small, but not zero.

Do we try to find and fix them all, so that we can use the more restrictive rule?
Or do we fail to diagnose problems with directives that don't follow the rule, because that would introduce too many false positives?

Wondering what y'all think about that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77227/new/

https://reviews.llvm.org/D77227





More information about the llvm-commits mailing list