[PATCH] D60392: FileCheck [12/12]: Support use of var defined on same line

Joel E. Denny via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 8 19:30:34 PST 2021


jdenny added a comment.

I understand the desire to be able to define and reference a variable within the same directive.  However, as I understand it, this patch adds that ability in a way that introduces a new subtlety to FileCheck: the semantics of numeric variable references are then not always consistent.  If you assume the wrong semantics, you can end up with a false pass of a test.  I understand you've disabled the case of `CHECK-NOT`, but it does not appear to be the only case that can lead to false passes.

Example
-------

Let's say we want to check that the very last value of y on an input line is equal to the first value of y plus 3.  The following input line is therefore erroneous, because the first y is 30 but the last y is 32 rather than 33:

  x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,



Check version 1
---------------

  CHECK: y=[[#VAR:]],{{.*}}y=[[#VAR+3]],

The behavior is:

- `y=[[#VAR:]],` matches `y=30,`.
- `{{.*}}` is maximal munch.
- So `y=[[#VAR+3]],` matches the last `y=` expression, which is `y=32,`.

The directive then fails because `32` is not `30+3`.  OK, that's the exact behavior I was going for.  And now I think I understand how numeric variable references work.
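
As a rough illustration of the semantics described above, here is a minimal Python sketch.  It uses Python's `re` module as a stand-in for FileCheck's matcher, which is an assumption made for illustration and not how FileCheck or this patch is implemented.  The name `check_v1` is hypothetical.  Both numeric slots are matched as plain digits, and the `VAR+3` relation is only verified once the whole pattern has matched.

```python
import re

# The erroneous input line from the example.
LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v1(line):
    """Hypothetical model of: CHECK: y=[[#VAR:]],{{.*}}y=[[#VAR+3]],"""
    # Match both numeric slots as digits; the greedy .* pushes the second
    # slot to the last "y=<digits>," on the line.
    m = re.search(r"y=(\d+),.*y=(\d+),", line)
    if m is None:
        return None
    var, last = int(m.group(1)), int(m.group(2))
    # Verify the expression only after the whole pattern has matched.
    return var, last, last == var + 3

print(check_v1(LINE))  # (30, 32, False): the directive fails, as intended
```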

(If the above is not the behavior you expect after applying this patch, let me know.  Every example I tried after applying this patch produced seg faults or assertion failures, so I couldn't verify.)

Check version 2
---------------

Let's say I later realize I'd rather use `VAR` as defined to `30` by some earlier directive, so I rewrite check version 1 as follows:

  CHECK: y=[[#VAR]],{{.*}}y=[[#VAR+3]],

The behavior is:

- `y=[[#VAR]],` is visibly different from `y=[[#VAR:]],`, and its semantics are different too: it now matches a previously defined value instead of capturing a new value.  Nevertheless, it still matches `y=30,`.  So far, so good.
- `{{.*}}` is still maximal munch.
- Given that previous parts of the pattern match as before and `VAR` ends up with the same value as before, shouldn't the semantics of `y=[[#VAR+3]],` be the same as before?  No.  Because `VAR` is already known, `[[#VAR+3]]` is evaluated to `33` before matching, so it can only match `y=33,` now, and that's what it does.

The directive then accepts the erroneous input.  That's pretty subtle.
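
The false pass can be reproduced with a small Python sketch.  Again, `re` is only a stand-in for FileCheck's matcher, and `check_v2` is a hypothetical name.  The assumption modeled here is that a variable defined by an earlier directive is substituted into the pattern before matching.

```python
import re

# The erroneous input line from the example.
LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v2(line, var):
    """Hypothetical model of: CHECK: y=[[#VAR]],{{.*}}y=[[#VAR+3]],"""
    # VAR is already known, so both references become literal numbers
    # before the search starts.
    pattern = rf"y={var},.*y={var + 3},"
    return re.search(pattern, line) is not None

print(check_v2(LINE, 30))  # True: "y=33," exists somewhere after "y=30,"
```

The greedy `.*` no longer forces the second reference onto the last `y=`, because backtracking is free to settle on any `y=33,` in the line.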

Check version 3
---------------

Or, let's say I decide that splitting check version 1 into two directives would look better:

       CHECK: y=[[#VAR:]],
  CHECK-SAME: {{.*}}y=[[#VAR+3]],

That's equivalent to version 1, right?  Normally, I think it would be.  Not in this case.  Again, `y=[[#VAR+3]],` now has a different behavior: because `VAR` was defined by a previous directive, it only matches `y=33,`, and the erroneous input is now accepted.
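
A Python sketch of the split form shows the same effect.  As before, `re` stands in for FileCheck's matcher and `check_v3` is a hypothetical name.  The second directive is modeled as a search over the rest of the line, with `VAR` substituted beforehand.

```python
import re

# The erroneous input line from the example.
LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v3(line):
    """Hypothetical model of CHECK: y=[[#VAR:]], then CHECK-SAME: {{.*}}y=[[#VAR+3]],"""
    # First directive: capture VAR from the first "y=<digits>,".
    first = re.search(r"y=(\d+),", line)
    if first is None:
        return False
    var = int(first.group(1))
    # Second directive: VAR is now defined, so VAR+3 is substituted as a
    # literal and searched for in the remainder of the same line.
    rest = line[first.end():]
    return re.search(rf".*y={var + 3},", rest) is not None

print(check_v3(LINE))  # True: the erroneous input is accepted
```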

Conclusion
----------

I'm leery of letting the semantics of numeric variable references vary depending on where the variable happens to have been defined.  It's too subtle.

I think it would be better to introduce the desired functionality with a different syntax to reflect the semantic shift.  For example:

  CHECK: y=[[#VAR1:]],{{.*}}y=[[#VAR2:]],[[!#VAR1+3 == #VAR2]]

That introduces the concept of a post-match variable assertion.  It would always appear at the end of a pattern to indicate it's evaluated last.  It would always be delimited by `[[!` and `]]`, which hopefully doesn't collide with any existing syntax.
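
To make the idea concrete, here is a hedged Python sketch of a post-match assertion.  The `[[!...]]` syntax is only a proposal, so nothing here corresponds to an existing FileCheck feature.  The helper `match_then_assert` is hypothetical, `re` stands in for FileCheck's matcher, and the assertion used is the running example's relation (first y plus 3 equals last y).

```python
import re

# The erroneous input line from the example.
LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def match_then_assert(line, pattern, assertion):
    """Match the pattern first, then evaluate the assertion on the captures."""
    m = re.search(pattern, line)
    if m is None:
        return False
    # Every named group is treated as a numeric variable definition.
    values = {name: int(text) for name, text in m.groupdict().items()}
    # The assertion never influences where the pattern matches.
    return assertion(values)

# Both variables are plain definitions; the relation is checked afterwards.
PATTERN = r"y=(?P<VAR1>\d+),.*y=(?P<VAR2>\d+),"
result = match_then_assert(LINE, PATTERN, lambda v: v["VAR1"] + 3 == v["VAR2"])
print(result)  # False: VAR1 is 30 and VAR2 is 32
```

Because the pattern contains only definitions, its match position does not depend on where or whether the variables were defined earlier, which is the consistency the separate syntax is meant to provide.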

There's probably a better syntax I haven't thought of.  My main point is to demonstrate how a different syntax could eliminate the inconsistent behavior I am concerned about.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60392/new/

https://reviews.llvm.org/D60392


