[PATCH] D60392: FileCheck [12/12]: Support use of var defined on same line
Joel E. Denny via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 8 19:30:34 PST 2021
jdenny added a comment.
I understand the desire to be able to define and reference a variable within the same directive. However, as I understand it, this patch adds that ability in a way that introduces a new subtlety to FileCheck: the semantics of numeric variable references are then not always consistent. If you assume the wrong semantics, you can end up with a false pass of a test. I understand you've disabled the case of `CHECK-NOT`, but it does not appear to be the only case that can lead to false passes.
Example
-------
Let's say we want to check that the very last value of y on an input line is equal to the first value of y plus 3. The following input line is then erroneous, because the last value of y (`32`) is not the first value (`30`) plus 3:
x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,
Check version 1
---------------
CHECK: y=[[#VAR:]],{{.*}}y=[[#VAR+3]],
The behavior is:
- `y=[[#VAR:]],` matches `y=30,`.
- `{{.*}}` is greedy (maximal munch).
- So `y=[[#VAR+3]],` matches the last `y=` expression, which is `y=32,`.
The directive then fails because `32` is not `30+3`. OK, that's the exact behavior I was going for. And now I think I understand how numeric variable references work.
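To make that behavior concrete, here is a rough Python model of check version 1. It is only a sketch under the assumption that the patch behaves as I described above: both numeric blocks become plain captures, and the `+3` constraint is verified after the regex has settled on a match. The names `LINE` and `check_v1` are made up for illustration and have nothing to do with FileCheck's implementation.

```python
import re

LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v1(line):
    # Both numeric blocks are plain captures; the regex knows nothing about "+3".
    m = re.search(r"y=([0-9]+),.*y=([0-9]+),", line)
    if m is None:
        return None, False
    var, last = int(m.group(1)), int(m.group(2))
    # The VAR+3 constraint is verified only after the regex has picked its match.
    return (var, last), last == var + 3

captured, ok = check_v1(LINE)
print(captured, ok)  # (30, 32) False
```

The greedy `.*` drives the second capture to the last `y=` on the line, so the directive is rejected, which is the intended result.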
(If the above is not the behavior you expect after applying this patch, let me know. Every example I tried after applying this patch produced seg faults or assertion failures, so I couldn't verify.)
Check version 2
---------------
Let's say I later realize I'd rather use `VAR` as already defined to be `30` by some earlier directive, so I rewrite check version 1 as follows:
CHECK: y=[[#VAR]],{{.*}}y=[[#VAR+3]],
The behavior is:
- `y=[[#VAR]],` is visibly different from `y=[[#VAR:]],`, and its semantics are different too: it now matches a previously defined value instead of capturing a new value. Nevertheless, it still matches `y=30,`. So far, so good.
- `{{.*}}` is still maximal munch.
- Given that previous parts of the pattern match as before and `VAR` ends up with the same value as before, shouldn't the semantics of `y=[[#VAR+3]],` be the same as before? No. It can only match `y=33,` now, and so that's what it does.
The directive then accepts the erroneous input. That's pretty subtle.
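A rough Python model of check version 2 shows the false pass. It assumes that a reference to an already defined variable is substituted into the pattern as literal text before matching; `LINE` and `check_v2` are hypothetical names used only for this sketch.

```python
import re

LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v2(line, var):
    # VAR is already known, so both numeric blocks are substituted as
    # literal text before the regex runs.
    pattern = rf"y={var},.*y={var + 3},"
    return re.search(pattern, line)

m = check_v2(LINE, 30)
print(m.group(0) if m else None)
# y=30, z=40, x=21, y=31, z=41, x=22, y=33,
```

Because `33` is now baked into the regex, the greedy `.*` backtracks until it finds `y=33,` in the middle of the line, and the last `y=` is never examined.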
Check version 3
---------------
Or, let's say I decide that splitting check version 1 into two directives would look better:
CHECK: y=[[#VAR:]],
CHECK-SAME: {{.*}}y=[[#VAR+3]],
That's equivalent to version 1, right? Normally, I think it would be. Not in this case. Again, `y=[[#VAR+3]],` now has a different behavior: it only matches `y=33,`, and the erroneous input is now accepted.
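The split form can be modeled the same way. This sketch assumes the `CHECK` directive captures `VAR` first and the `CHECK-SAME` directive then searches the remainder of the same line with `VAR+3` substituted as literal text; `LINE` and `check_v3` are hypothetical names.

```python
import re

LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"

def check_v3(line):
    # CHECK: y=[[#VAR:]],  captures VAR from the first match.
    first = re.search(r"y=([0-9]+),", line)
    if first is None:
        return False
    var = int(first.group(1))
    # CHECK-SAME: {{.*}}y=[[#VAR+3]],  searches the rest of the same line
    # with VAR+3 already substituted as literal text.
    rest = line[first.end():]
    return re.search(rf".*y={var + 3},", rest) is not None

print(check_v3(LINE))  # True
```

The erroneous line passes because the second directive only needs some `y=33,` after the first match, not the last `y=` on the line.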
Conclusion
----------
I'm leery of letting the semantics of numeric variable references vary depending on where the variable happens to have been defined. It's too subtle.
I think it would be better to introduce the desired functionality with a different syntax to reflect the semantic shift. For example:
CHECK: y=[[#VAR1:]],{{.*}}y=[[#VAR2:]],[[!#VAR1 == #VAR2]]
That introduces the concept of a post-match variable assertion. It would always appear at the end of a pattern to indicate it's evaluated last. It would always be delimited by `[[!` and `]]`, which hopefully doesn't collide with any existing syntax.
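A rough Python sketch of how such a post-match assertion might be evaluated: every numeric block is a pure capture, and the assertion runs once over the captured values after the regex has matched. The helper `check_post_assert`, the `rule` callback, and the `VAR1 + 3 == VAR2` condition (adapted to the running example) are assumptions for illustration, not part of any proposed implementation.

```python
import re

LINE = "x=20, y=30, z=40, x=21, y=31, z=41, x=22, y=33, z=42, x=23, y=32, z=43,"
FIXED = LINE.replace("y=32,", "y=33,")  # last y is now first y plus 3

def check_post_assert(line, assertion):
    # Every numeric block is a capture, so matching never depends on values.
    m = re.search(r"y=(?P<VAR1>[0-9]+),.*y=(?P<VAR2>[0-9]+),", line)
    if m is None:
        return False
    values = {name: int(text) for name, text in m.groupdict().items()}
    # The assertion is evaluated last, after the whole pattern has matched.
    return assertion(**values)

rule = lambda VAR1, VAR2: VAR1 + 3 == VAR2
print(check_post_assert(LINE, rule))   # False
print(check_post_assert(FIXED, rule))  # True
```

Since the match itself never depends on variable values, it does not change with where or whether a variable was defined earlier.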
There's probably a better syntax I haven't thought of. My main point is to demonstrate how a different syntax could eliminate the inconsistent behavior I am concerned about.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D60392/new/
https://reviews.llvm.org/D60392