[all-commits] [llvm/llvm-project] e42cc3: [clang][test] add testing for the AST matcher refe...
Julian Schmidt via All-commits
all-commits at lists.llvm.org
Fri Sep 27 09:47:58 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e42cc3f42e6a36f4dfff408278194040367e8259
https://github.com/llvm/llvm-project/commit/e42cc3f42e6a36f4dfff408278194040367e8259
Author: Julian Schmidt <git.julian.schmidt at gmail.com>
Date: 2024-09-27 (Fri, 27 Sep 2024)
Changed paths:
M clang/docs/LibASTMatchersReference.html
M clang/docs/ReleaseNotes.rst
M clang/docs/doxygen.cfg.in
M clang/docs/tools/dump_ast_matchers.py
M clang/include/clang/ASTMatchers/ASTMatchers.h
M clang/unittests/ASTMatchers/ASTMatchersTest.h
M clang/unittests/ASTMatchers/CMakeLists.txt
A clang/utils/generate_ast_matcher_doc_tests.py
Log Message:
-----------
[clang][test] add testing for the AST matcher reference (#110258)
## Problem Statement
Previously, the examples in the AST matcher reference, which gets
generated by the Doxygen comments in `ASTMatchers.h`, were untested and
best effort.
Some of the matchers had no or wrong examples of how to use the matcher.
## Solution
This patch introduces a simple DSL around Doxygen commands to enable
testing the AST matcher documentation in a way that should be relatively
easy to use.
In `ASTMatchers.h`, most matchers are documented with a Doxygen comment.
Most of these also have a code example that aims to show what the
matcher will match, given a matcher somewhere in the documentation text.
The way that the documentation is tested, is by using Doxygen's alias
feature to declare custom aliases. These aliases forward to
`<tt>text</tt>` (which is what Doxygen's `\c` does, but for multiple
words). Using the Doxygen aliases is the obvious choice, because there
are (now) four consumers:
- people reading the header/using signature help
- the Doxygen generated documentation
- the generated HTML AST matcher reference
- (new) the generated matcher tests
This patch rewrites/extends the documentation such that all matchers
have a documented example.
The new `generate_ast_matcher_doc_tests.py` script will warn on any
undocumented matchers (but not on matchers without a Doxygen comment)
and provides diagnostics and statistics about the matchers.
The current statistics emitted by the parser are:
```text
Statistics:
doxygen_blocks : 519
missing_tests : 10
skipped_objc : 42
code_snippets : 503
matches : 820
matchers : 580
tested_matchers : 574
none_type_matchers : 6
```
The tests are generated during building, and the script will only print
something if it found an issue with the specified tests (e.g., missing
tests).
## Description
DSL for generating the tests from documentation.
TLDR:
```
\header{a.h}
\endheader <- zero or more header
\code
int a = 42;
\endcode
\compile_args{-std=c++,c23-or-later} <- optional, the std flag supports std ranges and
whole languages
\matcher{expr()} <- one or more matchers in succession
\match{42} <- one or more matches in succession
\matcher{varDecl()} <- new matcher resets the context, the above
\match will not count for this new
matcher(-group)
\match{int a = 42} <- only applies to the previous matcher (not to the
previous case)
```
The above block can be repeated inside a Doxygen command for multiple
code examples for a single matcher.
The test generation script will only look for these annotations and
ignore anything else like `\c` or the sentences where these annotations
are embedded into: `The matcher \matcher{expr()} matches the number
\match{42}.`.
### Language Grammar
[] denotes an optional, and <> denotes user-input
```
compile_args j:= \compile_args{[<compile_arg>;]<compile_arg>}
matcher_tag_key ::= type
match_tag_key ::= type || std || count || sub
matcher_tags ::= [matcher_tag_key=<value>;]matcher_tag_key=<value>
match_tags ::= [match_tag_key=<value>;]match_tag_key=<value>
matcher ::= \matcher{[matcher_tags$]<matcher>}
matchers ::= [matcher] matcher
match ::= \match{[match_tags$]<match>}
matches ::= [match] match
case ::= matchers matches
cases ::= [case] case
header-block ::= \header{<name>} <code> \endheader
code-block ::= \code <code> \endcode
testcase ::= code-block [compile_args] cases
```
### Language Standard Versions
The 'std' tag and '\compile_args' support specifying a specific language
version, a whole language and all of its versions, and thresholds
(implies ranges). Multiple arguments are passed with a ',' separator.
For a language and version to execute a tested matcher, it has to match
the specified '\compile_args' for the code, and the 'std' tag for the
matcher. Predicates for the 'std' compiler flag are used with
disjunction between languages (e.g. 'c || c++') and conjunction for all
predicates specific to each language (e.g. 'c++11-or-later &&
c++23-or-earlier').
Examples:
- `c` all available versions of C
- `c++11` only C++11
- `c++11-or-later` C++11 or later
- `c++11-or-earlier` C++11 or earlier
- `c++11-or-later,c++23-or-earlier,c` all of C and C++ between 11 and
23 (inclusive)
- `c++11-23,c` same as above
### Tags
#### `type`:
**Match types** are used to select where the string that is used to
check if a node matches comes from.
Available: `code`, `name`, `typestr`, `typeofstr`. The default is
`code`.
- `code`: Forwards to `tooling::fixit::getText(...)` and should be the
preferred way to show what matches.
- `name`: Casts the match to a `NamedDecl` and returns the result of
`getNameAsString`. Useful when the matched AST node is not easy to spell
out (`code` type), e.g., namespaces or classes with many members.
- `typestr`: Returns the result of `QualType::getAsString` for the type
derived from `Type` (otherwise, if it is derived from `Decl`, recurses
with `Node->getTypeForDecl()`)
**Matcher types** are used to mark matchers as sub-matcher with 'sub' or
as deactivated using 'none'. Testing sub-matcher is not implemented.
#### `count`:
Specifying a 'count=n' on a match will result in a test that requires
that the specified match will be matched n times. Default is 1.
#### `std`:
A match allows specifying if it matches only in specific language
versions. This may be needed when the AST differs between language
versions.
#### `sub`:
The `sub` tag on a `\match` will indicate that the match is for a node
of a bound sub-matcher.
E.g., `\matcher{expr(expr().bind("inner"))}` has a sub-matcher that
binds to `inner`, which is the value for the `sub` tag of the expected
match for the sub-matcher `\match{sub=inner$...}`. Currently,
sub-matchers are not tested in any way.
### What if ...?
#### ... I want to add a matcher?
Add a Doxygen comment to the matcher with a code example, corresponding
matchers and matches, that shows what the matcher is supposed to do.
Specify the compile arguments/supported languages if required, and run
`ninja check-clang-unit` to test the documentation.
#### ... the example I wrote is wrong?
The test-failure output of the generated test file will provide
information about
- where the generated test file is located
- which line in `ASTMatcher.h` the example is from
- which matches were: found, not-(yet)-found, expected
- in case of an unexpected match: what the node looks like using the
different `type`s
- the language version and if the test ran with a windows `-target` flag
(also in failure summary)
#### ... I don't adhere to the required order of the syntax?
The script will diagnose any found issues, such as `matcher is missing
an example` with a `file:line:` prefix,
which should provide enough information about the issue.
#### ... the script diagnoses a false-positive issue with a Doxygen
comment?
It hopefully shouldn't, but if you, e.g., added some non-matcher code
and documented it with Doxygen, then the script will consider that as a
matcher documentation. As a result, the script will print that it
detected a mismatch between the actual and the expected number of
failures. If the diagnostic truly is a false-positive, change the
`expected_failure_statistics` at the top of the
`generate_ast_matcher_doc_tests.py` file.
Fixes #57607
Fixes #63748
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list