[PATCH] D140794: [ASTMatcher] Add coroutineBodyStmt matcher

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 10 08:23:55 PST 2023


aaron.ballman added inline comments.


================
Comment at: clang/include/clang/ASTMatchers/ASTMatchers.h:5503
+                                                          FunctionDecl,
+                                                          CoroutineBodyStmt),
                           internal::Matcher<Stmt>, InnerMatcher) {
----------------
ccotter wrote:
> aaron.ballman wrote:
> > ccotter wrote:
> > > aaron.ballman wrote:
> > > > I'm not certain it makes sense to me to add `CoroutineBodyStmt` to `hasBody` -- in this case, it doesn't *have* a body, it *is* the body.
> > > With respect to `hasBody()`, my intent was to treat the CoroutineBodyStmt node as analogous to a FunctionDecl or WhileStmt. WhileStmts have information like the loop condition expression, as CoroutineBodyStmts contain the synthesized expressions such as the initial suspend Stmt. Both WhileStmts and CoroutineBodyStmts then have the `getBody()` methods, usually a CompoundStmt for WhileStmts and either a CompoundStmt or TryStmt for CoroutineBodyStmts.
> > > 
> > > Ultimately, I wanted to be able to match the CoroutineBodyStmt's `function-body` (using the grammar) from the standard, e.g., `coroutineBodyStmt(hasBody(compoundStmt().bind(...)))`. If there is a different approach you'd recommend that's in line with the AST matcher design strategy, I can use that instead.
> > What concerns me about the approach you have is that it doesn't model the AST. A coroutine body statement is a special kind of function body, so it does not itself *have* a body. So this is a bit like adding `CompoundStmt` to the `hasBody` matcher in that regard.
> > 
> > To me, the correct way to surface this would be to write the matcher: `functionDecl(hasBody(coroutineBodyStmt()))` to get to the coroutine body, and if you want to bind to the compound statement, I'd imagine `functionDecl(hasBody(coroutineBodyStmt(has(compoundStmt().bind(...)))))` would be the correct approach. My thinking there is that we'd use the `has()` traversal matcher to go from the coroutine body to any of the extra information it tracks (the compound statement, the promise, allocate/deallocate, etc).
> > 
> > Pinging @klimek and @sammccall for additional opinions.
> I think I see. With my proposal, the match would be `functionDecl(hasBody(coroutineBodyStmt(hasBody(stmt().bind(...)))))`. For completeness, your suggestion would need `functionDecl(hasBody(coroutineBodyStmt(has(stmt(anyOf(cxxTryStmt(), compoundStmt()).bind(...))))))`.
> 
> I am a bit hung up though on two things, and clarifications on both would help solidify my understanding of the design philosophy of the matchers. First, since `CoroutineBodyStmt` has a `getBody()` method, I assumed it would make it eligible for the `hasBody` matcher, although perhaps I am making an incorrect assumption/generalization here?
> 
> Second, the `has()` approach seems less accurate since we would be relying on the fact that the other children nodes of `CoroutineBodyStmt` (initial or final suspend point, etc) would never be a `CompoundStmt` or `CXXTryStmt`, else we might get unintentional matches. Or, one might miss the CXXTryStmt corner case.
> 
> Follow up question - should a need arise (say, authoring many clang-tidy checks that need extensive coroutine matcher support on the initial suspend point etc), would we see the matchers supporting things like `coroutineBodyStmt(hasInitialSuspendPoint(...), hasFinalSuspendPoint(..))`, or rely on a combination of `has` approach / non-ASTMatcher logic (or locally defined ASTMatchers within the clang-tidy library)?
> 
> It looks like this phab can be broken down into two changes - first, the addition of a `coroutineBodyStmt` matcher, and second, a `hasBody` traversal matcher for `coroutineBodyStmt`. Happy to separate these out depending on the direction of the discussion.
> I am a bit hung up though on two things, and clarifications on both would help solidify my understanding of the design philosophy of the matchers. First, since CoroutineBodyStmt has a getBody() method, I assumed it would make it eligible for the hasBody matcher, although perhaps I am making an incorrect assumption/generalization here?

As I understand the evolution here, we started out with `hasBody()` to match on function declaration AST nodes with a body. It then evolved to include do/while/for (and eventually range-based for) loops, which is a bit of an odd decision (why not anything with a substatement, like `if` or `switch`?). Now we're looking to add support for the coroutine body node, but that has a half dozen+ different things we can traverse to, one of which is the compound statement for the coroutine body. All of these cases are kind of different from one another, so the design is a bit confusing (at least to me).

> Second, the has() approach seems less accurate since we would be relying on the fact that the other children nodes of CoroutineBodyStmt (initial or final suspend point, etc) would never be a CompoundStmt or CXXTryStmt, else we might get unintentional matches. Or, one might miss the CXXTryStmt corner case.

That's a fantastic point that I hadn't considered. We probably will want named traversal methods to go from the coroutine body statement to the other kinds of AST nodes it supports. And based on that, `hasBody()` is a quite reasonable name because that's one of the other kinds of AST nodes.

> Follow up question - should a need arise (say, authoring many clang-tidy checks that need extensive coroutine matcher support on the initial suspend point etc), would we see the matchers supporting things like coroutineBodyStmt(hasInitialSuspendPoint(...), hasFinalSuspendPoint(..)), or rely on a combination of has approach / non-ASTMatcher logic (or locally defined ASTMatchers within the clang-tidy library)?

I'm leaning more towards introducing new traversal matchers. In terms of what goes into ASTMatchers.h, we usually only add matchers once there are multiple in-tree needs for the matcher, otherwise we recommend using a locally defined matcher. ASTMatchers.h is quite expensive to compile due to all the template shenanigans, and so we've resisted adding matchers unless they're "needed" to try to keep compile times reasonable.

Hmmm, I'm coming around to reusing `hasBody()` like you've done here. I think it's pretty weird to read: `functionDecl(hasBody(coroutineBodyStmt(hasBody(stmt().bind(...)))` because of the duplicated calls to `hasBody`, but I think it's more defensible now than I was originally thinking. I'm still curious if other folks who care about AST matchers have an opinion given that this is a bit trickier than usual.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140794/new/

https://reviews.llvm.org/D140794



More information about the cfe-commits mailing list