[cfe-dev] RFC: Easier AST Matching by Default
David Rector via cfe-dev
cfe-dev at lists.llvm.org
Mon Jun 22 08:11:39 PDT 2020
> I would like to hear the opinions of others on these questions. I think
> we've both described our perspectives and made our cases.
> Just expanding on my position from earlier in this thread slightly: to
> directly address the "where to from here?" question, I would like us to get
> back to the state where, *in our default mode, the matcher for AST node X
> is always able to match all instances of AST node X*. I think there are
> various options for how we get there that seem acceptable, and that don't
> sacrifice your noble goals of making AST matching easier and more
> intuitive. In particular, I think it's fine (and probably good) if by
> default we look through non-matching implicit nodes while looking for a
> matching node (so long as we don't look past a matching node just because
> it's implicit). And I think it's fine (and probably good) to expose an easy
> way to explicitly control whether we automatically look through implicit
> nodes or not. But I think if no-one is prepared to do the work to make our
> default mode satisfy the above property while still looking through
> implicit nodes if (and only if) they don't match, then I think we should
> change the default back to the state we had before. (I don't have much of
> an opinion on whether to keep or remove 'traverse' and its various modes,
> other than that we've already caused a substantial amount of churn with the
> changes thus far, and removing them again would cause further churn.)
1. It would be helpful to rehash the examples one final time — give the code & the dump, a proposed matcher, then what option A would match, B, etc. Others should try to consider at each as would a beginner.
2. Perhaps this an opportunity to create a uniform interface to deal with implicit nodes in the AST. I believe any implicit node only ever has at most one child node. If so, consider this:
(a) Always have "Implicit*" versions of any implicit-able nodes (e.g. ImplicitCXXConstructExpr, in addition to CXXConstructExpr);
(c) Implement getAs<T>() for Decls and Stmts, the same we do for Types: e.g. E->getAs<CXXConstructExpr>() would skip over the ExprWCleanups, ImplicitCXXConstructExpr, etc. But you could still call E->getAs<ImplicitCXXConstructExpr>() too if desired.
Then, in the case of ASTMatchers, the matchers would always match the implicit nodes as before, but Results.Nodes.getNodeAs<…> would call getAs<…> on each actual result. I believe this aligns with the behavior you suggested Richard. But I have not fully thought this through, so I defer to others.
My perspective on ASTMatchers, FWIW: I never got into them, because I had begun learning the AST first and it seemed redundant since you could do the same stuff with RecursiveASTVisitor, and there seemed to be a lot of extra infrastructure you had to deal with. I now see that the matchers can be considerably tighter and clearer -- but that is the only advantage over doing it the full way with RecursiveASTVisitor, so whatever the solution is, it must keep or increase that clarity advantage while retaining as much of the flexibility of RecursiveASTVisitor as possible. E.g. it sounded like Billy, the guy who had issues with Stmt traversal order (owing to the queueing of child statements, instead of stack processing them as most would expect), decided to just use RecursiveASTVisitor instead of deal with the hassle — that is the outcome to avoid.
More information about the cfe-dev