[cfe-dev] RFC: Easier AST Matching by Default

Stephen Kelly via cfe-dev cfe-dev at lists.llvm.org
Sun Jun 21 13:51:36 PDT 2020

On 21/06/2020 19:59, Richard Smith wrote:
> I think if we want to expose a syntactic mode, we should do that with 
> a set of syntactic matchers (eg, a matcher that matches parenthesized 
> initialization). Suppose someone wants to match all direct 
> initialization. Right now, they need to do lots of checks: for a 
> non-list, non-implicit cxxConstructExpr, for a varDecl whose 
> initialization kind is direct init, for a cxxTemporaryObjectExpr, for 
> a functionalCastExpr, and there'll likely be other kinds that I forgot 
> and more added in the future.
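
For readers following along, here is a rough sketch of the source forms behind the list quoted above. The names `Widget`, `sum` and `demo` are hypothetical and only for illustration; the comments give the AST node each form is generally expected to produce, which may vary by Clang version.

```cpp
// Hypothetical subject code: several spellings of direct-initialization.
struct Widget {
    int a;
    int b;
    Widget(int a_, int b_ = 0) : a(a_), b(b_) {}
};

int sum(const Widget &w) { return w.a + w.b; }

int demo() {
    // varDecl whose initialization style is direct init (parenthesized).
    Widget v(1, 2);

    // Temporary built from several arguments: cxxTemporaryObjectExpr.
    int t = sum(Widget(3, 4));

    // Single-argument functional notation: cxxFunctionalCastExpr
    // wrapping a constructor call.
    int f = sum(Widget(5));

    // A new-expression with a parenthesized initializer is also
    // direct-initialization; it is not in the quoted list and is included
    // only as a possible instance of the "other kinds".
    Widget *p = new Widget(6, 7);
    int n = sum(*p);
    delete p;

    return sum(v) + t + f + n;
}
```

Each of these is "direct initialization" at the language level, but a matcher author currently has to cover them one node kind at a time.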

I agree there are missing matchers. I'd like to add them, but I don't 
think that's enough to make the matchers framework easier to use for 

> I don't think it's intuitive to call IgnoreUnlessSpelledInSource mode 
> Syntactic, because it isn't really that -- it doesn't let you match 
> syntax, it lets you match semantics-with-associated-tokens (and even 
> that is only an approximate description).


> I think it would be really interesting to add a way to actually match 
> syntax in a semantics-free way, but it seems like a big project.
> Also the mode you're calling Semantic is also not a semantic match, 
> because it also matches syntax-only nodes. (It doesn't implicitly 
> IgnoreParens or anything like that.)
> That said, I think neither Semantic nor Syntactic is the right 
> default. By default we should be assuming that matchers want to match 
> a combination of syntax and semantics -- usually a matcher will want 
> to key off some semantic effect obtained in a particular syntactic 
> context.

I'm not certain I agree about that, or that it is what a newcomer wants, 
but ok.

>>         I think the best refinement for now would be to restore the
>>         CXXConstructExpr in the case of a varDecl initializer, if
>>         that is possible (may not be).
>>     I think that is addressing a symptom rather than the cause. I
>>     think the root cause is that a matcher that is explicitly asking
>>     to match a certain implicit (or sometimes-implicit) AST node does
>>     not match.
>     That is the purpose of AsIs/Semantic mode if I understood you
>     correctly.
>     The intention of the IgnoreUnlessSpelledInSource/Syntactic mode is
>     to *not* match certain implicit (or sometimes-implicit) AST nodes.
>>     For example, I would expect that implicitCastExpr() *never*
>>     matches under the new default behavior.
>     That is correct.
>>     And I think that users, and especially beginner users, will see
>>     that as being simply broken -- if someone tries to write an
>>     implicitCastExpr() matcher, it's obvious that they want to match
>>     implicit casts, and we are not doing the user any favors by
>>     making that matcher not match by default.
>     Hmm, if someone is the kind of newcomer that they've never
>     encountered a CallExpr, FunctionDecl or any other AST node in
>     their life before, I don't see why implicitCastExpr() would be the
>     first, or one of the first, things they try.
> Perhaps because they want to match an implicit cast, and they find it 
> in the documentation. Perhaps because they read one of the guides that 
> says "look at the AST dump and write matchers to match what you see 
> there".

I'm not sure which guide says that, but perhaps it should be updated to 
point people at clang-query. At least the guide I wrote does that: 

>     There may be different/multiple levels of newcomer that we each
>     have in mind here.
>     If someone is a kind of newcomer who wants to match on an
>     implicitCastExpr(), then they're probably also the kind of
>     newcomer who knows that's an implicit node and they should use
>     Semantic mode instead of Syntactic mode. Do you agree?
> No. I think it's bad API design to have any mode in which 
> implicitCastExpr compiles but doesn't ever match.

Hmm, yes. Perhaps if the IgnoreUnlessSpelledInSource mode survives it 
should reject a matcher like that.
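
To make the case concrete, here is a minimal sketch with hypothetical subject code (`half` and `demo_implicit` are made-up names). The matcher expressions appear only in comments and assume that traverse() takes the traversal kind as its first argument, spelled TK_AsIs or TK_IgnoreUnlessSpelledInSource. The described results follow what was said earlier in this thread and have not been re-checked against a specific Clang version.

```cpp
// Hypothetical subject code containing an implicit conversion.
double half(double x) { return x / 2.0; }

double demo_implicit() {
    int i = 3;
    // Nothing is spelled in the source at the call site, but passing an int
    // lvalue to a double parameter makes Clang insert ImplicitCastExpr
    // nodes (lvalue-to-rvalue, then integral-to-floating) around `i`.
    double d = half(i);
    return d;
}

// Matcher sketches (not compiled here):
//
//   traverse(TK_AsIs, implicitCastExpr())
//     expected to match the implicit nodes around `i` above.
//
//   traverse(TK_IgnoreUnlessSpelledInSource, implicitCastExpr())
//     compiles, but per the discussion above never matches, because the
//     mode skips nodes that are not spelled in the source.
```

The second form is the one that could be rejected outright instead of silently matching nothing.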


You've suggested a different behavior for matchers which I don't think 
anyone is working on (the design of or the implementation of).

I continue to think the current behavior is sufficiently motivated by 
the examples in the RFC.

But, there's still tension about it.

So, where to from here?

Does the default have to be changed back to AsIs? Does 
IgnoreUnlessSpelledInSource have to be removed? Does the traverse() 
matcher have to be removed?


