[cfe-dev] RFC: Easier AST Matching by Default

Manuel Klimek via cfe-dev cfe-dev at lists.llvm.org
Mon Jun 22 02:20:04 PDT 2020


Pulling in more folks who had opinions in the original decision.
(y'all let me know if I misrepresented somebody's stance here)

Matt was the strongest one in favor of the new default mode, I believe, and
he argued that this mode made several of his matchers more precise that
were previously basically buggy, so I'd especially be interested whether
your view changed based on this thread, Matt.

Sam I believe raised a similar idea as has been brought up in this thread,
that if we want a syntactic matching mode, we should wait for SyntaxTrees
to land before making matchers that work on SyntaxTrees.

Yitzhak was somewhere in between iirc.

Dmitri iiuc mainly would have liked a more thorough analysis of the
possible actions before we switch.

I personally was convinced about the new default mode by Matt's apparent
excitement, as well as the data from public clang-tidy matchers. I do think
that I made a mistake by not pulling in the right set of folks earlier,
though, for which I'm sorry :(

On Mon, Jun 22, 2020 at 8:52 AM Richard Smith via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On Sun, 21 Jun 2020 at 13:51, Stephen Kelly via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> On 21/06/2020 19:59, Richard Smith wrote:
>>
>> I think if we want to expose a syntactic mode, we should do that with a
>> set of syntactic matchers (eg, a matcher that matches parenthesized
>> initialization). Suppose someone wants to match all direct initialization.
>> Right now, they need to do lots of checks: for a non-list, non-implicit
>> cxxConstructExpr, for a varDecl whose initialization kind is direct init,
>> for a cxxTemporaryObjectExpr, for a functionalCastExpr, and there'll likely
>> be other kinds that I forgot and more added in the future.
>>
>>
>> I agree there are missing matchers. I'd like to add them, but I don't
>> think that's enough to make the matchers framework easier to use for
>> newcomers.
>>
>>
>>
>> I don't think it's intuitive to call IgnoreUnlessSpelledInSource mode
>> Syntactic, because it isn't really that -- it doesn't let you match syntax,
>> it lets you match semantics-with-associated-tokens (and even that is only
>> an approximate description).
>>
>>
>> Ok.
>>
>>
>> I think it would be really interesting to add a way to actually match
>> syntax in a semantics-free way, but it seems like a big project.
>>
>> Also the mode you're calling Semantic is also not a semantic match,
>> because it also matches syntax-only nodes. (It doesn't implicitly
>> IgnoreParens or anything like that.)
>>
>> That said, I think neither Semantic nor Syntactic is the right default.
>> By default we should be assuming that matchers want to match a combination
>> of syntax and semantics -- usually a matcher will want to key off some
>> semantic effect obtained in a particular syntactic context.
>>
>>
>> I'm not certain I agree about that, or that it is what a newcomer wants,
>> but ok.
>>
>>
>> I think the best refinement for now would be to restore the
>>>> CXXConstructExpr in the case of a varDecl initializer, if that is possible
>>>> (may not be).
>>>>
>>> I think that is addressing a symptom rather than the cause. I think the
>>> root cause is that a matcher that is explicitly asking to match a certain
>>> implicit (or sometimes-implicit) AST node does not match.
>>>
>>>
>>> That is the purpose of AsIs/Semantic mode if I understood you correctly.
>>>
>>> The intention of the IgnoreUnlessSpelledInSource/Syntactic node is to
>>> *not* match certain implicit (or sometimes-implicit) AST nodes.
>>>
>>>
>>> For example, I would expect that implicitCastExpr() *never* matches
>>> under the new default behavior.
>>>
>>>
>>> That is correct.
>>>
>>>
>>> And I think that users, and especially beginner users, will see that as
>>> being simply broken -- if someone tries to write an implicitCastExpr()
>>> matcher, it's obvious that they want to match implicit casts, and we are
>>> not doing the user any favors by making that matcher not match by default.
>>>
>>>
>>> Hmm, if someone is the kind of newcomer that they've never encountered a
>>> CallExpr, FunctionDecl or any other AST node in their life before, I don't
>>> see why implicitCastExpr() would be the first, or one of the first, things
>>> they try.
>>>
>> Perhaps because they want to match an implicit cast, and they find it in
>> the documentation. Perhaps because they read one of the guides that says
>> "look at the AST dump and write matchers to match what you see there".
>>
>>
>> I'm not sure which guide says that, but perhaps it should be updated to
>> point people at clang-query. At least the guide I wrote does that:
>> https://devblogs.microsoft.com/cppblog/exploring-clang-tooling-part-2-examining-the-clang-ast-with-clang-query/
>>
>>
>> There may be different/multiple levels of newcomer that we each have in
>>> mind here.
>>>
>>> If someone is a kind of newcomer who wants to match on an
>>> implicitCastExpr(), then they're probably also the kind of newcomer who
>>> knows that's an implicit node and they should use Semantic node instead of
>>> Syntactic mode. Do you agree?
>>>
>> No. I think it's bad API design to have any mode in which
>> implicitCastExpr compiles but doesn't ever match.
>>
>>
>> Hmm, yes. Perhaps if the IgnoreUnlessSpelledInSource mode survives it
>> should reject a matcher like that.
>>
>> ----
>> You've suggested a different behavior for matchers which I don't think
>> anyone is working on (the design of or the implementation of).
>>
>> I continue to think the current behavior is sufficiently motivated by the
>> examples in the RFC.
>>
>> But, there's still tension about it.
>>
>> So, where to from here?
>>
>> Does the default have to be changed back to AsIs? Does
>> IgnoreUnlessSpelledInSource have to be removed? Does the traverse() matcher
>> have to be removed?
>>
> I would like to hear the opinions of others on these questions. I think
> we've both described our perspectives and made our cases.
>
>
> Just expanding on my position from earlier in this thread slightly: to
> directly address the "where to from here?" question, I would like us to get
> back to the state where, *in our default mode, the matcher for AST node X
> is always able to match all instances of AST node X*. I think there are
> various options for how we get there that seem acceptable, and that don't
> sacrifice your noble goals of making AST matching easier and more
> intuitive. In particular, I think it's fine (and probably good) if by
> default we look through non-matching implicit nodes while looking for a
> matching node (so long as we don't look past a matching node just because
> it's implicit). And I think it's fine (and probably good) to expose an easy
> way to explicitly control whether we automatically look through implicit
> nodes or not. But I think if no-one is prepared to do the work to make our
> default mode satisfy the above property while still looking through
> implicit nodes if (and only if) they don't match, then I think we should
> change the default back to the state we had before. (I don't have much of
> an opinion on whether to keep or remove 'traverse' and its various modes,
> other than that we've already caused a substantial amount of churn with the
> changes thus far, and removing them again would cause further churn.)
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200622/04d2432d/attachment-0001.html>


More information about the cfe-dev mailing list