[cfe-dev] RFC: Easier AST Matching by Default

Fri Jan 17 03:41:54 PST 2020

On 13/01/2020 16:39, Dmitri Gribenko wrote:
> On Fri, Jan 10, 2020 at 4:34 PM Aaron Ballman via cfe-dev 
> <cfe-dev at lists.llvm.org> wrote:
>
>     I've not seen much to suggest the community is not in favor of this
>     direction, so I think you're all set to post patches for review. If
>     there are community concerns, we can address them during the review
>     process. Thank you for working on this!
>
>
> Sorry, I just came back from vacation. I'm not sure if it is a good 
> idea, in particular, for the following reasons:
>
> - The document specifies the goals in terms of a solution: "Goal: 
> Users should not have to account for implicit AST nodes when writing 
> AST Matchers".

The goal is to make AST Matchers more accessible and easier to use for 
newcomers and non-experts in the clang AST by no longer confronting them 
with 100% of the complexity on first use.

I think it's important to bridge that gap. See my EuroLLVM talk for more.

> - The complexity for implementers (although it also applies to 
> https://reviews.llvm.org/D61837, which landed a while ago, and I 
> didn't see it). Allowing to run the same matcher in multiple modes 
> increases our support and testing matrix.

I don't think it increases the test matrix more than any other feature. 
Can you expand on this?

>
> - The complexity cliff for users. Many non-trivial matchers need to 
> explicitly manage implicit AST nodes.

You just described the complexity cliff which is (after my change) not 
the first thing a newcomer hits.

It's good to move that cliff back so that the newcomer is not confronted 
with it immediately.
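
To make the "implicit node" problem concrete, here is a minimal toy 
sketch. It is not Clang's API: Node, Traversal, skipImplicit, 
hasChildOfKind, makeNode and buildReturnOfInt are all hypothetical names 
made up for illustration. It models `double f() { return 1; }`, where 
the AST places an ImplicitCastExpr between the ReturnStmt and the 
IntegerLiteral.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an AST node; not a Clang class.
struct Node {
  std::string kind;
  bool implicit = false;  // true for nodes the compiler added
  std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical traversal modes, loosely mirroring the two modes discussed.
enum class Traversal { AsIs, IgnoreImplicit };

// Step past implicit wrapper nodes that have exactly one child.
const Node *skipImplicit(const Node *n) {
  while (n->implicit && n->children.size() == 1)
    n = n->children.front().get();
  return n;
}

// "Does `parent` have a direct child of kind `kind`?" under the given mode.
bool hasChildOfKind(const Node &parent, const std::string &kind,
                    Traversal mode) {
  for (const auto &child : parent.children) {
    const Node *n = child.get();
    if (mode == Traversal::IgnoreImplicit)
      n = skipImplicit(n);
    if (n->kind == kind)
      return true;
  }
  return false;
}

std::unique_ptr<Node> makeNode(std::string kind, bool implicit = false) {
  auto n = std::make_unique<Node>();
  n->kind = std::move(kind);
  n->implicit = implicit;
  return n;
}

// ReturnStmt -> ImplicitCastExpr (implicit) -> IntegerLiteral
std::unique_ptr<Node> buildReturnOfInt() {
  auto ret = makeNode("ReturnStmt");
  auto cast = makeNode("ImplicitCastExpr", /*implicit=*/true);
  cast->children.push_back(makeNode("IntegerLiteral"));
  ret->children.push_back(std::move(cast));
  return ret;
}
```

In this model, asking whether the ReturnStmt has an IntegerLiteral child 
fails in AsIs mode, because the implicit cast sits in between, and 
succeeds in IgnoreImplicit mode. That is the kind of surprise a newcomer 
hits on a first matcher.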

>
> - The cost of ClangTidy. ClangTidy runs all matchers simultaneously in 
> one AST traversal. If we implement implicit/no-implicit modes as 
> separate AST traversals, ClangTidy would need to run two traversals, 
> one for "easier" matchers (skipping implicit code), and one for 
> "expert" matchers (traversing implicit code). We would also need to 
> build two parent maps.

Will Syntax Trees ever be used in clang-tidy? Or will their use be 
forbidden in clang-tidy for the same reason, namely that mixing them 
with existing matchers adds to the runtime cost?

If your concern is valid, it would seem to apply to Syntax Trees to a 
greater extent than it applies to this patch.

That said, I think there is some scope for optimization in my approach. 
The optimizations I have in mind (being more efficient with parent maps) 
wouldn't apply to Syntax Trees, though, as that is a completely separate 
system.
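
For readers unfamiliar with the parent-map concern, here is a toy 
sketch of why the two modes yield different parent maps for the same 
tree. It is not Clang's implementation and not necessarily the 
optimization referred to above: Node, ParentMap, makeNode and 
buildParentMaps are hypothetical names, and filling both maps in one 
walk is only an assumption about what is possible in principle.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an AST node; not a Clang class.
struct Node {
  std::string kind;
  bool implicit = false;  // true for nodes the compiler added
  std::vector<std::unique_ptr<Node>> children;
};

using ParentMap = std::map<const Node *, const Node *>;

std::unique_ptr<Node> makeNode(std::string kind, bool implicit = false) {
  auto n = std::make_unique<Node>();
  n->kind = std::move(kind);
  n->implicit = implicit;
  return n;
}

// One recursive walk that fills both maps.
//   asIs:             every node maps to its real parent.
//   ignoringImplicit: every non-implicit node maps to its nearest
//                     non-implicit ancestor; implicit nodes are omitted.
void buildParentMaps(const Node &n, const Node *realParent,
                     const Node *visibleParent, ParentMap &asIs,
                     ParentMap &ignoringImplicit) {
  if (realParent)
    asIs[&n] = realParent;
  if (!n.implicit && visibleParent)
    ignoringImplicit[&n] = visibleParent;
  // Children of an implicit node inherit its visible ancestor.
  const Node *nextVisible = n.implicit ? visibleParent : &n;
  for (const auto &child : n.children)
    buildParentMaps(*child, &n, nextVisible, asIs, ignoringImplicit);
}
```

For a ReturnStmt -> ImplicitCastExpr -> IntegerLiteral chain, the first 
map records the cast as the literal's parent, while the second records 
the ReturnStmt and has no entry for the cast at all.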

>
> My best suggestion is to investigate implementing AST Matchers for 
> syntax trees, and allow jumping between syntax and semantic nodes in 
> one matcher -- to match syntax first, and then validate the necessary 
> semantic constraints.

My previous email quoted the Syntax Trees RFC saying that the first step 
is to "use [the current ASTMatchers API] to find interesting Clang AST 
nodes". You seem to be suggesting that the opposite would be done. Maybe 
things have evolved since that RFC, or maybe both approaches should work 
equally well.

As long as the current AST Matchers APIs exist, they should be improved. 
Perhaps they will be less important in the future when we have Syntax 
Trees for these use-cases instead. However, for now, I don't think the 
ease-of-use improvement in my patches should be blocked on the basis of 
a future alternative system.

Thanks,

Stephen.
