[cfe-dev] [RFC] Dynamic AST Matcher library

Thu Apr 18 10:48:41 PDT 2013

I'd say that setting up a CL would be a next step; I'd propose to put it
into ASTMatchers, perhaps starting with a minimal implementation (I know
that you already have a more complete set, but it would probably be easier
to review if we can split that up somehow).

Cheers,
/Manuel

On Wed, Apr 17, 2013 at 10:33 PM, Samuel Benzaquen <sbenza at google.com>wrote:

> Pinging this thread. There is still interest on this on our side.
> I'll edit below with some comments.
> Thanks, Sam
>
> On Wed, Dec 5, 2012 at 12:09 PM, Samuel Benzaquen <sbenza at google.com>wrote:
>
>> Hello all,
>>
>> We have been working on a layer on top of the AST Matcher library to
>> support creating arbitrary matchers at runtime.
>> I have a proof of concept working on the tooling branch and wanted to
>> send a proposal to make it a real thing.
>> The proposal doc is on google docs at:
>>
>> https://docs.google.com/document/d/1H3queuLjEYWdgpHzo7Mj3ReBXJdpSI8vcH9Y2p_4Y-8/edit
>>
>> Inlined copy below for those that prefer it this way. Feel free to
>> comment on either.
>> _Sam
>> * Design: Dynamic AST MatchersThis document contains a design proposal
>> for a dynamic AST matcher layer that supports creating arbitrary matchers
>> at runtime, which will allow for generic tools to be developed.
>> ContextThe existing AST matcher framework provides a very powerful and
>> compact way to traverse the AST and find specific nodes by providing
>> complex predicates.
>> However, this framework requires that all predicates to be used at
>> runtime are fully defined at compile time. The creation of these predicates
>> can be a big challenge in some cases, and this limitation forces a
>> ‘recompile’ step on this creative loop.
>>
>> Also, without being able to specify matchers at runtime we can’t support
>> more generic search use cases. For example, an editor-based search tool
>> that supports an AST matcher predicates.
>> Goals
>>
>>    - Runtime parsing of matcher expressions, including literals and
>>    predicates. Feedback for syntax/semantic errors in the expressions
>>
>> *
>>
>
> The parser is just a tool to glue together the UI with the matching
> system. It is not a goal on itself.
> The simplified syntax (something in the line of S-expressions) keeps the
> parser simple. Also, it allows for other features to be built on top of the
> matcher generation, like:
>  - Inject Id() nodes automatically to extract more info about the matches.
> A web interface could use this information to color code the pieces, or add
> tooltips.
>  - Extract known submatchers that can facilitate indexing. Eg. any
> hasName() submatcher could be found and used to drop compilation units
> right away if they don't contain anything that has that name.
>
> *
>>
>>    - Create matchers at runtime by name, with dynamically created
>>    arguments if required by the matcher
>>    - Query command line tool that runs the matcher on a specific
>>    compilation unit and reports all matches
>>
>> *
>>
>
> This command line tool is both an example of how to use the system and a
> nice to have utility.
>
> Another tool that could be created on top, but is not part of this first
> proposal, would be a server that can load/parse/index/cache code and can
> respond to Matcher queries.
>
>> *
>>
>>
>> Non-goals
>>
>>    - Dynamic refactoring tools. They might be implemented using dynamic
>>    matchers, but it is out of the scope for this library
>>    - Full C++ expression support for predicates. Values are going to be
>>    limited to literals (eg. 1, “foo”). Expressions like 1+2 will not be
>>    parsed. Some useful constants (like INT_MAX) might be available
>>
>> *
>>
>
> As explained above, having a full C++ syntax would difficult the usage of
> the matcher expressions. Also, it might require JIT'n of the expression to
> actually support any C++ expression, which would make it a security issue.
>
>> *Code location The main output of the project will be a library to be
>> used by AST matcher related clang-tools, and should live close to the
>> already existing AST matcher library, but as a separate library
>> Matcher implementationOne aspect that simplifies the implementation of
>> dynamic matchers is to have a uniform generic signature for creating them.
>> Otherwise, whatever code that creates them will require explicit knowledge
>> of each matcher, or at least of each signature.
>> The generic signature is of the form VariantType (*)(const
>> std::vector<VariantType>& args). A variant type instance can contain
>> numbers, booleans, strings and Matchers, as well as error related
>> information for return values.
>> Instead of reimplementing all matchers to have a generic signature, we
>> implement marshaller functions that provide the generic signature on top of
>> the regular static one. The marshallers verify that the argument count and
>> types are correct and calls the underlying constructor function.
>> RegistryAt initialization of the framework, all known matchers are added
>> into a registry. This registry provides a way to create matchers at runtime
>> by name.
>> This registry is the only component that would require maintenance when
>> the ASTMatcher library is updated. Any new matcher, and new overloads for
>> existing matchers, would have to be added to the registry. In most cases, a
>> single line of code per matcher does the job.
>> Matcher expression parsingA simple matcher expression parser, combined
>> with the matcher registry, allows clients to go from a user provided string
>> to a Matcher<T>. The output of the parser is a variant type also, which
>> allows it to return Matchers as well as syntax and semantic errors.
>> The parser interface provides a way for the client to inject and/or
>> modify the matcher tree as it is being created. For example, a client could
>> insert Id() matcher around specific nodes without having to do it in the
>> string form.*
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130418/549b19f5/attachment.html>