[cfe-dev] RFC: AST Matcher Sub-expression references implementation straw man

Thu Apr 18 06:12:03 PDT 2013

On Thu, Apr 18, 2013 at 3:00 PM, Vane, Edwin <edwin.vane at intel.com> wrote:

>  If we don’t have to worry about unresolved sub-expression references
> then the implementation of a sub-ex ref matcher is pretty straight-forward.
> However, the ‘look up node in the BoundNodesTree’ statement is where all
> the trickiness is. Using existing machinery to do this could be
> computationally expensive. I’m going to do a bit of profiling to check this
> out. I’ve seen some mention of profiling stubs in the code already. Are
> there any tips or best known methods I should be aware of?
>

Not really - once you have something, I'll patch it in and test it over our
100MLOC code base and see how bad it is ;)
Apart from that, just use a sufficiently large TU that takes multiple
seconds to parse and run over that... Declarations are where most of the
parsing time goes anyway :P...

> ****
>
> ** **
>
> *From:* Manuel Klimek [mailto:klimek at google.com]
> *Sent:* Wednesday, April 17, 2013 3:43 PM
> *To:* Vane, Edwin
> *Cc:* Daniel Jasper; Alexander Kornienko; Clang Dev List (
> cfe-dev at cs.uiuc.edu)
>
> *Subject:* Re: RFC: AST Matcher Sub-expression references implementation
> straw man****
>
> ** **
>
> On Wed, Apr 17, 2013 at 7:47 PM, Vane, Edwin <edwin.vane at intel.com> wrote:
> ****
>
>   ****
>
>  ****
>
> *From:* Manuel Klimek [mailto:klimek at google.com]
> *Sent:* Wednesday, April 17, 2013 3:45 AM
> *To:* Vane, Edwin; Daniel Jasper; Alexander Kornienko
> *Cc:* Clang Dev List (cfe-dev at cs.uiuc.edu)
> *Subject:* Re: RFC: AST Matcher Sub-expression references implementation
> straw man****
>
>  ****
>
> On Tue, Apr 16, 2013 at 8:47 PM, Manuel Klimek <klimek at google.com> wrote:*
> ***
>
>  +djasper, alexfh****
>
>  ****
>
> On Tue, Apr 16, 2013 at 7:15 PM, Vane, Edwin <edwin.vane at intel.com> wrote:
> ****
>
> Hi all,
>
> I've been looking at implementing sub-expression references in AST
> Matchers. I'd like to propose an implementation and get feedback,
> especially if I've overlooked something.
>
> ======
> Quick motivation first:
> The goal of sub-expression references is to write matchers that can refer
> to nodes tagged by the matcher and use these nodes in more tests. Consider
> the following simple example that wants to find variable declarations whose
> type exactly matches the initializer type (for simplicity, lets ignore all
> the implicit cast stuff that will exist around the initializer):
>
> varDecl(hasType(qualType(...).bind("declType")),
>                hasInitializer(hasType(sameAs("declType")))****
>
>   ****
>
> Minor point: I'd guess we'll need to specify the type here explicitly,
> like sameAs<QualType>("declType"). Also, I'm not very happy with the color
> of the bike shed called "sameAs" yet :)****
>
>  ****
>
> *[Edwin] *I had thought the type could be inferred from which matcher the
> sub-ex ref is passed to à la PolymorphicMatchWithParam*. Ambiguities would
> be solved the same way we currently have to resolve ambiguities in the
> polymorphic matchers; by explicitly inserting a dyncast matcher:****
>
> hasType(qualType(sameAs(“blah”)))****
>
>  ** **
>
> You're right...  ****
>
> ** **
>
>    I have no attachments to the name. I’m open to suggestions.****
>
>  ** **
>
> Daniel usually has good naming ideas :)****
>
>  ****
>
>     ****
>
>    )
>
> There's potential for other sub-expression reference matchers beyond the
> "sameAs()" matcher proposed by this example. Name comparisons for example.
> ======
>
> Proposed changes to implement sub-expression references:
> - In a sub-ex matcher, look-up in the provided BoundNodesTreeBuilder for
> the given id.
>   - If the id exists perform matcher logic, return truth value.
>   - If an id doesn't exist, record an unresolved sub-expression reference
> in the Builder and return true.
>     - Info recorded: id we're looking for and some sort of matcher
> "future": a callback or functor for executing matcher logic at a later time
> along with all necessary data matcher needs.
> - When control returns to the top level finder logic
> (MatchASTVisitor::match()) we tend to unresolved sub-ex refs. For each
> recorded unresolved sub-ex ref, do a search through the BoundNodesTree for
> a matching id. If one is found, execute matcher future. If the future
> returns false, the match fails and the Callback that would normally get
> called will not be. If no id is found, match also fails. If the matcher
> future returns true, proceed to calling Callback for the match. The
> handling of unresolved sub-ex refs could be handled by modifying
> MatchVisitor.****
>
>      ****
>
> I'm currently opposed to unresolved subexpression matcher handling (but
> happy to be convinced otherwise).****
>
> Currently, we have:****
>
> a) a defined order of execution****
>
> b) a well-defined cut-off for logical expressions****
>
>  ****
>
> That is, if you say anyOf(a(), b()), b() will not be matched if a()
> matches. Now a() could contain a sub-expression matcher referring to
> something that might be bound in b(); how would you propose to solve this?
> ****
>
>  ****
>
> And on a more principled note:****
>
> Which things can we solve with unresolved subexpression matchers that
> cannot also be solved by re-arranging the matcher so that the bind() to the
> id must always happen before the sameAs match, otherwise sameAs will not
> match?****
>
>  ****
>
> *[Edwin] The unresolved sub-ex refs were meant to handle what I thought
> was an un-defined order of execution actually. But now that I think about
> it, the un-defined part is just the order of *creation* of matchers, not
> the calling of their match() functions. I think you’re right that we could
> just re-arrange a matcher so the tagging happens before the ref. Given
> that, I don’t think unresolved sub-ex refs are necessary. I don’t think
> it’s that big of onus on users of AST Matchers to organize their matchers
> this way.*****
>
>  ****
>
>    Outstanding questions:
> - Where should unresolved sub-ex refs be stored? Top-most level of the
> BoundNodesTree or just in the level at which the ref happens. Does it
> matter? Sub-ex refs should be able to refer to nodes anywhere else in the
> matcher and not have specific scope. Even if refs are stored at the current
> level, they should be collected and handled all at once when matching is
> complete.
>
> --
> Edwin Vane
>   Software Developer
>   Intel of Canada, Inc.****
>
>   ****
>
>   ****
>
>  ** **
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130418/53f5abab/attachment.html>