recordDecl() AST matcher

Mon Sep 14 14:47:02 PDT 2015

Btw, I think generating them, potentially into several different headers to
work around the compile time issue isn't such a bad idea.

On Mon, Sep 14, 2015 at 11:45 PM, Manuel Klimek <klimek at google.com> wrote:

> Feel free to rename the AST nodes :)
>
> On Mon, Sep 14, 2015, 2:44 PM Daniel Jasper <djasper at google.com> wrote:
>
>> Ok. I am happy with this then.
>>
>> (Just personally grumpy having to write
>> cxxRecordDecl(has(cxxConstructorDecl(..))) in the future ;-) ).
>>
>> On Mon, Sep 14, 2015 at 11:41 PM, Manuel Klimek <klimek at google.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Sep 14, 2015 at 2:29 PM Aaron Ballman <aaron at aaronballman.com>
>>> wrote:
>>>
>>>> On Mon, Sep 14, 2015 at 4:38 PM, Manuel Klimek <klimek at google.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > On Mon, Sep 14, 2015 at 12:26 PM Aaron Ballman <
>>>> aaron at aaronballman.com>
>>>> > wrote:
>>>> >>
>>>> >> On Mon, Sep 14, 2015 at 2:45 PM, Daniel Jasper <djasper at google.com>
>>>> wrote:
>>>> >> > By this point, I see that change might be profitable overall.
>>>> However,
>>>> >> > lets
>>>> >> > completely map this out. Changing just cxxRecordDecl() can actually
>>>> >> > increase
>>>> >> > confusion in other areas. Right now, not a single AST matcher has
>>>> the
>>>> >> > cxx
>>>> >> > prefix (although a total of 28 stand for the corresponding CXX..
>>>> AST
>>>> >> > node).
>>>> >> > This is consistent and people knowing this will never try to write
>>>> >> > cxxConstructExpr(). As soon as people have used cxxRecordDecl(),
>>>> the
>>>> >> > chance
>>>> >> > of them trying cxxConstructExpr() increases. You have spent a long
>>>> time
>>>> >> > figuring out that recordDecl means cxxRecordDecl(), which is one
>>>> >> > datapoint,
>>>> >> > but I am not aware of anyone else having this specific issue. And
>>>> we
>>>> >> > could
>>>> >> > make this less bad with better documentation, I think.
>>>> >> >
>>>> >> > So, for me, the questions are:
>>>> >> > 1) Do we want/need this change?
>>>> >>
>>>> >> We definitely need *a* change because there currently is no way to
>>>> >> match a C struct or union when compiling in C mode. I discovered this
>>>> >> because I was trying to write a new checker for clang-tidy that
>>>> >> focuses on C code and it would fail to match when compiling in C
>>>> mode.
>>>> >> Whether we decide to go with cxxRecordDecl vs recordDecl vs
>>>> structDecl
>>>> >> (etc) is less important to me than the ability to write clang-tidy
>>>> >> checks for C code.
>>>> >>
>>>> >> > 2) Do we want to be consistent and change the other 27 matchers as
>>>> well?
>>>> >>
>>>> >> I'm on the fence about this for all the reasons you point out.
>>>> >>
>>>> >> > A fundamental question is whether we want AST matchers to match AST
>>>> >> > nodes
>>>> >> > 1:1 or whether they should be an abstraction from some
>>>> implementation
>>>> >> > details of the AST.
>>>> >>
>>>> >> I absolutely agree that this is a fundamental question. I think a
>>>> >> higher priority fundamental question that goes along with it is: are
>>>> >> we okay with breaking a lot of user code (are these meant to be
>>>> stable
>>>> >> APIs like the LLVM C APIs)? If we want these APIs to be stable, that
>>>> >> changes the answer of what kind of mapping we can have.
>>>> >
>>>> >
>>>> > I think the AST matchers are so closely coupled to the AST that it
>>>> trying to
>>>> > be more stable than the AST doesn't help. Basically all uses of AST
>>>> matchers
>>>> > do something with the AST nodes afterwards, which will break anyway.
>>>>
>>>> I can get behind that logic. So we're okay with breaking their code
>>>> because there's no way around it -- it's tied to the AST, so users
>>>> cannot rely on the AST APIs remaining the same from release to release
>>>> anyway.
>>>>
>>>
>>> We might even *want* the code to break, as the use of the AST node might
>>> now be incorrect on a semantic level.
>>>
>>>
>>>>
>>>> >
>>>> >>
>>>> >> > And this is not an easy question to answer. There are
>>>> >> > many places where we don't follow a strict 1:1 mapping. Mostly node
>>>> >> > matchers, but also in traversal matchers, e.g. isDerivedFrom().
>>>> >> >
>>>> >> > Personally, I'd really hate to have the cxx Prefix everywhere, but
>>>> >> > that's
>>>> >> > just my personal opinion. I would even prefer matchers like
>>>> record() and
>>>> >> > method(), but I think somebody convinced me that that would be a
>>>> very
>>>> >> > bad
>>>> >> > idea ;-).
>>>> >>
>>>> >> My personal opinion is that (1) breaking code is fine, but we should
>>>> >> avoid doing it without very clear benefit, and (2) the mapping
>>>> between
>>>> >> AST node identifiers and AST matcher identifiers needs to be
>>>> >> incredibly obvious, but perhaps not slavishly 1:1. If we instead
>>>> >> decide we want a 1:1 mapping, then I think we need to start seriously
>>>> >> considering auto-generating the AST node (and type) matchers from
>>>> >> tablegen so that the AST nodes *cannot* get out of sync with the AST
>>>> >> matchers, otherwise we'll be right back here again in a few years
>>>> when
>>>> >> we modify the name of an AST node.
>>>> >
>>>> >
>>>> > I do think we want to auto-generate the matchers, but I don't think
>>>> tablegen
>>>> > is the right approach (I think an ast-matcher based tool is ;)
>>>> > That said, auto-generating all the matchers is a) a lot of effort and
>>>> b) the
>>>> > code-size and compile time of matchers already matters, so it's
>>>> unclear
>>>> > which ones we would want to generate, especially for traversal
>>>> matchers :(
>>>>
>>>> Oh, that's an excellent point (I'm talking about (b), I already knew
>>>> (a) was a lot of work). Thank you for pointing that out!
>>>>
>>>> >
>>>> >>
>>>> >> My definition of "incredibly obvious" is: if the AST has a prefixed
>>>> >> and unprefixed version, or two different prefixes, we should mimic
>>>> >> that directly with the matchers. Otherwise, existing AST matchers
>>>> >> without prefix shenanigans can remain as they are, and new AST
>>>> >> matchers should prefix as-required. If we decide we're okay breaking
>>>> >> code, then I don't see a problem with changing ctorInitializer() into
>>>> >> cxxCtorInitializer() when C adds constructors. ;-)
>>>> >
>>>> >
>>>> > I think the main things is cost for developers who try to write
>>>> matchers and
>>>> > work from the -ast-dump. Figuring out that there *is* a matcher with
>>>> an
>>>> > unprefixed node can take a while.
>>>>
>>>> Hmm, yes, but "take a while" should be relatively short, I would
>>>> think. In that use-case, the user does an -ast-dump, sees
>>>> "CXXFrobbleGnasher", they go to the AST matcher reference and they
>>>> search for "CXXFrobberGnasher." The first hit won't be
>>>> cxxFrobbleGnasher, but the entry for frobbleGnasher (which is still
>>>> the first hit when searching from the top of the document due to the
>>>> way we position node matchers) will have a parameter of
>>>> CXXFrobbleGnasher, so they will find still get to the right matcher on
>>>> the first hit. If someone doesn't read the documentation at all,
>>>> they're going to try cxxFrobbleGnasher() and get a compile error/no
>>>> known matcher. Then they'll go look at ASTMatchers.h and figure out
>>>> it's called frobbleGnasher by searching there instead of the
>>>> documentation.
>>>>
>>>
>>> The problem is that I've learned that sometimes people try to make
>>> things work in ways that I couldn't even imagine, and they lose more time
>>> than I could ever imagine them using :) Also, I agree the time is probably
>>> on average not that large, but we pay it over a long time in the future,
>>> and it tends to add up.
>>>
>>>
>>>> That's compared to having the matcher name always be the same as the
>>>> AST node, where the user writes cxxFrobbleGnasher and it just works,
>>>> which is definitely a mark in favor of making everything consistent. I
>>>> just don't think the current approach is too onerous in the case where
>>>> the matcher is at least *provided* for the user with a relatively sane
>>>> name.
>>>>
>>>> >> I should be clear, I'm not opposed to just having a 1:1 mapping. I'm
>>>> >> just not certain the benefits of being strict about that outweigh the
>>>> >> costs to broken code. cxxCtorInitializer will break someone's code,
>>>> >> but I don't think it adds any clarity to their code, so I don't see
>>>> >> the benefit of forcing the change.
>>>> >
>>>> > Well, I think there's the cost of broken code *once* now, vs. the
>>>> (smaller)
>>>> > cost for users in all future.
>>>> > I'm still strongly in favor of breaking now, and having a simpler
>>>> model
>>>> > going forward.
>>>>
>>>> I'm definitely in favor of breaking now in the case of RecordDecl vs
>>>> CXXRecordDecl. I think having recordDecl match CXXRecordDecl is a bug
>>>> given that there's no way to match a RecordDecl.
>>>>
>>>> I would also be totally in favor of being consistent if we were
>>>> starting from scratch. I'm very, very weakly opposed to breaking more
>>>> user's code than we have to in order to get usable matchers because it
>>>> seems gratuitous. Breaking code to get something that works seems
>>>> reasonable. Breaking code that already works just to change the name
>>>> for consistency elsewhere, I'm a bit less keen on. But the fact that
>>>> we already can break user's code at-will because of the reliance on
>>>> the AST nodes makes me think it may be the right approach for the best
>>>> API, since that's what I would want if we were starting from scratch.
>>>>
>>>> Okay, I'm convinced. I think we should rename the type and node
>>>> matchers (not traversal and narrowing matchers) to match the AST node
>>>> names in all cases. We can document the breakage in the release notes,
>>>> but (hopefully) only have to do this dance one time instead of
>>>> spreading the pain out as it happens to eventually get to the same
>>>> place anyway.
>>>>
>>>
>>> Yea, people who want more stability do use releases anyway.
>>>
>>>
>>>> Daniel, is this something you would be okay with? (I'm assuming
>>>> Richard finds it acceptable based on previous comments from Manuel,
>>>> but Richard, feel free to chime in.)
>>>>
>>>
>>> Offline conversation with Richard says that he is convinced.
>>>
>>>
>>>>
>>>> ~Aaron
>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20150914/8e898fe5/attachment.html>