<div dir="ltr">LG, ship it.</div><br><div class="gmail_quote"><div dir="ltr">On Wed, Sep 16, 2015 at 2:03 PM Aaron Ballman <<a href="mailto:aaron@aaronballman.com">aaron@aaronballman.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Attached is an updated patch for clang-tools-extra that does not have<br>

my in-progress, not-related-to-any-of-this code in it. ;-)<br>

<br>

~Aaron<br>

<br>

On Wed, Sep 16, 2015 at 4:04 PM, Aaron Ballman <<a href="mailto:aaron@aaronballman.com" target="_blank">aaron@aaronballman.com</a>> wrote:<br>

> Quick ping. I know this is a fairly gigantic patch, but I'm hoping for<br>

> a relatively quick review turnaround because of potential merge<br>

> conflicts with people doing a fair amount of work on clang-tidy<br>

> lately. Everything should be pretty straight-forward (it's all just<br>

> renames, no semantic changes intended aside from<br>

> recordDecl/cxxRecordDecl and the narrowing matchers.<br>

><br>

> ~Aaron<br>

><br>

> On Tue, Sep 15, 2015 at 1:32 PM, Aaron Ballman <<a href="mailto:aaron@aaronballman.com" target="_blank">aaron@aaronballman.com</a>> wrote:<br>

>> Here are the complete patches to solve the issues we've discussed:<br>

>><br>

>> 1) splits recordDecl into recordDecl and cxxRecordDecl<br>

>> 2) adds isStruct, isUnion, isClass to identify what kind of<br>

>> recordDecl() you may be looking at<br>

>> 3) prefixes all of the node matchers with cxx that should have it<br>

>> 4) fixes a similar issue with CUDAKernelCallExpr (the prefix needs to<br>

>> be cuda instead of CUDA to distinguish the matcher name from the type<br>

>> name).<br>

>> 5) updates all of the documentation and code that broke.<br>

>><br>

>> One patch is for changes to clang, the other is for changes to<br>

>> clang-tools-extra.<br>

>><br>

>> ~Aaron<br>

>><br>

>> On Mon, Sep 14, 2015 at 5:49 PM, Aaron Ballman <<a href="mailto:aaron@aaronballman.com" target="_blank">aaron@aaronballman.com</a>> wrote:<br>

>>> On Mon, Sep 14, 2015 at 5:47 PM, Daniel Jasper <<a href="mailto:djasper@google.com" target="_blank">djasper@google.com</a>> wrote:<br>

>>>> Btw, I think generating them, potentially into several different headers to<br>

>>>> work around the compile time issue isn't such a bad idea.<br>

>>><br>

>>> I'm not going to start with this approach, but think it may be worth<br>

>>> exploring at some point. ;-)<br>

>>><br>

>>> ~Aaron<br>

>>><br>

>>>><br>

>>>> On Mon, Sep 14, 2015 at 11:45 PM, Manuel Klimek <<a href="mailto:klimek@google.com" target="_blank">klimek@google.com</a>> wrote:<br>

>>>>><br>

>>>>> Feel free to rename the AST nodes :)<br>

>>>>><br>

>>>>><br>

>>>>> On Mon, Sep 14, 2015, 2:44 PM Daniel Jasper <<a href="mailto:djasper@google.com" target="_blank">djasper@google.com</a>> wrote:<br>

>>>>>><br>

>>>>>> Ok. I am happy with this then.<br>

>>>>>><br>

>>>>>> (Just personally grumpy having to write<br>

>>>>>> cxxRecordDecl(has(cxxConstructorDecl(..))) in the future ;-) ).<br>

>>>>>><br>

>>>>>> On Mon, Sep 14, 2015 at 11:41 PM, Manuel Klimek <<a href="mailto:klimek@google.com" target="_blank">klimek@google.com</a>><br>

>>>>>> wrote:<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> On Mon, Sep 14, 2015 at 2:29 PM Aaron Ballman <<a href="mailto:aaron@aaronballman.com" target="_blank">aaron@aaronballman.com</a>><br>

>>>>>>> wrote:<br>

>>>>>>>><br>

>>>>>>>> On Mon, Sep 14, 2015 at 4:38 PM, Manuel Klimek <<a href="mailto:klimek@google.com" target="_blank">klimek@google.com</a>><br>

>>>>>>>> wrote:<br>

>>>>>>>> ><br>

>>>>>>>> ><br>

>>>>>>>> > On Mon, Sep 14, 2015 at 12:26 PM Aaron Ballman<br>

>>>>>>>> > <<a href="mailto:aaron@aaronballman.com" target="_blank">aaron@aaronballman.com</a>><br>

>>>>>>>> > wrote:<br>

>>>>>>>> >><br>

>>>>>>>> >> On Mon, Sep 14, 2015 at 2:45 PM, Daniel Jasper <<a href="mailto:djasper@google.com" target="_blank">djasper@google.com</a>><br>

>>>>>>>> >> wrote:<br>

>>>>>>>> >> > By this point, I see that change might be profitable overall.<br>

>>>>>>>> >> > However,<br>

>>>>>>>> >> > lets<br>

>>>>>>>> >> > completely map this out. Changing just cxxRecordDecl() can<br>

>>>>>>>> >> > actually<br>

>>>>>>>> >> > increase<br>

>>>>>>>> >> > confusion in other areas. Right now, not a single AST matcher has<br>

>>>>>>>> >> > the<br>

>>>>>>>> >> > cxx<br>

>>>>>>>> >> > prefix (although a total of 28 stand for the corresponding CXX..<br>

>>>>>>>> >> > AST<br>

>>>>>>>> >> > node).<br>

>>>>>>>> >> > This is consistent and people knowing this will never try to write<br>

>>>>>>>> >> > cxxConstructExpr(). As soon as people have used cxxRecordDecl(),<br>

>>>>>>>> >> > the<br>

>>>>>>>> >> > chance<br>

>>>>>>>> >> > of them trying cxxConstructExpr() increases. You have spent a long<br>

>>>>>>>> >> > time<br>

>>>>>>>> >> > figuring out that recordDecl means cxxRecordDecl(), which is one<br>

>>>>>>>> >> > datapoint,<br>

>>>>>>>> >> > but I am not aware of anyone else having this specific issue. And<br>

>>>>>>>> >> > we<br>

>>>>>>>> >> > could<br>

>>>>>>>> >> > make this less bad with better documentation, I think.<br>

>>>>>>>> >> ><br>

>>>>>>>> >> > So, for me, the questions are:<br>

>>>>>>>> >> > 1) Do we want/need this change?<br>

>>>>>>>> >><br>

>>>>>>>> >> We definitely need *a* change because there currently is no way to<br>

>>>>>>>> >> match a C struct or union when compiling in C mode. I discovered<br>

>>>>>>>> >> this<br>

>>>>>>>> >> because I was trying to write a new checker for clang-tidy that<br>

>>>>>>>> >> focuses on C code and it would fail to match when compiling in C<br>

>>>>>>>> >> mode.<br>

>>>>>>>> >> Whether we decide to go with cxxRecordDecl vs recordDecl vs<br>

>>>>>>>> >> structDecl<br>

>>>>>>>> >> (etc) is less important to me than the ability to write clang-tidy<br>

>>>>>>>> >> checks for C code.<br>

>>>>>>>> >><br>

>>>>>>>> >> > 2) Do we want to be consistent and change the other 27 matchers as<br>

>>>>>>>> >> > well?<br>

>>>>>>>> >><br>

>>>>>>>> >> I'm on the fence about this for all the reasons you point out.<br>

>>>>>>>> >><br>

>>>>>>>> >> > A fundamental question is whether we want AST matchers to match<br>

>>>>>>>> >> > AST<br>

>>>>>>>> >> > nodes<br>

>>>>>>>> >> > 1:1 or whether they should be an abstraction from some<br>

>>>>>>>> >> > implementation<br>

>>>>>>>> >> > details of the AST.<br>

>>>>>>>> >><br>

>>>>>>>> >> I absolutely agree that this is a fundamental question. I think a<br>

>>>>>>>> >> higher priority fundamental question that goes along with it is: are<br>

>>>>>>>> >> we okay with breaking a lot of user code (are these meant to be<br>

>>>>>>>> >> stable<br>

>>>>>>>> >> APIs like the LLVM C APIs)? If we want these APIs to be stable, that<br>

>>>>>>>> >> changes the answer of what kind of mapping we can have.<br>

>>>>>>>> ><br>

>>>>>>>> ><br>

>>>>>>>> > I think the AST matchers are so closely coupled to the AST that it<br>

>>>>>>>> > trying to<br>

>>>>>>>> > be more stable than the AST doesn't help. Basically all uses of AST<br>

>>>>>>>> > matchers<br>

>>>>>>>> > do something with the AST nodes afterwards, which will break anyway.<br>

>>>>>>>><br>

>>>>>>>> I can get behind that logic. So we're okay with breaking their code<br>

>>>>>>>> because there's no way around it -- it's tied to the AST, so users<br>

>>>>>>>> cannot rely on the AST APIs remaining the same from release to release<br>

>>>>>>>> anyway.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> We might even *want* the code to break, as the use of the AST node might<br>

>>>>>>> now be incorrect on a semantic level.<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> ><br>

>>>>>>>> >><br>

>>>>>>>> >> > And this is not an easy question to answer. There are<br>

>>>>>>>> >> > many places where we don't follow a strict 1:1 mapping. Mostly<br>

>>>>>>>> >> > node<br>

>>>>>>>> >> > matchers, but also in traversal matchers, e.g. isDerivedFrom().<br>

>>>>>>>> >> ><br>

>>>>>>>> >> > Personally, I'd really hate to have the cxx Prefix everywhere, but<br>

>>>>>>>> >> > that's<br>

>>>>>>>> >> > just my personal opinion. I would even prefer matchers like<br>

>>>>>>>> >> > record() and<br>

>>>>>>>> >> > method(), but I think somebody convinced me that that would be a<br>

>>>>>>>> >> > very<br>

>>>>>>>> >> > bad<br>

>>>>>>>> >> > idea ;-).<br>

>>>>>>>> >><br>

>>>>>>>> >> My personal opinion is that (1) breaking code is fine, but we should<br>

>>>>>>>> >> avoid doing it without very clear benefit, and (2) the mapping<br>

>>>>>>>> >> between<br>

>>>>>>>> >> AST node identifiers and AST matcher identifiers needs to be<br>

>>>>>>>> >> incredibly obvious, but perhaps not slavishly 1:1. If we instead<br>

>>>>>>>> >> decide we want a 1:1 mapping, then I think we need to start<br>

>>>>>>>> >> seriously<br>

>>>>>>>> >> considering auto-generating the AST node (and type) matchers from<br>

>>>>>>>> >> tablegen so that the AST nodes *cannot* get out of sync with the AST<br>

>>>>>>>> >> matchers, otherwise we'll be right back here again in a few years<br>

>>>>>>>> >> when<br>

>>>>>>>> >> we modify the name of an AST node.<br>

>>>>>>>> ><br>

>>>>>>>> ><br>

>>>>>>>> > I do think we want to auto-generate the matchers, but I don't think<br>

>>>>>>>> > tablegen<br>

>>>>>>>> > is the right approach (I think an ast-matcher based tool is ;)<br>

>>>>>>>> > That said, auto-generating all the matchers is a) a lot of effort and<br>

>>>>>>>> > b) the<br>

>>>>>>>> > code-size and compile time of matchers already matters, so it's<br>

>>>>>>>> > unclear<br>

>>>>>>>> > which ones we would want to generate, especially for traversal<br>

>>>>>>>> > matchers :(<br>

>>>>>>>><br>

>>>>>>>> Oh, that's an excellent point (I'm talking about (b), I already knew<br>

>>>>>>>> (a) was a lot of work). Thank you for pointing that out!<br>

>>>>>>>><br>

>>>>>>>> ><br>

>>>>>>>> >><br>

>>>>>>>> >> My definition of "incredibly obvious" is: if the AST has a prefixed<br>

>>>>>>>> >> and unprefixed version, or two different prefixes, we should mimic<br>

>>>>>>>> >> that directly with the matchers. Otherwise, existing AST matchers<br>

>>>>>>>> >> without prefix shenanigans can remain as they are, and new AST<br>

>>>>>>>> >> matchers should prefix as-required. If we decide we're okay breaking<br>

>>>>>>>> >> code, then I don't see a problem with changing ctorInitializer()<br>

>>>>>>>> >> into<br>

>>>>>>>> >> cxxCtorInitializer() when C adds constructors. ;-)<br>

>>>>>>>> ><br>

>>>>>>>> ><br>

>>>>>>>> > I think the main things is cost for developers who try to write<br>

>>>>>>>> > matchers and<br>

>>>>>>>> > work from the -ast-dump. Figuring out that there *is* a matcher with<br>

>>>>>>>> > an<br>

>>>>>>>> > unprefixed node can take a while.<br>

>>>>>>>><br>

>>>>>>>> Hmm, yes, but "take a while" should be relatively short, I would<br>

>>>>>>>> think. In that use-case, the user does an -ast-dump, sees<br>

>>>>>>>> "CXXFrobbleGnasher", they go to the AST matcher reference and they<br>

>>>>>>>> search for "CXXFrobberGnasher." The first hit won't be<br>

>>>>>>>> cxxFrobbleGnasher, but the entry for frobbleGnasher (which is still<br>

>>>>>>>> the first hit when searching from the top of the document due to the<br>

>>>>>>>> way we position node matchers) will have a parameter of<br>

>>>>>>>> CXXFrobbleGnasher, so they will find still get to the right matcher on<br>

>>>>>>>> the first hit. If someone doesn't read the documentation at all,<br>

>>>>>>>> they're going to try cxxFrobbleGnasher() and get a compile error/no<br>

>>>>>>>> known matcher. Then they'll go look at ASTMatchers.h and figure out<br>

>>>>>>>> it's called frobbleGnasher by searching there instead of the<br>

>>>>>>>> documentation.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> The problem is that I've learned that sometimes people try to make<br>

>>>>>>> things work in ways that I couldn't even imagine, and they lose more time<br>

>>>>>>> than I could ever imagine them using :) Also, I agree the time is probably<br>

>>>>>>> on average not that large, but we pay it over a long time in the future, and<br>

>>>>>>> it tends to add up.<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>> That's compared to having the matcher name always be the same as the<br>

>>>>>>>> AST node, where the user writes cxxFrobbleGnasher and it just works,<br>

>>>>>>>> which is definitely a mark in favor of making everything consistent. I<br>

>>>>>>>> just don't think the current approach is too onerous in the case where<br>

>>>>>>>> the matcher is at least *provided* for the user with a relatively sane<br>

>>>>>>>> name.<br>

>>>>>>>><br>

>>>>>>>> >> I should be clear, I'm not opposed to just having a 1:1 mapping. I'm<br>

>>>>>>>> >> just not certain the benefits of being strict about that outweigh<br>

>>>>>>>> >> the<br>

>>>>>>>> >> costs to broken code. cxxCtorInitializer will break someone's code,<br>

>>>>>>>> >> but I don't think it adds any clarity to their code, so I don't see<br>

>>>>>>>> >> the benefit of forcing the change.<br>

>>>>>>>> ><br>

>>>>>>>> > Well, I think there's the cost of broken code *once* now, vs. the<br>

>>>>>>>> > (smaller)<br>

>>>>>>>> > cost for users in all future.<br>

>>>>>>>> > I'm still strongly in favor of breaking now, and having a simpler<br>

>>>>>>>> > model<br>

>>>>>>>> > going forward.<br>

>>>>>>>><br>

>>>>>>>> I'm definitely in favor of breaking now in the case of RecordDecl vs<br>

>>>>>>>> CXXRecordDecl. I think having recordDecl match CXXRecordDecl is a bug<br>

>>>>>>>> given that there's no way to match a RecordDecl.<br>

>>>>>>>><br>

>>>>>>>> I would also be totally in favor of being consistent if we were<br>

>>>>>>>> starting from scratch. I'm very, very weakly opposed to breaking more<br>

>>>>>>>> user's code than we have to in order to get usable matchers because it<br>

>>>>>>>> seems gratuitous. Breaking code to get something that works seems<br>

>>>>>>>> reasonable. Breaking code that already works just to change the name<br>

>>>>>>>> for consistency elsewhere, I'm a bit less keen on. But the fact that<br>

>>>>>>>> we already can break user's code at-will because of the reliance on<br>

>>>>>>>> the AST nodes makes me think it may be the right approach for the best<br>

>>>>>>>> API, since that's what I would want if we were starting from scratch.<br>

>>>>>>>><br>

>>>>>>>> Okay, I'm convinced. I think we should rename the type and node<br>

>>>>>>>> matchers (not traversal and narrowing matchers) to match the AST node<br>

>>>>>>>> names in all cases. We can document the breakage in the release notes,<br>

>>>>>>>> but (hopefully) only have to do this dance one time instead of<br>

>>>>>>>> spreading the pain out as it happens to eventually get to the same<br>

>>>>>>>> place anyway.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> Yea, people who want more stability do use releases anyway.<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Daniel, is this something you would be okay with? (I'm assuming<br>

>>>>>>>> Richard finds it acceptable based on previous comments from Manuel,<br>

>>>>>>>> but Richard, feel free to chime in.)<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> Offline conversation with Richard says that he is convinced.<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> ~Aaron<br>

>>>>>><br>

>>>>>><br>

>>>><br>

</blockquote></div>