<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Sep 8, 2012 at 11:09 AM, Chandler Carruth <span dir="ltr"><<a href="mailto:chandlerc@google.com" target="_blank" class="cremed">chandlerc@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote"><div class="im">On Sat, Sep 8, 2012 at 1:37 AM, Daniel Jasper <span dir="ltr"><<a href="mailto:djasper@google.com" target="_blank" class="cremed">djasper@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote"><div>On Fri, Sep 7, 2012 at 10:18 PM, Manuel Klimek <span dir="ltr"><<a href="mailto:klimek@google.com" target="_blank" class="cremed">klimek@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Fri, Sep 7, 2012 at 9:55 PM, Chandler Carruth <<a href="mailto:chandlerc@google.com" target="_blank" class="cremed">chandlerc@google.com</a>> wrote:<br>
> On Fri, Sep 7, 2012 at 8:06 AM, Daniel Jasper<br>
> <<a href="mailto:reviews@llvm-reviews.chandlerc.com" target="_blank" class="cremed">reviews@llvm-reviews.chandlerc.com</a>> wrote:<br>
>><br>
>> Hi klimek,<br>
>><br>
>> Change the behavior of the isDerivedFrom-matcher to not match on the class<br>
>> itself. This caused some confusion (intuitively, a class is not derived from<br>
>> itself) and makes it hard to write certain matchers, e.g. "match and bind<br>
>> any pair of base and subclass".<br>
>><br>
>> The original behavior can be achieved with a new isA-matcher. Similar to<br>
>> all other matchers, this matcher has the same behavior and name as the<br>
>> corresponding AST-entity - in this case the isa<>() function.<br>
><br>
><br>
> This is not a new idea. In fact, when we first built the isDerivedFrom<br>
> matcher over a year ago Manuel and others argued for what you propose. I<br>
> argued firmly against it, and my feelings have not changed. Manuel should<br>
> have remembered that.<br>
<br>
</div>I have remembered that; I also think the situation has changed - the<br>
argument that isA would conflict with other names seems not to hold in<br>
the light of the matchers reflecting functions that you would use with<br>
other matchers - there is a duplication of domain naming inherent in<br>
what the matchers are nowadays.<br></blockquote></div></div></div></blockquote><div><br></div></div><div>I can't really parse this sentence, but I don't think I agree with where you're going here...</div><div>
<br></div>
<div>But mostly I'm confused by this: the discussion we had involved more than the name "isA". It also involved the semantics themselves. So no naming change alone is sufficient.<br></div><div class="im"><div>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_extra"><div class="gmail_quote"><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"></blockquote><div><br></div></div><div>Additionally, it is not easy to confuse the two, I think. First of, isA lives in clang::ast_matchers. So you would either have to work in that namespace or explicitly import it with using directives.</div>
</div></div></blockquote><div><br></div></div><div>Which I expect roughly everyone to do. In fact, even if they don't the namespaces won't be necessary in most cases except for the inner-most-matcher.</div><div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_extra"><div class="gmail_quote"><div>Why is the similarity in name (with one letter being different in case) desirable for all other AST matchers, but not for isA? Is there an inherent difference between isA <-> isa and expr <-> Expr?</div>
</div></div></blockquote><div><br></div></div><div>Yes, a massive one.</div><div><br></div><div>The 'isa' template used throughout LLVM and Clang's implementation is an implementation detail. It part of a more efficient RTTI system built around the presupposition of total knowledge of all members of the type hierarchy. See include/llvm/Support/Casting.h for the details here. It has absolutely nothing to do with the AST, Clang, or the C++ code being parsed. It most certainly has no relationship with establishing 'isa' relationships between AST nodes of the parsed program, it is used to establish 'isa' relationships between Clang's internal type system.</div>
<div><br>On the other hand, the mirroring between the matchers and the AST nodes are quite different. The matchers are functions, and the AST nodes are universally types. The AST node type names would never be written in the same context as a function realistically, and in fact the AST node names themselves occur relatively infrequently. Most importantly, you *want* the cognitive overloading between the two because they represent the same construct. The matcher form matches the node form. There is no 'isa' node, and so having an 'isA' matcher just makes no sense at all.</div>
<div><br></div><div>I want to be clear: I view the discussion of whether "isA" is a good name for a matcher as entirely orthogonal to the rest. I don't think that it is a good name for any matcher under any circumstances, regardless of the semantic model we end up with...</div>
<div><br></div><div>Now, on to the semantic debate.</div><div class="im"><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote">
<div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
> See Sebastien Redl's comment for why I firmly believe that the current<br>
> behavior is the correct behavior: std::is_base_of behaves the exact same<br>
> way.<br></div></blockquote><div><br></div></div><div>This I did not know. I still don't think isDerivedFrom matching the class itself is intuitive at all, but this sheds some doubt on it. I firmly believe that neither behavior can be called "correct" as there is no definition of what correct is. There is std::is_base_of, but there is also what the standard says (e.g. "A class B is a base class of a class D if it is a direct base class of D or a direct base class of one of D’s base classes") and what people (not essentially working on a C++ compiler) intuitively think.</div>
</div></div></blockquote><div><br></div></div><div>I think the exported interface of the standard library, which is clearly meant to be consumed by a "normal" C++ programmer, not by standards experts or compiler engineers exclusively, is the most likely to work long term. These type traits are things that the type of C++ programmer we expect to use matchers and tools will be intimately familiar with: strong C++ programmers working with large and/or complex libraries. Frankly, the Boost people, or their counterparts here at Google.</div>
<div><br></div><div>Another perspective is this: I would like to ensure that all of the C++11 standardized type trait predicates (like std::is_base_of, among others) have analogous matchers because these type traits have already evolved over a long period of people wanting to know these particular properties of C++ entities. They seems like perfect fodder for predicate matchers, in the same way that the AST is the perfect fodder for node matchers.</div>
<div><br></div><div>Now, that would argue for us eventually having an "isBaseOf" matcher, which I certainly hope behaves exactly the same way std::is_base_of behaves. To do anything else would be hopelessly confusing I think... And the existence of such a production makes me *really* want isDerviedFrom to be precisely symmetrical to isBaseOf. Anything else, I again think leads to more confusion than the initial confusion of "oh, wow, isDerivedFrom matches more than I expected". The latter is a confusion users will likely hit once and never again because the consistency of the rule will make it easy to remember. On the other hand, the confusion that results from divergent semantics, between standardized predicates and matchers, or between two sets of matchers, seems more likely to be an ongoing problem.</div>
<div class="im">
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote"><div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
> I don't see why "math and bind any pair of base and subclass" is that hard<br>
> to write. It only requires a not matcher that tests for type equality. Can<br>
> you give an example that is hard to write, and maybe there is a better way<br>
> to write it?<br>
<br>
</div>You cannot test for type equality unless you know the type of what<br>
you're looking for.<br></blockquote><div><br></div></div><div>To give more concrete examples. I would like to match any parent->direct base relationship.</div></div></div></blockquote><div><br></div></div><div>The first time around, we also came up with piles of examples. Most of the obvious uses for isDerivedFrom we came up with actually wanted the same-class-works behavior. When we wrote the GWS refactoring, we found multiple places within the code where we implemented the exact logic of isDerived From. I'm not saying your example is bad or invalid, I like it a lot. I just don't want to neglect the use cases which actually wanted the old behavior. I think *both* tasks are valid.</div>
<div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote"><div>With the new matcher, I could do:</div><div><br></div>
<div> recordDecl(isDerivedFrom(recordDecl.bind("base"))).bind("derived")</div>
<div><br></div><div>With the old behavior, I can't implement this as part of the matcher language. And I think there is a more general case here. Basically, you can't match base classes if the derived class might also have the matched property.</div>
</div></div></blockquote><div><br></div></div><div>Yes, this is the core problem, and it does indeed lead inevitably to this conclusion:</div><div class="im"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_extra"><div class="gmail_quote"><div>#1 just means we need the two matchers, and I think nobody is against that.<br></div></div></div></blockquote><div><br></div></div><div>Correct, we need two matchers. None of what I have said argues against supporting the use case you have in mind.</div>
<div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote"><div></div>
<div>#2 basically concerns the names of the matchers. I am open to discussion, but I quite strongly feel that this is the most intuitive interface.</div></div></div></blockquote><div><br></div></div><div>I think there is ample evidence that at least *some* people do not find this to be intuitive. Lets assume for a moment (and discuss above if there is serious disagreement) that "isA" is not a viable name for either of these matchers.</div>
<div><br></div><div>What seems more *consistent* even when it is not intuitive would be:</div><div><br></div><div>isDerivedFrom -- works the way std::is_base_of does, includes self.</div><div>isStrictlyDerivedFrom -- does not include self. uses the word "strictly" in the same sense that set arithmetic uses it: isDerivedFrom is a superset test, and isStrictlyDerivedFrom is a strict superset test.</div>
<div><br></div><div>This seems consistent with the standard library, and any mirroring matchers we ever get. It seems to use nice distinct names. It seems to be consistent and reasonably easy to explain. It may be surprising the first time you read the description of the matcher, but is this likely to cause long-term confusion?</div>
<div><br></div><div>Also, maybe there are still better names that we should consider. It'd be worth getting further input from the community there...</div></div></div>
</blockquote></div><br></div><div class="gmail_extra">Ok, so there are two arguments:</div><div class="gmail_extra">1) We don't want to reuse "isa".</div><div class="gmail_extra">2) The "intuitive" behavior of "isDerivedFrom" is debatable.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">As for #1, I don't see it as such a big problem, as I don't see users confusing them. However, you are much more experienced within the clang sources and I am happy to accept your judgement. However, we first need to solve #2, because that decides whether we need to find an alternative name for isA or for the new isDerivedFrom.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">As for #2, I think the old behavior of isDerivedFrom is not intuitive from the natural language point of view and we sort of try to make matcher-constructs form natural sentences. However, I see the argument that we should follow the established behavior of is_base_of. On the other hand: If you go back to one of the examples where you used isDerivedFrom with its old behavior, does the matcher really form an intuitively understandable sentence? I tried to find good examples, but couldn't. I think in those cases you actually don't care about class inheritance, you just want to find all classes that "are"/"can act as" a certain class.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">So, we need to decide #2 and only after that we need to argue about names. I can see both sides of the argument and am torn. I still lean a bit towards the without-itself-behavior. </div>
<div class="gmail_extra"><br></div><div class="gmail_extra">Regarding your naming suggestions:</div><div class="gmail_extra">I still think this naming will give us some headaches. Lets assume we will one day need more matchers (experience suggests we will ;-)). Lets assume we need a matcher to only match the direct base class. Both natural language and the terms used in the standard suggest that this should be called "isDirectlyDerivedFrom" or "isDirectBaseOf". Then we would have isDirectlyDerivedFrom, isStrictlyDerivedFrom and isDerivedFrom. They have a valid superset relationship (isDirectlyDerivedFrom < isStrictlyDerivedFrom < isDerivedFrom), but e.g. isDirectlyDerivedFrom and isStrictlyDerivedFrom would be easily confused. I actually think isDerivedFromButNot might be an ugly but less confusing alternative.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra"><br></div>