[cfe-commits] [PATCH] Change counter-intuitive behavior of isDerivedFrom

Sean Silva silvas at purdue.edu
Sun Sep 9 12:09:07 PDT 2012


This is getting confusing.

CXXRecordDecl::isDerivedFrom() is non-strict
<http://clang.llvm.org/doxygen/CXXInheritance_8cpp_source.html#l00089>.
Given that Clang's naming is intended to follow the standard as
closely as possible, maybe those methods will end up being renamed as
a result of this discussion too...

--Sean Silva

On Sat, Sep 8, 2012 at 1:44 PM, Daniel Jasper <djasper at google.com> wrote:
>
>
>
> On Sat, Sep 8, 2012 at 11:09 AM, Chandler Carruth <chandlerc at google.com>
> wrote:
>>
>> On Sat, Sep 8, 2012 at 1:37 AM, Daniel Jasper <djasper at google.com> wrote:
>>>
>>> On Fri, Sep 7, 2012 at 10:18 PM, Manuel Klimek <klimek at google.com> wrote:
>>>>
>>>> On Fri, Sep 7, 2012 at 9:55 PM, Chandler Carruth <chandlerc at google.com>
>>>> wrote:
>>>> > On Fri, Sep 7, 2012 at 8:06 AM, Daniel Jasper
>>>> > <reviews at llvm-reviews.chandlerc.com> wrote:
>>>> >>
>>>> >> Hi klimek,
>>>> >>
>>>> >> Change the behavior of the isDerivedFrom-matcher to not match on the
>>>> >> class
>>>> >> itself. This caused some confusion (intuitively, a class is not
>>>> >> derived from
>>>> >> itself) and makes it hard to write certain matchers, e.g. "match and
>>>> >> bind
>>>> >> any pair of base and subclass".
>>>> >>
>>>> >> The original behavior can be achieved with a new isA-matcher.
>>>> >> Similar to
>>>> >> all other matchers, this matcher has the same behavior and name as
>>>> >> the
>>>> >> corresponding AST-entity - in this case the isa<>() function.
>>>> >
>>>> >
>>>> > This is not a new idea. In fact, when we first built the isDerivedFrom
>>>> > matcher over a year ago Manuel and others argued for what you propose.
>>>> > I
>>>> > argued firmly against it, and my feelings have not changed. Manuel
>>>> > should
>>>> > have remembered that.
>>>>
>>>> I have remembered that; I also think the situation has changed - the
>>>> argument that isA would conflict with other names seems not to hold in
>>>> the light of the matchers reflecting functions that you would use with
>>>> other matchers - there is a duplication of domain naming inherent in
>>>> what the matchers are nowadays.
>>
>>
>> I can't really parse this sentence, but I don't think I agree with where
>> you're going here...
>>
>> But mostly I'm confused by this: the discussion we had involved more than
>> the name "isA". It also involved the semantics themselves. So no naming
>> change alone is sufficient.
>>
>>>
>>>
>>> Additionally, it is not easy to confuse the two, I think. First of, isA
>>> lives in clang::ast_matchers. So you would either have to work in that
>>> namespace or explicitly import it with using directives.
>>
>>
>> Which I expect roughly everyone to do. In fact, even if they don't the
>> namespaces won't be necessary in most cases except for the
>> inner-most-matcher.
>>
>>>
>>> Why is the similarity in name (with one letter being different in case)
>>> desirable for all other AST matchers, but not for isA? Is there an inherent
>>> difference between isA <-> isa and expr <-> Expr?
>>
>>
>> Yes, a massive one.
>>
>> The 'isa' template used throughout LLVM and Clang's implementation is an
>> implementation detail. It part of a more efficient RTTI system built around
>> the presupposition of total knowledge of all members of the type hierarchy.
>> See include/llvm/Support/Casting.h for the details here. It has absolutely
>> nothing to do with the AST, Clang, or the C++ code being parsed. It most
>> certainly has no relationship with establishing 'isa' relationships between
>> AST nodes of the parsed program, it is used to establish 'isa' relationships
>> between Clang's internal type system.
>>
>> On the other hand, the mirroring between the matchers and the AST nodes
>> are quite different. The matchers are functions, and the AST nodes are
>> universally types. The AST node type names would never be written in the
>> same context as a function realistically, and in fact the AST node names
>> themselves occur relatively infrequently. Most importantly, you *want* the
>> cognitive overloading between the two because they represent the same
>> construct. The matcher form matches the node form. There is no 'isa' node,
>> and so having an 'isA' matcher just makes no sense at all.
>>
>> I want to be clear: I view the discussion of whether "isA" is a good name
>> for a matcher as entirely orthogonal to the rest. I don't think that it is a
>> good name for any matcher under any circumstances, regardless of the
>> semantic model we end up with...
>>
>> Now, on to the semantic debate.
>>
>>>
>>>
>>>>
>>>> > See Sebastien Redl's comment for why I firmly believe that the current
>>>> > behavior is the correct behavior: std::is_base_of behaves the exact
>>>> > same
>>>> > way.
>>>
>>>
>>> This I did not know. I still don't think isDerivedFrom matching the class
>>> itself is intuitive at all, but this sheds some doubt on it. I firmly
>>> believe that neither behavior can be called "correct" as there is no
>>> definition of what correct is. There is std::is_base_of, but there is also
>>> what the standard says (e.g. "A class B is a base class of a class D if it
>>> is a direct base class of D or a direct base class of one of D’s base
>>> classes") and what people (not essentially working on a C++ compiler)
>>> intuitively think.
>>
>>
>> I think the exported interface of the standard library, which is clearly
>> meant to be consumed by a "normal" C++ programmer, not by standards experts
>> or compiler engineers exclusively, is the most likely to work long term.
>> These type traits are things that the type of C++ programmer we expect to
>> use matchers and tools will be intimately familiar with: strong C++
>> programmers working with large and/or complex libraries. Frankly, the Boost
>> people, or their counterparts here at Google.
>>
>> Another perspective is this: I would like to ensure that all of the C++11
>> standardized type trait predicates (like std::is_base_of, among others) have
>> analogous matchers because these type traits have already evolved over a
>> long period of people wanting to know these particular properties of C++
>> entities. They seems like perfect fodder for predicate matchers, in the same
>> way that the AST is the perfect fodder for node matchers.
>>
>> Now, that would argue for us eventually having an "isBaseOf" matcher,
>> which I certainly hope behaves exactly the same way std::is_base_of behaves.
>> To do anything else would be hopelessly confusing I think... And the
>> existence of such a production makes me *really* want isDerviedFrom to be
>> precisely symmetrical to isBaseOf. Anything else, I again think leads to
>> more confusion than the initial confusion of "oh, wow, isDerivedFrom matches
>> more than I expected". The latter is a confusion users will likely hit once
>> and never again because the consistency of the rule will make it easy to
>> remember. On the other hand, the confusion that results from divergent
>> semantics, between standardized predicates and matchers, or between two sets
>> of matchers, seems more likely to be an ongoing problem.
>>
>>>
>>>>
>>>> > I don't see why "math and bind any pair of base and subclass" is that
>>>> > hard
>>>> > to write. It only requires a not matcher that tests for type equality.
>>>> > Can
>>>> > you give an example that is hard to write, and maybe there is a better
>>>> > way
>>>> > to write it?
>>>>
>>>> You cannot test for type equality unless you know the type of what
>>>> you're looking for.
>>>
>>>
>>> To give more concrete examples. I would like to match any parent->direct
>>> base relationship.
>>
>>
>> The first time around, we also came up with piles of examples. Most of the
>> obvious uses for isDerivedFrom we came up with actually wanted the
>> same-class-works behavior. When we wrote the GWS refactoring, we found
>> multiple places within the code where we implemented the exact logic of
>> isDerived From. I'm not saying your example is bad or invalid, I like it a
>> lot. I just don't want to neglect the use cases which actually wanted the
>> old behavior. I think *both* tasks are valid.
>>
>>>
>>> With the new matcher, I could do:
>>>
>>>   recordDecl(isDerivedFrom(recordDecl.bind("base"))).bind("derived")
>>>
>>> With the old behavior, I can't implement this as part of the matcher
>>> language. And I think there is a more general case here. Basically, you
>>> can't match base classes if the derived class might also have the matched
>>> property.
>>
>>
>> Yes, this is the core problem, and it does indeed lead inevitably to this
>> conclusion:
>>
>>>
>>> #1 just means we need the two matchers, and I think nobody is against
>>> that.
>>
>>
>> Correct, we need two matchers. None of what I have said argues against
>> supporting the use case you have in mind.
>>
>>>
>>> #2 basically concerns the names of the matchers. I am open to discussion,
>>> but I quite strongly feel that this is the most intuitive interface.
>>
>>
>> I think there is ample evidence that at least *some* people do not find
>> this to be intuitive. Lets assume for a moment (and discuss above if there
>> is serious disagreement) that "isA" is not a viable name for either of these
>> matchers.
>>
>> What seems more *consistent* even when it is not intuitive would be:
>>
>> isDerivedFrom -- works the way std::is_base_of does, includes self.
>> isStrictlyDerivedFrom -- does not include self. uses the word "strictly"
>> in the same sense that set arithmetic uses it: isDerivedFrom is a superset
>> test, and isStrictlyDerivedFrom is a strict superset test.
>>
>> This seems consistent with the standard library, and any mirroring
>> matchers we ever get. It seems to use nice distinct names. It seems to be
>> consistent and reasonably easy to explain. It may be surprising the first
>> time you read the description of the matcher, but is this likely to cause
>> long-term confusion?
>>
>> Also, maybe there are still better names that we should consider. It'd be
>> worth getting further input from the community there...
>
>
> Ok, so there are two arguments:
> 1) We don't want to reuse "isa".
> 2) The "intuitive" behavior of "isDerivedFrom" is debatable.
>
> As for #1, I don't see it as such a big problem, as I don't see users
> confusing them. However, you are much more experienced within the clang
> sources and I am happy to accept your judgement. However, we first need to
> solve #2, because that decides whether we need to find an alternative name
> for isA or for the new isDerivedFrom.
>
> As for #2, I think the old behavior of isDerivedFrom is not intuitive from
> the natural language point of view and we sort of try to make
> matcher-constructs form natural sentences. However, I see the argument that
> we should follow the established behavior of is_base_of. On the other hand:
> If you go back to one of the examples where you used isDerivedFrom with its
> old behavior, does the matcher really form an intuitively understandable
> sentence? I tried to find good examples, but couldn't. I think in those
> cases you actually don't care about class inheritance, you just want to find
> all classes that "are"/"can act as" a certain class.
>
> So, we need to decide #2 and only after that we need to argue about names. I
> can see both sides of the argument and am torn. I still lean a bit towards
> the without-itself-behavior.
>
> Regarding your naming suggestions:
> I still think this naming will give us some headaches. Lets assume we will
> one day need more matchers (experience suggests we will ;-)). Lets assume we
> need a matcher to only match the direct base class. Both natural language
> and the terms used in the standard suggest that this should be called
> "isDirectlyDerivedFrom" or "isDirectBaseOf". Then we would have
> isDirectlyDerivedFrom, isStrictlyDerivedFrom and isDerivedFrom. They have a
> valid superset relationship (isDirectlyDerivedFrom < isStrictlyDerivedFrom <
> isDerivedFrom), but e.g. isDirectlyDerivedFrom and isStrictlyDerivedFrom
> would be easily confused. I actually think isDerivedFromButNot might be an
> ugly but less confusing alternative.
>
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>




More information about the cfe-commits mailing list