[cfe-commits] [PATCH] Change counter-intuitive behavior of isDerivedFrom

Daniel Jasper djasper at google.com
Mon Sep 10 00:25:50 PDT 2012


If I read it right, Sean actually meant that the currently implemented
CXXRecordDecl::isDerivedFrom() is strict (does not include the base
itself). This, I think, is a VERY compelling argument to make the
ASTMatcher that matches on a CXXRecordDecl have the same behavior.

As for standard conformity: The standard does not really say anything about
this. It contains two aspects:
- Chapter 10 reads like a class is not derived from itself, e.g. "A class B
is a base class of a class D if it is a direct base class of D or a direct
base class of one of D’s base classes".
- It defines is_base_of<Base, Derived> as "Base is a base class of Derived
... or Base and Derived are ... the same class type ...". This can be
counted to either side of the argument. On the one hand, it clearly
separates the case that a class actually is not a base of itself. On the
other hand, it sets a precedent for a function is_base_of implementing the
non-strict behavior.




On Sun, Sep 9, 2012 at 9:09 PM, Sean Silva <silvas at purdue.edu> wrote:

> This is getting confusing.
>
> CXXRecordDecl::isDerivedFrom() is non-strict
> <http://clang.llvm.org/doxygen/CXXInheritance_8cpp_source.html#l00089>.
> Given that Clang's naming is intended to follow the standard as
> closely as possible, maybe those methods will end up being renamed as
> a result of this discussion too...
>
> --Sean Silva
>
> On Sat, Sep 8, 2012 at 1:44 PM, Daniel Jasper <djasper at google.com> wrote:
> >
> >
> >
> > On Sat, Sep 8, 2012 at 11:09 AM, Chandler Carruth <chandlerc at google.com>
> > wrote:
> >>
> >> On Sat, Sep 8, 2012 at 1:37 AM, Daniel Jasper <djasper at google.com>
> wrote:
> >>>
> >>> On Fri, Sep 7, 2012 at 10:18 PM, Manuel Klimek <klimek at google.com>
> wrote:
> >>>>
> >>>> On Fri, Sep 7, 2012 at 9:55 PM, Chandler Carruth <
> chandlerc at google.com>
> >>>> wrote:
> >>>> > On Fri, Sep 7, 2012 at 8:06 AM, Daniel Jasper
> >>>> > <reviews at llvm-reviews.chandlerc.com> wrote:
> >>>> >>
> >>>> >> Hi klimek,
> >>>> >>
> >>>> >> Change the behavior of the isDerivedFrom-matcher to not match on
> the
> >>>> >> class
> >>>> >> itself. This caused some confusion (intuitively, a class is not
> >>>> >> derived from
> >>>> >> itself) and makes it hard to write certain matchers, e.g. "match
> and
> >>>> >> bind
> >>>> >> any pair of base and subclass".
> >>>> >>
> >>>> >> The original behavior can be achieved with a new isA-matcher.
> >>>> >> Similar to
> >>>> >> all other matchers, this matcher has the same behavior and name as
> >>>> >> the
> >>>> >> corresponding AST-entity - in this case the isa<>() function.
> >>>> >
> >>>> >
> >>>> > This is not a new idea. In fact, when we first built the
> isDerivedFrom
> >>>> > matcher over a year ago Manuel and others argued for what you
> propose.
> >>>> > I
> >>>> > argued firmly against it, and my feelings have not changed. Manuel
> >>>> > should
> >>>> > have remembered that.
> >>>>
> >>>> I have remembered that; I also think the situation has changed - the
> >>>> argument that isA would conflict with other names seems not to hold in
> >>>> the light of the matchers reflecting functions that you would use with
> >>>> other matchers - there is a duplication of domain naming inherent in
> >>>> what the matchers are nowadays.
> >>
> >>
> >> I can't really parse this sentence, but I don't think I agree with where
> >> you're going here...
> >>
> >> But mostly I'm confused by this: the discussion we had involved more
> than
> >> the name "isA". It also involved the semantics themselves. So no naming
> >> change alone is sufficient.
> >>
> >>>
> >>>
> >>> Additionally, it is not easy to confuse the two, I think. First of, isA
> >>> lives in clang::ast_matchers. So you would either have to work in that
> >>> namespace or explicitly import it with using directives.
> >>
> >>
> >> Which I expect roughly everyone to do. In fact, even if they don't the
> >> namespaces won't be necessary in most cases except for the
> >> inner-most-matcher.
> >>
> >>>
> >>> Why is the similarity in name (with one letter being different in case)
> >>> desirable for all other AST matchers, but not for isA? Is there an
> inherent
> >>> difference between isA <-> isa and expr <-> Expr?
> >>
> >>
> >> Yes, a massive one.
> >>
> >> The 'isa' template used throughout LLVM and Clang's implementation is an
> >> implementation detail. It part of a more efficient RTTI system built
> around
> >> the presupposition of total knowledge of all members of the type
> hierarchy.
> >> See include/llvm/Support/Casting.h for the details here. It has
> absolutely
> >> nothing to do with the AST, Clang, or the C++ code being parsed. It most
> >> certainly has no relationship with establishing 'isa' relationships
> between
> >> AST nodes of the parsed program, it is used to establish 'isa'
> relationships
> >> between Clang's internal type system.
> >>
> >> On the other hand, the mirroring between the matchers and the AST nodes
> >> are quite different. The matchers are functions, and the AST nodes are
> >> universally types. The AST node type names would never be written in the
> >> same context as a function realistically, and in fact the AST node names
> >> themselves occur relatively infrequently. Most importantly, you *want*
> the
> >> cognitive overloading between the two because they represent the same
> >> construct. The matcher form matches the node form. There is no 'isa'
> node,
> >> and so having an 'isA' matcher just makes no sense at all.
> >>
> >> I want to be clear: I view the discussion of whether "isA" is a good
> name
> >> for a matcher as entirely orthogonal to the rest. I don't think that it
> is a
> >> good name for any matcher under any circumstances, regardless of the
> >> semantic model we end up with...
> >>
> >> Now, on to the semantic debate.
> >>
> >>>
> >>>
> >>>>
> >>>> > See Sebastien Redl's comment for why I firmly believe that the
> current
> >>>> > behavior is the correct behavior: std::is_base_of behaves the exact
> >>>> > same
> >>>> > way.
> >>>
> >>>
> >>> This I did not know. I still don't think isDerivedFrom matching the
> class
> >>> itself is intuitive at all, but this sheds some doubt on it. I firmly
> >>> believe that neither behavior can be called "correct" as there is no
> >>> definition of what correct is. There is std::is_base_of, but there is
> also
> >>> what the standard says (e.g. "A class B is a base class of a class D
> if it
> >>> is a direct base class of D or a direct base class of one of D’s base
> >>> classes") and what people (not essentially working on a C++ compiler)
> >>> intuitively think.
> >>
> >>
> >> I think the exported interface of the standard library, which is clearly
> >> meant to be consumed by a "normal" C++ programmer, not by standards
> experts
> >> or compiler engineers exclusively, is the most likely to work long term.
> >> These type traits are things that the type of C++ programmer we expect
> to
> >> use matchers and tools will be intimately familiar with: strong C++
> >> programmers working with large and/or complex libraries. Frankly, the
> Boost
> >> people, or their counterparts here at Google.
> >>
> >> Another perspective is this: I would like to ensure that all of the
> C++11
> >> standardized type trait predicates (like std::is_base_of, among others)
> have
> >> analogous matchers because these type traits have already evolved over a
> >> long period of people wanting to know these particular properties of C++
> >> entities. They seems like perfect fodder for predicate matchers, in the
> same
> >> way that the AST is the perfect fodder for node matchers.
> >>
> >> Now, that would argue for us eventually having an "isBaseOf" matcher,
> >> which I certainly hope behaves exactly the same way std::is_base_of
> behaves.
> >> To do anything else would be hopelessly confusing I think... And the
> >> existence of such a production makes me *really* want isDerviedFrom to
> be
> >> precisely symmetrical to isBaseOf. Anything else, I again think leads to
> >> more confusion than the initial confusion of "oh, wow, isDerivedFrom
> matches
> >> more than I expected". The latter is a confusion users will likely hit
> once
> >> and never again because the consistency of the rule will make it easy to
> >> remember. On the other hand, the confusion that results from divergent
> >> semantics, between standardized predicates and matchers, or between two
> sets
> >> of matchers, seems more likely to be an ongoing problem.
> >>
> >>>
> >>>>
> >>>> > I don't see why "math and bind any pair of base and subclass" is
> that
> >>>> > hard
> >>>> > to write. It only requires a not matcher that tests for type
> equality.
> >>>> > Can
> >>>> > you give an example that is hard to write, and maybe there is a
> better
> >>>> > way
> >>>> > to write it?
> >>>>
> >>>> You cannot test for type equality unless you know the type of what
> >>>> you're looking for.
> >>>
> >>>
> >>> To give more concrete examples. I would like to match any
> parent->direct
> >>> base relationship.
> >>
> >>
> >> The first time around, we also came up with piles of examples. Most of
> the
> >> obvious uses for isDerivedFrom we came up with actually wanted the
> >> same-class-works behavior. When we wrote the GWS refactoring, we found
> >> multiple places within the code where we implemented the exact logic of
> >> isDerived From. I'm not saying your example is bad or invalid, I like
> it a
> >> lot. I just don't want to neglect the use cases which actually wanted
> the
> >> old behavior. I think *both* tasks are valid.
> >>
> >>>
> >>> With the new matcher, I could do:
> >>>
> >>>   recordDecl(isDerivedFrom(recordDecl.bind("base"))).bind("derived")
> >>>
> >>> With the old behavior, I can't implement this as part of the matcher
> >>> language. And I think there is a more general case here. Basically, you
> >>> can't match base classes if the derived class might also have the
> matched
> >>> property.
> >>
> >>
> >> Yes, this is the core problem, and it does indeed lead inevitably to
> this
> >> conclusion:
> >>
> >>>
> >>> #1 just means we need the two matchers, and I think nobody is against
> >>> that.
> >>
> >>
> >> Correct, we need two matchers. None of what I have said argues against
> >> supporting the use case you have in mind.
> >>
> >>>
> >>> #2 basically concerns the names of the matchers. I am open to
> discussion,
> >>> but I quite strongly feel that this is the most intuitive interface.
> >>
> >>
> >> I think there is ample evidence that at least *some* people do not find
> >> this to be intuitive. Lets assume for a moment (and discuss above if
> there
> >> is serious disagreement) that "isA" is not a viable name for either of
> these
> >> matchers.
> >>
> >> What seems more *consistent* even when it is not intuitive would be:
> >>
> >> isDerivedFrom -- works the way std::is_base_of does, includes self.
> >> isStrictlyDerivedFrom -- does not include self. uses the word "strictly"
> >> in the same sense that set arithmetic uses it: isDerivedFrom is a
> superset
> >> test, and isStrictlyDerivedFrom is a strict superset test.
> >>
> >> This seems consistent with the standard library, and any mirroring
> >> matchers we ever get. It seems to use nice distinct names. It seems to
> be
> >> consistent and reasonably easy to explain. It may be surprising the
> first
> >> time you read the description of the matcher, but is this likely to
> cause
> >> long-term confusion?
> >>
> >> Also, maybe there are still better names that we should consider. It'd
> be
> >> worth getting further input from the community there...
> >
> >
> > Ok, so there are two arguments:
> > 1) We don't want to reuse "isa".
> > 2) The "intuitive" behavior of "isDerivedFrom" is debatable.
> >
> > As for #1, I don't see it as such a big problem, as I don't see users
> > confusing them. However, you are much more experienced within the clang
> > sources and I am happy to accept your judgement. However, we first need
> to
> > solve #2, because that decides whether we need to find an alternative
> name
> > for isA or for the new isDerivedFrom.
> >
> > As for #2, I think the old behavior of isDerivedFrom is not intuitive
> from
> > the natural language point of view and we sort of try to make
> > matcher-constructs form natural sentences. However, I see the argument
> that
> > we should follow the established behavior of is_base_of. On the other
> hand:
> > If you go back to one of the examples where you used isDerivedFrom with
> its
> > old behavior, does the matcher really form an intuitively understandable
> > sentence? I tried to find good examples, but couldn't. I think in those
> > cases you actually don't care about class inheritance, you just want to
> find
> > all classes that "are"/"can act as" a certain class.
> >
> > So, we need to decide #2 and only after that we need to argue about
> names. I
> > can see both sides of the argument and am torn. I still lean a bit
> towards
> > the without-itself-behavior.
> >
> > Regarding your naming suggestions:
> > I still think this naming will give us some headaches. Lets assume we
> will
> > one day need more matchers (experience suggests we will ;-)). Lets
> assume we
> > need a matcher to only match the direct base class. Both natural language
> > and the terms used in the standard suggest that this should be called
> > "isDirectlyDerivedFrom" or "isDirectBaseOf". Then we would have
> > isDirectlyDerivedFrom, isStrictlyDerivedFrom and isDerivedFrom. They
> have a
> > valid superset relationship (isDirectlyDerivedFrom <
> isStrictlyDerivedFrom <
> > isDerivedFrom), but e.g. isDirectlyDerivedFrom and isStrictlyDerivedFrom
> > would be easily confused. I actually think isDerivedFromButNot might be
> an
> > ugly but less confusing alternative.
> >
> >
> >
> > _______________________________________________
> > cfe-commits mailing list
> > cfe-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120910/3e1469d8/attachment.html>


More information about the cfe-commits mailing list