[cfe-dev] Canonical representation of declaration names

Doug Gregor doug.gregor at gmail.com
Thu Nov 13 10:00:29 PST 2008


Currently in Clang, we have several different ways of representing
the name of a declaration:
  - Most of NamedDecl's descendants use a simple IdentifierInfo*
  - ObjCMethodDecl (which is *not* a NamedDecl) uses a Selector, which
is an optimized representation of Objective-C selectors that is a
masked IdentifierInfo* in common cases (0 or 1 argument selectors) and
a private MultiKeywordSelector* in other cases (>= 2 arguments).
  - C++ constructors, destructors, and conversion functions have a
dummy IdentifierInfo* (with identifiers like "<constructor>" and
"<conversion>"), then override NamedDecl::getName() to provide
human-readable names for these entities.

At Chris's suggestion, the attached patch introduces a new
class---DeclarationName---that serves as a name for a declaration
within the AST. It can represent normal identifiers, C++ special
names, and Objective-C selectors.

Where is DeclarationName Used?
========================
DeclarationName is the type of the name stored in NamedDecl. Most
clients won't see much of a difference here, because an
IdentifierInfo* is implicitly convertible to a DeclarationName, and
those *Decl nodes that can only have normal identifiers for names will
still accept an IdentifierInfo*.
*Decl nodes with special names---C++ constructors, destructors, and
conversion functions---will take in a DeclarationName directly, which
will encode the kind of function and additional information needed to
identify the name (e.g., the type of a conversion function). In
addition, ObjCMethodDecl is now a NamedDecl, and its selector will be
stored in NamedDecl's DeclarationName. Conversion functions between
Selector and DeclarationName make this change mainly an implementation
detail.

Note that NamedDecl now has a few more ways to access the name. We can
still ask for an IdentifierInfo* (which will be NULL when the name is
not a normal identifier). However, getName() now returns a std::string
(which may be built on-the-fly) and there is a new getDeclName() to get
the underlying DeclarationName, for clients that might need it.
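
To make the three accessors concrete, here is a rough sketch of how they
could relate to each other. It is not the code in the patch: the name
class below is deliberately not compact, getIdentifier() and
getAsString() are names assumed for illustration, and the spellings
produced for the C++ special names are guesses at something readable.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in; the real IdentifierInfo carries much more.
struct IdentifierInfo { std::string Name; };

// Simplified, non-compact name used only to illustrate the accessors.
class DeclarationName {
public:
  enum NameKind { Identifier, CXXConstructor, CXXDestructor,
                  CXXConversionFunction };

  // Implicit on purpose: callers can keep passing an IdentifierInfo*.
  DeclarationName(const IdentifierInfo *Id)
      : Kind(Identifier), II(Id) {}
  DeclarationName(NameKind K, std::string Ty)
      : Kind(K), II(nullptr), TypeName(Ty) {}

  NameKind getKind() const { return Kind; }

  // Null unless this is a normal identifier.
  const IdentifierInfo *getAsIdentifierInfo() const {
    return Kind == Identifier ? II : nullptr;
  }

  // Builds the human-readable spelling on the fly (assumed spellings).
  std::string getAsString() const {
    switch (Kind) {
    case Identifier:            return II->Name;
    case CXXConstructor:        return TypeName;
    case CXXDestructor:         return "~" + TypeName;
    case CXXConversionFunction: return "operator " + TypeName;
    }
    return std::string();
  }

private:
  NameKind Kind;
  const IdentifierInfo *II;
  std::string TypeName;
};

class NamedDecl {
  DeclarationName Name;

public:
  NamedDecl(DeclarationName N) : Name(N) {}

  const IdentifierInfo *getIdentifier() const {
    return Name.getAsIdentifierInfo();
  }
  std::string getName() const { return Name.getAsString(); }
  DeclarationName getDeclName() const { return Name; }
};
```

A client that only deals with ordinary identifiers can keep constructing
a NamedDecl from an IdentifierInfo*, while a client that cares about
special names switches on getDeclName().getKind().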

Efficiency Concerns
==============
DeclarationName is the size of a single pointer. The lower two bits
are used to determine what kind of name this is. We optimize for three
common cases: an IdentifierInfo*, a 0-argument selector, and a
1-argument selector, and in each case the pointer is just a masked
IdentifierInfo*.

In the less-common case, the pointer is a masked
DeclarationNameExtra*, where the DeclarationNameExtra can either be a
multi-argument Objective-C selector (MultiKeywordSelector) or one of
the C++ special names (constructor, destructor, conversion function).
All of these cases need extra storage anyway.
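
The encoding described above can be sketched roughly as follows. This is
an illustration under assumptions, not the patch itself: the particular
tag values, the StoredKind enumerators, and the helper names are made up
here, and the stand-in structs only exist so the example compiles on its
own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real classes.
struct IdentifierInfo { const char *Name; };
struct DeclarationNameExtra {
  enum ExtraKind { CXXConstructor, CXXDestructor, CXXConversionFunction,
                   MultiKeywordSelector };
  ExtraKind Kind;
};

// The two low bits are only free if the pointees are 4-byte aligned.
static_assert(alignof(IdentifierInfo) >= 4 &&
              alignof(DeclarationNameExtra) >= 4,
              "need two spare low bits in each pointer");

class DeclarationName {
public:
  // Assumed tag assignment; the real values may differ.
  enum StoredKind : std::uintptr_t {
    StoredIdentifier = 0,
    StoredZeroArgSelector = 1,
    StoredOneArgSelector = 2,
    StoredExtra = 3
  };

  DeclarationName() : Ptr(0) {}
  DeclarationName(const IdentifierInfo *II)
      : Ptr(pack(II, StoredIdentifier)) {}
  DeclarationName(const DeclarationNameExtra *Extra)
      : Ptr(pack(Extra, StoredExtra)) {}

  static DeclarationName getZeroArgSelector(const IdentifierInfo *II) {
    DeclarationName N;
    N.Ptr = pack(II, StoredZeroArgSelector);
    return N;
  }
  static DeclarationName getOneArgSelector(const IdentifierInfo *II) {
    DeclarationName N;
    N.Ptr = pack(II, StoredOneArgSelector);
    return N;
  }

  StoredKind getStoredKind() const {
    return static_cast<StoredKind>(Ptr & TagMask);
  }

  // Non-null only for plain identifiers.
  const IdentifierInfo *getAsIdentifierInfo() const {
    if (getStoredKind() != StoredIdentifier)
      return nullptr;
    return reinterpret_cast<const IdentifierInfo *>(Ptr & ~TagMask);
  }

  // Non-null only for the less common, extra-storage names.
  const DeclarationNameExtra *getAsExtra() const {
    if (getStoredKind() != StoredExtra)
      return nullptr;
    return reinterpret_cast<const DeclarationNameExtra *>(Ptr & ~TagMask);
  }

  // Equality is a single integer comparison.
  friend bool operator==(DeclarationName A, DeclarationName B) {
    return A.Ptr == B.Ptr;
  }

private:
  static constexpr std::uintptr_t TagMask = 3;
  std::uintptr_t Ptr;

  static std::uintptr_t pack(const void *P, StoredKind K) {
    std::uintptr_t Bits = reinterpret_cast<std::uintptr_t>(P);
    assert((Bits & TagMask) == 0 && "pointer is not 4-byte aligned");
    return Bits | K;
  }
};

static_assert(sizeof(DeclarationName) == sizeof(void *),
              "a name is exactly one pointer wide");
```

With this layout, the identifier "foo" and the 0-argument selector "foo"
share the same IdentifierInfo but still compare unequal, because the tag
bits differ.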

The encoded integer in DeclarationName is a unique value which can be
efficiently compared for equality. In the case of Objective-C
selectors and the C++ special functions, the DeclarationNameExtra
pointer is uniqued in a separate table (SelectorTable and
DeclarationNameTable, respectively).
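
As a rough illustration of the uniquing idea (not the actual
DeclarationNameTable, whose interface may look quite different), a table
for the C++ special names could hand out one object per (kind, type)
pair, so that two names are equal exactly when their pointers are equal.
SpecialKind, SpecialName, SpecialNameTable, and Type are hypothetical
names used only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

enum class SpecialKind { Constructor, Destructor, ConversionFunction };

// Hypothetical stand-in for a canonical type; only its address matters.
struct Type {};

// The extra storage for one C++ special name.
struct SpecialName {
  SpecialKind Kind;
  const Type *Ty;
};

class SpecialNameTable {
  typedef std::pair<SpecialKind, std::uintptr_t> Key;
  std::map<Key, std::unique_ptr<SpecialName>> Names;

public:
  // Returns the unique SpecialName for (Kind, Ty), creating it on first use.
  const SpecialName *get(SpecialKind Kind, const Type *Ty) {
    Key K(Kind, reinterpret_cast<std::uintptr_t>(Ty));
    std::unique_ptr<SpecialName> &Slot = Names[K];
    if (!Slot)
      Slot.reset(new SpecialName{Kind, Ty});
    return Slot.get();
  }
};
```

Because the table owns the only copy for each pair, the pointer it
returns can be stored (masked) in a name and compared without looking at
the kind or type again.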

Overall, DeclarationName should be as efficient as an IdentifierInfo*
in both space and time, and as efficient as a Selector for Objective-C
methods. DeclarationName is better than the current hack involving
creating special identifiers for constructors, destructors, and
conversion functions, and will make future work in the area of
conversion functions easier (see below).

Known Issues
==========
There's a small layering violation in DeclarationName. DeclarationName
lives in the AST, because that's where it makes sense to talk about
the name of a declaration as a more abstract entity. However,
DeclarationNameExtra and Selector live in Basic, because Selector is
used in the Parse-Sema interaction and Selector's internal
implementation depends on DeclarationNameExtra.

We could move DeclarationName into Basic and make the QualType stored
by a C++ constructor/destructor/conversion function into an opaque
type pointer, but that seems wrong somehow. Maybe the real issue is
that Selector doesn't belong in Basic.

Future Work
=========
Right now, IdentifierResolver still traffics in IdentifierInfo
pointers, and declarations with non-identifier names are never pushed
into a scope. That may change with conversion functions, in which case
the IdentifierResolver will need to work with DeclarationNames. This
should only require a small tweak, so that we have a FETokenInfo field
that is accessible through a DeclarationName. It would, of course, not
be stored in DeclarationName itself but in its IdentifierInfo*
(FETokenInfo is already there) or its DeclarationNameExtra* (we would
need to add it there).
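
The tweak might look something like the sketch below. Everything in it
is an assumption about work that has not been done: the
getFETokenInfo()/setFETokenInfo() accessors on the name and the
FETokenInfo field in the extra storage are hypothetical, and for brevity
the name holds two plain pointers rather than the compact one-pointer
encoding.

```cpp
#include <cassert>

// Hypothetical stand-ins. The field on IdentifierInfo mirrors the
// existing FETokenInfo slot; the one on DeclarationNameExtra would be new.
struct IdentifierInfo { void *FETokenInfo = nullptr; };
struct DeclarationNameExtra { void *FETokenInfo = nullptr; };

class DeclarationName {
  IdentifierInfo *II = nullptr;
  DeclarationNameExtra *Extra = nullptr;

public:
  DeclarationName(IdentifierInfo *Id) : II(Id) {}
  DeclarationName(DeclarationNameExtra *E) : Extra(E) {}

  // The token info lives in whatever the name points at, not in the name.
  void *getFETokenInfo() const {
    return II ? II->FETokenInfo : Extra->FETokenInfo;
  }
  void setFETokenInfo(void *Info) const {
    if (II)
      II->FETokenInfo = Info;
    else
      Extra->FETokenInfo = Info;
  }
};
```

Since the slot lives in the uniqued storage, every copy of the same name
sees the same token info, which is what a resolver keyed on names would
need.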

Comments and questions welcome!

  - Doug
-------------- next part --------------
A non-text attachment was scrubbed...
Name: declaration-name.patch
Type: text/x-patch
Size: 75370 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20081113/ce50198c/attachment.bin>

