[cfe-dev] Qualified and unqualified name lookup and IdResolver

Mon Dec 8 11:44:59 PST 2008

On Dec 4, 2008, at 5:30 PM, Douglas Gregor wrote:
> The attached patch changes how we perform both qualified and  
> unqualified name lookup in Clang. It represents a significant  
> departure from the approach currently implemented in Clang, but I  
> believe it will have performance and maintainability benefits in the  
> future.

Hi Doug,

I think this is a great step forward.

> This patch is neither complete nor fully-optimized. It should not  
> degrade performance in C, because the array/hash table data  
> structures are not built in C. However, there are more optimization  
> opportunities:

I think that this patch should only go in if you're committed to doing  
these improvements.  I'm somewhat scared that C and C++ will diverge  
even more.  If you're planning to do these, then I'm all for the change!

>
> 	- Once we've performed unqualified name lookup for a name, we  
> should cache it on the chain of declarations. For example, in N::g,  
> we do some work to find N::x. We should add N::x to the declaration  
> chain---but not the DeclContext!---so that subsequent lookups of x  
> are very fast (since it'll be at the front of the declaration list  
> for "x").

Nice.

>
> 	- We should be building the array/hash table in the DeclContexts  
> lazily, when we need to answer the question "is there an x in N?".  
> This would allow us to remove the special-casing for C.

Yes, this would be great.

>
> 	- We should make RecordDecl derive from DeclContext, and replace  
> the current storage and name lookup for fields  
> (RecordDecl::getMember, a linear search) with lookup into the  
> DeclContext. This will give us faster lookup for records with many  
> members. A similar thing could be done for Objective-C interfaces/ 
> implementations.

Yep, we should migrate ObjC as well.  I think it is important that we  
keep it and C and C++ all consistent where possible.

>
> 	- When variables are redeclared (e.g., at translation unit scope),  
> we should be replacing the current declaration with the previous  
> one, rather than pushing both declarations onto the declaration name  
> chain. (This is an existing problem that I did not yet fix).

Sure, incremental improvement is good :)

On to the patch:

+  /// LookupPtr - Pointer to a data structure used to lookup
+  /// declarations within this context. If the context contains fewer
+  /// than seven declarations, the number of declarations is provided
+  /// in the 3 lowest-order bits and the upper bits are treated as a
+  /// pointer to an array of NamedDecl pointers. If the context
+  /// contains seven or more declarations, the upper bits are treated
+  /// as a pointer to a DenseMap<DeclarationName, TwoNamedDecls>.
+  llvm::PointerIntPair<void*, 3> LookupPtr;

Very clever.  I never thought of using the low bits as a counter, I'll  
have to remember that :)

+  lookup_result lookup(ASTContext &Context, DeclarationName Name);
+  lookup_const_result lookup(ASTContext &Context, DeclarationName  
Name) const;

Since the impl of the second lookup function is just a const_cast'ing  
forwarding function for the first one, it might be worth putting it  
inline in the header.

+    if (NamespaceDecl *OrigNS =  
dyn_cast_or_null<NamespaceDecl>(PrevDecl)) {
+      // This is an extended namespace definition.
+      // Attach this namespace decl to the chain of extended namespace
+      // definitions.
+      NamespaceDecl *NextNS = OrigNS;
+      while (NextNS->getNextNamespace())
+	NextNS = NextNS->getNextNamespace();

Is it necessary to walk the entire list of namespaces here?  Would it  
be possible to keep the list built and maintained backwards?  Also,  
slightly wonky indentation.

+      // Remove the previous declaration from the scope.
+      if (DeclRegionScope->isDeclScope(OrigNS)) {
+	IdResolver.RemoveDecl(OrigNS);
+	DeclRegionScope->RemoveDecl(OrigNS);
+      }

indentation.

+  // Determine whether the current context supports qualified name
+  // lookup. If so, we may insert this declaration into the current
+  // context.
+  bool SupportQualifiedNameLookup = getLangOptions().CPlusPlus &&
+    (isa<TranslationUnitDecl>(CurContext) ||
+     isa<NamespaceDecl>(CurContext) ||
+     isa<CXXRecordDecl>(CurContext) ||
+     (getLangOptions().CPlusPlus0x && isa<EnumDecl>(CurContext)));

can this be split out into a predicate method on CurContext?

+  if (ScopedDecl *SD = dyn_cast<ScopedDecl>(D))
+    CurContext->addDecl(Context, SD, SupportQualifiedNameLookup);
+  else if (CXXFieldDecl *FD = dyn_cast<CXXFieldDecl>(D))
+    FD->getParent()->insert(Context, FD);
+
    IdResolver.AddDecl(D);
  }

Is there an "else" for this?  If not, it would be good to make it  
assert and die.  If so, it would be good to add a comment.

+    // Perform qualified name lookup into the LookupCtx.
+    // FIXME: Will need to look into base classes and such.
+    for (DeclContext::lookup_const_result I = LookupCtx- 
 >lookup(Context, Name);
+	 I.first != I.second; ++I.first)
+      if ((*I.first)->getIdentifierNamespace() & NS)
+	return *I.first;

indentation on the return.

+  } else {
+    if (getLangOptions().CPlusPlus &&
+	(NS & (Decl::IDNS_Ordinary | Decl::IDNS_Tag))) {
+      // Name lookup for ordinary names and tag names in C++ requires
+      // looking into scopes that aren't strictly lexical, and
+      // therefore we walk through the context as well as walking
+      // through the scopes.

To reduce nesting, please turn this into } else if (...) {

+	// Check whether the IdResolver has anything in this scope.
+	// FIXME: The isDeclScope check could be expensive. Can we do better?
+	while (I != IdResolver.end() && S->isDeclScope(*I)) {
+	  if ((*I)->getIdentifierNamespace() & NS)
+	    return *I;
+	
+	  ++I;
+	}

How about getting the decls context and using that?  Do scopes have an  
associated context?

The while loop might be more natural as a 'for' loop, to pull the ++I  
into the loop condition.

+	// If there is an entity associated with this scope, it's a
+	// DeclContext. We might need to perform qualified lookup into
+	// it.
+	DeclContext *Ctx = static_cast<DeclContext *>(S->getEntity());
+	while (Ctx && Ctx->isFunctionOrMethod())
+	  Ctx = Ctx->getParent();

Can you have more than one function/method context?  If no, please  
convert the while to if.

+	while (Ctx && (Ctx->isNamespace() || Ctx->isCXXRecord())) {
+	  // Look for declarations of this name in this scope.
+	  for (DeclContext::lookup_const_result I = Ctx->lookup(Context,  
Name);
+	       I.first != I.second; ++I.first) {
+	    // FIXME: Cache this result in the IdResolver
+	    if ((*I.first)->getIdentifierNamespace() & NS)
+	      return  *I.first;
+	  }
+
+	  Ctx = Ctx->getParent();
+	}

I think that this would be better to split out as a (potentially  
virtual?) method on the context.  This would make it easier to extend  
with different sorts of contexts.  For example, could  
'enableLazyBuiltinCreation' be handled as a "lookup" on the global (or  
maybe a new 'builtin') scope instead of as a special case hack?

+      // Scan up the scope chain looking for a decl that matches this
+      // identifier
+      // that is in the appropriate namespace.  This search should
+      // not take long, as
+      // shadowing of names is uncommon, and deep shadowing is
+      // extremely uncommon.
+      for (; I != IdResolver.end(); ++I)
+	if ((*I)->getIdentifierNamespace() & NS)
+	  return *I;

Please wrap the comment better, and avoid evaluating .end() every time  
through the loop.

            if (OldDecl == PrevDecl) {
              // Remove the name binding for the previous
+            // declaration.
+	    //	    if (cast<ScopedDecl>(PrevDecl)->getLexicalDeclContext()  
== CurContext) {
+	    if (S->isDeclScope(PrevDecl)) {

commented out code.

+	  // Add this declaration
+	  if (getLangOptions().CPlusPlus)
+	    CurContext->addDecl(Context, NewFD, false);

+	  // In C++, check default arguments now that we have merged decls.
+	  if (getLangOptions().CPlusPlus)
+	    CheckCXXDefaultArguments(NewFD);

should these be merged into one "if"?

    const RecordType *BaseRecord = Base->getType()->getAsRecordType();
+  DeclContext::lookup_const_result Lookup
+    = cast<CXXRecordType>(BaseRecord)->getDecl()->lookup(Context,  
OpName);

This will go away when C and C++ use the same lookup for struct  
fields, right?

+DeclContext::lookup_result
+DeclContext::lookup(ASTContext &Context, DeclarationName Name) {
..
+    }
+  } else {

How about "return Result" at the end of the "if".  This lets you  
unnest the 'else' clause.

+DeclContext::lookup_iterator
+DeclContext::insert(ASTContext &Context, NamedDecl *D) {
+  DeclContext *PrimaryContext = getPrimaryContext(Context);
+  if (PrimaryContext != this)
+    return PrimaryContext->insert(Context, D);
+
+  unsigned Size = LookupPtr.getInt();
+  if (Size < LookupIsMap) {

There are various places where you compare against LookupIsMap in  
different ways (some using != some using <), how about introducing a  
private isLookupMap() method or something like that?  In this case,  
the fact that LookupIsMap is the largest value really doesn't have  
anything to do with the algorithm.  If you don't like the predicate  
method, please change it to use !=.

Another option would be to pull the array/map code into its own  
generic class (out of the DeclContext class), which might be the real  
best solution.

+    for (Match = 0; Match < Size; ++Match) {
+      if (Array[Match]->getDeclName() == D->getDeclName())
+	break;
+    }

indentation, here and in a couple other places in the file indentation  
gets down to one space.

+      if (Match < Size && (NS == Decl::IDNS_Tag) &&
+	  (Array[Match]->getIdentifierNamespace() & Decl::IDNS_Ordinary))

should have another space of indentation on the second line.

*ahhh*, this is because there are tabs in the file!  Please zap them! :)

Overall, I love this change.  Great job Doug,

-Chris

Finally, when the dust settles on name lookup, can you please add an  
overview to the Internals doc? :)  The issues are nuanced enough that  
a high level overview would be very useful.

-Chris