[LLVMdev] hacking clang IdentifierTable

John McCall rjmccall at apple.com
Mon Jul 26 00:09:02 PDT 2010

On Jul 25, 2010, at 11:27 PM, Yabin Hu wrote:
> Clang uses a hash table to store all its identifiers. The hash table definition is:
> typedef llvm::StringMap<IdentifierInfo*, llvm::BumpPtrAllocator> HashTableTy;
> HashTableTy HashTable;
> Can anyone explain the mechanism for handling collisions on the name string key? Is there a chain or list of IdentifierInfo objects for
> variables or functions with the same name?

This hash table is just used to unique IdentifierInfos: each distinct spelling maps to exactly one IdentifierInfo, so same-named declarations never collide in the table itself. IdentifierInfo holds one pointer's worth of client-specific storage; Sema uses that pointer to hold either a NamedDecl* or an IdDeclInfo*, which we distinguish with the usual low-bit tagging trick. IdDeclInfo is basically a SmallVector<NamedDecl*,2>. There's some extra magic to make macros/PCH not require redundant hashes, but that's the basic idea.
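
For anyone unfamiliar with the low-bit tagging trick, here is a minimal sketch of the idea. It is not clang's code: NamedDecl, DeclList (standing in for IdDeclInfo) and TaggedDeclPtr are hypothetical stand-ins invented for the example. It assumes both pointee types are at least 2-byte aligned, so bit 0 of a valid pointer is always zero and can serve as the discriminator.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a declaration; not clang's NamedDecl.
struct NamedDecl { const char *Name; };

// Hypothetical stand-in for IdDeclInfo: used when several
// declarations share one identifier.
struct DeclList { std::vector<NamedDecl *> Decls; };

// One pointer-sized slot holding either a NamedDecl* or a DeclList*.
// Bit 0 is clear for a single declaration and set for a list.
class TaggedDeclPtr {
  std::uintptr_t Bits = 0;

public:
  static_assert(alignof(NamedDecl) >= 2 && alignof(DeclList) >= 2,
                "bit 0 must be free in both pointer types");

  void setSingle(NamedDecl *D) {
    Bits = reinterpret_cast<std::uintptr_t>(D);
  }
  void setList(DeclList *L) {
    Bits = reinterpret_cast<std::uintptr_t>(L) | 1;
  }

  bool isNull() const { return Bits == 0; }
  bool isList() const { return (Bits & 1) != 0; }

  NamedDecl *getSingle() const {
    assert(!isList() && "slot holds a list");
    return reinterpret_cast<NamedDecl *>(Bits);
  }
  DeclList *getList() const {
    assert(isList() && "slot holds a single declaration");
    // Mask the tag bit off before using the value as a pointer.
    return reinterpret_cast<DeclList *>(Bits & ~std::uintptr_t(1));
  }
};
```

The common case, a name with exactly one visible declaration, costs no extra allocation; the slot only switches to the list form once a second declaration with the same name appears.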

Anyway, this is all sufficient to implement unqualified lookup in C and Objective-C, but our C++ implementation can only get away with relying on IdentifierInfo when looking in a function; we have to fall back on DeclContext lookups (i.e. rehashing the string in different hash tables) when looking into pretty much any other kind of scope.
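
To make the difference between the two lookup paths concrete, here is a toy model. None of these types are clang's: IdentInfo, IdentTable, Context and lookupInFunction are hypothetical names, and std::unordered_map stands in for llvm::StringMap. The sketch only shows the shape of the idea: the string is hashed once to obtain a unique per-identifier object, after which the fast path reads a pointer off that object, while the per-scope path hashes the name again in that scope's own table.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical declaration type for the example.
struct Decl {
  std::string Name;
  int Id;
};

// Hypothetical analogue of IdentifierInfo: one object per distinct
// spelling, carrying a client-owned slot.
struct IdentInfo {
  std::string Name;
  Decl *Current = nullptr; // innermost visible declaration, if any
};

// Uniquing table: the same spelling always yields the same IdentInfo.
class IdentTable {
  std::unordered_map<std::string, std::unique_ptr<IdentInfo>> Table;

public:
  IdentInfo &get(const std::string &Name) {
    std::unique_ptr<IdentInfo> &Slot = Table[Name];
    if (!Slot) {
      Slot.reset(new IdentInfo());
      Slot->Name = Name;
    }
    return *Slot;
  }
};

// Hypothetical analogue of a DeclContext: each scope owns a table
// keyed by name, so a lookup has to hash the string again.
struct Context {
  std::unordered_map<std::string, Decl *> Members;

  void add(Decl *D) {
    assert(D && "cannot add a null declaration");
    Members[D->Name] = D;
  }
  Decl *lookup(const std::string &Name) const {
    auto It = Members.find(Name);
    return It == Members.end() ? nullptr : It->second;
  }
};

// Fast path: no hashing, just read the slot on the unique identifier.
inline Decl *lookupInFunction(const IdentInfo &II) { return II.Current; }
```

In this model the lexer would call IdentTable::get once per token, and a function-scope lookup is then a single pointer read. A lookup into a namespace or class would go through Context::lookup instead and pay for a second hash of the name.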


