[LLVMdev] hacking clang IdentifierTable

John McCall rjmccall at apple.com
Mon Jul 26 00:09:02 PDT 2010


On Jul 25, 2010, at 11:27 PM, Yabin Hu wrote:
> Clang uses a hash table to store all its identifiers. The hash table definition is:
> 
> typedef llvm::StringMap<IdentifierInfo*, llvm::BumpPtrAllocator> HashTableTy;
> HashTableTy HashTable;
> 
> Can anyone explain the mechanism for handling name string key collisions for me? Is there a chain or list of IdentifierInfo objects for
> variables or functions with the same name?

This hash table is just used to unique IdentifierInfos, so each distinct spelling maps to exactly one IdentifierInfo. IdentifierInfo itself holds one pointer's worth of client-specific storage; Sema uses that pointer to hold either a NamedDecl* or an IdDeclInfo*, which we distinguish with the usual low-bit tagging trick. IdDeclInfo is basically a SmallVector<NamedDecl*,2>. There's some extra magic so that macros and PCH don't require redundant hashes, but that's the basic idea.
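For readers who haven't seen the low-bit tagging trick, here is a minimal sketch of the idea. It is not Clang's actual code: NamedDecl and IdDeclInfo below are simplified stand-ins, DeclSlot is a hypothetical name, and std::vector replaces SmallVector so the example builds without LLVM. It assumes both pointee types are at least 2-byte aligned, which leaves bit 0 of any valid pointer free to act as the tag.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; not Clang's real classes.
struct NamedDecl { std::string Kind; };

// Stand-in for IdDeclInfo: the list used when one name has several declarations.
struct IdDeclInfo { std::vector<NamedDecl*> Decls; };

static_assert(alignof(NamedDecl) >= 2, "bit 0 must be free for the tag");
static_assert(alignof(IdDeclInfo) >= 2, "bit 0 must be free for the tag");

// Hypothetical wrapper around one pointer-sized slot that holds either a
// NamedDecl* (tag bit clear) or an IdDeclInfo* (tag bit set).
class DeclSlot {
  std::uintptr_t Bits = 0;

public:
  bool isNull() const { return Bits == 0; }
  bool isList() const { return (Bits & 1) != 0; }

  // Store a single declaration; the tag bit stays clear.
  void setSingle(NamedDecl *D) {
    Bits = reinterpret_cast<std::uintptr_t>(D);
    assert((Bits & 1) == 0 && "pointer must be at least 2-byte aligned");
  }

  // Store a declaration list; set bit 0 to mark the slot as a list.
  void setList(IdDeclInfo *L) {
    std::uintptr_t P = reinterpret_cast<std::uintptr_t>(L);
    assert((P & 1) == 0 && "pointer must be at least 2-byte aligned");
    Bits = P | 1;
  }

  // Returns the single declaration, or nullptr if the slot holds a list.
  NamedDecl *getSingle() const {
    return isList() ? nullptr : reinterpret_cast<NamedDecl*>(Bits);
  }

  // Returns the list with the tag bit masked off, or nullptr otherwise.
  IdDeclInfo *getList() const {
    return isList()
               ? reinterpret_cast<IdDeclInfo*>(Bits & ~std::uintptr_t(1))
               : nullptr;
  }
};
```

The common case of one declaration per name costs a single pointer, and the list is only allocated once a second declaration with the same name shows up.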

Anyway, this is all sufficient to implement unqualified lookup in C and Objective-C, but our C++ implementation can only get away with relying on IdentifierInfo when looking in a function; we have to fall back on DeclContext lookups (i.e., rehashing the string in different hash tables) when looking into pretty much any other kind of scope.
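To make the two lookup paths concrete, here is a rough sketch under heavy simplification. None of these types are Clang's: Identifier, Context, lookupInContexts and lookup are hypothetical names, and real C++ name lookup has many more rules than "walk the parent chain". The point is only the cost difference: the fast path reads declarations straight off the uniqued identifier, while the fallback hashes the name again in each context's own table.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified types for illustration only.
struct Decl { std::string Name; int Id; };

// Stand-in for a uniqued identifier that caches its visible declarations.
struct Identifier {
  std::string Name;
  std::vector<Decl*> Visible;
};

// Stand-in for a DeclContext: every scope owns its own name table.
struct Context {
  Context *Parent = nullptr;
  std::unordered_map<std::string, std::vector<Decl*>> Table;
};

// Slow path: hash the name in each enclosing context's table, innermost
// first, and return the first non-missing entry (or an empty vector).
inline std::vector<Decl*> lookupInContexts(const Context *Start,
                                           const std::string &Name) {
  for (const Context *C = Start; C; C = C->Parent) {
    auto It = C->Table.find(Name);
    if (It != C->Table.end())
      return It->second;
  }
  return {};
}

// Picks a path: inside a function the identifier's cached declarations are
// used directly; in any other scope the per-context tables are consulted.
inline std::vector<Decl*> lookup(const Identifier &Id, const Context *Scope,
                                 bool InFunction) {
  if (InFunction)
    return Id.Visible;
  assert(Scope && "context lookup needs a starting scope");
  return lookupInContexts(Scope, Id.Name);
}
```

For example, with a class context whose parent is the global context, lookup(Id, &ClassCtx, false) finds a member named x before a global x, at the price of one hash probe per context visited.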

John.


