[cfe-dev] Qualified and unqualified name lookup and IdResolver

Mon Dec 8 11:38:54 PST 2008

Hi Doug,

Douglas Gregor wrote:
> On Dec 4, 2008, at 7:58 PM, Argiris Kirtzidis wrote:
>> I like the idea, for unqualified lookup, of declaration chains based 
>> on lexical scope plus fall back to qualified lookup.
>
>> But for qualified lookup, did you consider using the "array+hashtable 
>> efficient data structure" not for answering "does N contain x ?" (#1) 
>> but for answering "is x contained in N ?" (#2) ?
>> (I mean an identifier having that data structure for storing the 
>> DeclContexts where it appears).
>> Both cases may have to deal with large number of declarations; #1 may 
>> deal with namespaces with lots of decls, while #2 may deal with decls 
>> contained in lots of namespaces.
>> The difference is that #1 optimizes for namespaces with few 
>> declarations while #2 optimizes for "differently named" declarations 
>> which is more common.
>> Also, #2 will immediately answer for not-declared-before identifiers, 
>> without requiring any lookup.
>
> I did think about #2, but I don't think it's the right architecture 
> and I think there are more performance issues lurking, particularly in 
> the case where we are instantiating many templates that all contain 
> members with the same names.
>
> On the architecture side: the problem with having a data structure 
> hanging off the identifiers is that the data structure has global 
> knowledge: it needs to know about every "x" in every context. That 
> makes it hard to build up the data structure in any way other than 
> directly parsing the translation unit. For example, a module system 
> that supports lazy deserialization might be careful to only record 
> that the identifier "std" exists in the global namespace. When someone 
> writes "std::vector", qualified lookup into std forces "std" to be 
> populated by loading its declarations (but not the declarations they 
> contain). It would be quite easy to implement this with #1, since 
> DeclContext can be taught to deserialize when something performs a 
> lookup() on it. Perhaps we could do something similar with #2, but it 
> seems less obvious to me.
>
> But, much of my bias comes from the fact that I find #1 to be far 
> easier to reason about.

That's all very reasonable. If only the C++ front-end were complete 
enough to parse real code, we would be able to run tests for comparison.

>
> Overall, I think the decision between #1 and #2 would actually have a 
> very small impact on the compiler. Most of the work in my patch is in 
> making the declaration chains contain only identifiers in our lexical 
> scope, and the "perform qualified lookup of name 'x' in scope 'y'" 
> part is relatively separate.

It would be great if this part can remain simple and separate so that 
it's easy to try out #2 sometime to see how it impacts performance.

-Argiris