[cfe-dev] Cleaning up the representation of Decls in the AST

Sat Aug 23 19:06:49 PDT 2008

On Aug 22, 2008, at 5:23 PM, Ted Kremenek wrote:

>> There have been multiple requests for a way to get the location of the
>> type of a variable declaration; do you have any ideas for how to
>> implement that?  It seems like DeclGroups will make that easier, but
>> it still seems tricky.
>
> It is tricky, and I'm not certain if it is even a well-formed
> question.  Consider:
>
> int (*x)(int y);
>
> Where is the location of the type?  Is it the first int?

I think that this (tracking loc info for types) is easy to solve, just  
a significant amount of work :).  I don't think this has anything to  
do with DeclGroup though.

The reason we don't keep loc info for types around today is that types  
are uniqued.  Conceptually, if types were not uniqued, 'x's type in  
Ted's example would just be a PointerTypeWithLocInfo with a pointee of  
FunctionTypeWithLocInfo, with arg and return type of  
BuiltinTypeWithLocInfo, etc.  You could even throw in a  
GroupingParenTypeInfo object if desired.

This would make the type system capture the full loc info for the  
declaration and I think it would be extensible to the other type- 
related loc and syntactic structure info that various clients needed.   
The big problem with this approach is that it would make type  
comparisons very slow, because you'd have to compare types recursively  
by structure instead of just comparing pointers to uniqued types.

Fortunately, there is an answer here and it's an easy one :).  The  
basic approach I would recommend is to build off the canonical types  
system.  You could define *today* all these classes, and the canonical/ 
desugared version of the type would just be the normal type without  
location info.  This is exactly how we represent "typeof(int)" in the  
type system separately from "int" even though both have the same  
semantic meaning.
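
To make the idea concrete, here is a minimal sketch in C++.  It is not
Clang's actual API: Loc, Type, TypeContext, the get*WithLoc functions and
sameType are all hypothetical names made up for illustration.  Canonical
nodes are uniqued and carry no location; each "WithLoc" node is allocated
per occurrence in the source and points at its canonical node, so type
comparison stays a pointer compare.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical source position; not Clang's SourceLocation.
struct Loc {
  unsigned Line = 0;
  unsigned Col = 0;
};

// One node in a type. Canonical nodes are uniqued and carry no location;
// sugared nodes are allocated per use and point at their canonical node.
struct Type {
  std::string Name;                 // builtin name, or "*" for a pointer
  const Type *Pointee = nullptr;    // null for builtins
  const Type *Canonical = nullptr;  // points at itself for canonical nodes
  bool HasLoc = false;
  Loc Where;
};

class TypeContext {
  std::map<std::string, std::unique_ptr<Type>> Builtins;
  std::map<const Type *, std::unique_ptr<Type>> Pointers;
  std::vector<std::unique_ptr<Type>> Sugared;

  const Type *addSugar(const std::string &Name, const Type *Pointee,
                       const Type *Canon, Loc L) {
    auto T = std::make_unique<Type>();
    T->Name = Name;
    T->Pointee = Pointee;  // keeps the sugared pointee, with its own Loc
    T->Canonical = Canon;
    T->HasLoc = true;
    T->Where = L;
    Sugared.push_back(std::move(T));
    return Sugared.back().get();
  }

public:
  // Uniqued: the same name always yields the same node.
  const Type *getBuiltin(const std::string &Name) {
    std::unique_ptr<Type> &Slot = Builtins[Name];
    if (!Slot) {
      Slot = std::make_unique<Type>();
      Slot->Name = Name;
      Slot->Canonical = Slot.get();
    }
    return Slot.get();
  }

  // Uniqued on the canonical pointee, so sugar never leaks into the key.
  const Type *getPointer(const Type *Pointee) {
    const Type *Key = Pointee->Canonical;
    std::unique_ptr<Type> &Slot = Pointers[Key];
    if (!Slot) {
      Slot = std::make_unique<Type>();
      Slot->Name = "*";
      Slot->Pointee = Key;
      Slot->Canonical = Slot.get();
    }
    return Slot.get();
  }

  // Not uniqued: a fresh node for every occurrence in the source.
  const Type *getBuiltinWithLoc(const std::string &Name, Loc L) {
    return addSugar(Name, nullptr, getBuiltin(Name), L);
  }

  const Type *getPointerWithLoc(const Type *Pointee, Loc L) {
    return addSugar("*", Pointee, getPointer(Pointee), L);
  }
};

// Semantic equality is a single pointer compare on the canonical nodes.
inline bool sameType(const Type *A, const Type *B) {
  return A->Canonical == B->Canonical;
}
```

With this shape, two "int *" types written at different places are
distinct objects that each remember their own locations, yet sameType()
on them is true without walking either type.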

This can also be used for a number of other things that we don't do  
today.  For example, in ObjC, you can write a protocol qualified type  
with the protocols in any order, e.g. NSString<a,b,c> is the same as  
NSString<c,a,b>.  Today, we just always discard the original ordering  
and sort them alphabetically.  This means that when we emit a diagnostic  
mentioning one of these types, we print the protocols in the wrong order  
(alphabetical instead of the order the user wrote them in).  If we  
really cared, we could have a 'noncanonical' version of this that  
preserves the original ordering, and have the canonical version of the  
type store the qualifiers in sorted order.
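
A rough sketch of that split, again in C++ with made-up names
(ProtocolQualifiedType, makeProtocolQualified, isSameType and spell are
hypothetical, not existing Clang entities): the as-written list is kept
for printing and the sorted list is used for comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a protocol-qualified ObjC type.
struct ProtocolQualifiedType {
  std::string Base;
  std::vector<std::string> AsWritten;  // order the user typed
  std::vector<std::string> Canonical;  // same names, sorted
};

inline ProtocolQualifiedType
makeProtocolQualified(const std::string &Base,
                      const std::vector<std::string> &Protocols) {
  ProtocolQualifiedType T;
  T.Base = Base;
  T.AsWritten = Protocols;
  T.Canonical = Protocols;
  std::sort(T.Canonical.begin(), T.Canonical.end());
  return T;
}

// Semantic comparison looks only at the canonical (sorted) form.
inline bool isSameType(const ProtocolQualifiedType &A,
                       const ProtocolQualifiedType &B) {
  return A.Base == B.Base && A.Canonical == B.Canonical;
}

// Diagnostics print the protocols in the order the user wrote them.
inline std::string spell(const ProtocolQualifiedType &T) {
  std::string Out = T.Base + "<";
  for (std::size_t I = 0; I < T.AsWritten.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += T.AsWritten[I];
  }
  Out += ">";
  return Out;
}
```

Here makeProtocolQualified("NSString", {"c", "a", "b"}) compares equal to
the {"a", "b", "c"} version but still spells itself as NSString<c,a,b>.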

I anticipate that the same approach will be used in the future to  
distinguish between "std::basic_string<char>" and  
"basic_string<char>" (when using namespace std) and  
"basic_string<char, char_traits<char> >" when the user explicitly  
spells out the default parameter, etc.

Tracking this information would be expensive (which is just the nature  
of the problem since types are so 'intricate'), but it could be made  
conditional in SemaType.  Clients that wanted it could parse with that  
condition enabled and get full loc info for their types.
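
As a sketch of what such a conditional might look like (TypeNode,
TypeBuilder and the KeepLocInfo flag are hypothetical names, not
anything in SemaType today): when the flag is off, every request returns
the shared canonical node and nothing extra is allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical type node; Line is 0 when no location is recorded.
struct TypeNode {
  std::string Name;
  const TypeNode *Canonical;
  unsigned Line;
};

class TypeBuilder {
  bool KeepLocInfo;  // assumed per-client option
  std::vector<std::unique_ptr<TypeNode>> Canon;
  std::vector<std::unique_ptr<TypeNode>> Sugared;

public:
  explicit TypeBuilder(bool Keep) : KeepLocInfo(Keep) {}

  // Uniqued by name (linear search is enough for a sketch).
  const TypeNode *getCanonical(const std::string &Name) {
    for (const auto &T : Canon)
      if (T->Name == Name)
        return T.get();
    Canon.push_back(std::make_unique<TypeNode>(TypeNode{Name, nullptr, 0}));
    Canon.back()->Canonical = Canon.back().get();
    return Canon.back().get();
  }

  // Only pays for a per-occurrence node when the client asked for it.
  const TypeNode *getTypeAt(const std::string &Name, unsigned Line) {
    const TypeNode *C = getCanonical(Name);
    if (!KeepLocInfo)
      return C;
    Sugared.push_back(std::make_unique<TypeNode>(TypeNode{Name, C, Line}));
    return Sugared.back().get();
  }

  std::size_t sugaredCount() const { return Sugared.size(); }
};
```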

-Chris