[cfe-dev] Type source info proposal

Thu Aug 13 17:41:32 PDT 2009

Hi,

This is a more detailed proposal for adding source info about types.

The basic idea is that source information about a "declarator type" (a
type coming out of declarator parsing) will be stored in a flat
contiguous memory block that will be interpreted based on the type,
e.g. for:

MyType **

we have a pointer -> pointer -> typedef, and the flat memory block will
contain:

-source location for pointer star
-source location for pointer star
-source location for typedef name

Source information will contain other stuff besides source locations,
e.g. for:

MyType * x[N+1]

the flat block will contain:

-source loc for '['
-source loc for ']'
-Expr* for size expression
-source loc for '*'
-source loc for typedef name
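
To make the layout concrete, here is a minimal sketch of how such a block
could be sized and indexed by walking the type chain from the outermost
declarator inwards. All names here (SourceLoc, TypeKind, localDataSize,
blockSize, offsetOf, writeLoc, readLoc) are hypothetical stand-ins and do
not come from the patches; alignment and padding are ignored for
simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins; not the names used in the attached patches.
using SourceLoc = std::uint32_t;  // opaque encoded source location
struct Expr;                      // opaque AST expression node

enum class TypeKind { Typedef, Pointer, Array };

// Bytes of source info that a single type node owns inside the block.
inline std::size_t localDataSize(TypeKind K) {
  switch (K) {
  case TypeKind::Typedef: return sizeof(SourceLoc);      // typedef name
  case TypeKind::Pointer: return sizeof(SourceLoc);      // '*'
  case TypeKind::Array:                                  // '[', ']', size expr
    return 2 * sizeof(SourceLoc) + sizeof(Expr *);
  }
  return 0;
}

// Total block size for a type chain listed outermost node first.
inline std::size_t blockSize(const std::vector<TypeKind> &Chain) {
  std::size_t Total = 0;
  for (TypeKind K : Chain)
    Total += localDataSize(K);
  return Total;
}

// Byte offset at which the data of the Index-th node begins.
inline std::size_t offsetOf(const std::vector<TypeKind> &Chain,
                            std::size_t Index) {
  std::size_t Off = 0;
  for (std::size_t I = 0; I < Index; ++I)
    Off += localDataSize(Chain[I]);
  return Off;
}

// memcpy-based accessors, so unaligned offsets are harmless in this sketch.
inline void writeLoc(std::vector<unsigned char> &Block, std::size_t Off,
                     SourceLoc L) {
  std::memcpy(Block.data() + Off, &L, sizeof(L));
}

inline SourceLoc readLoc(const std::vector<unsigned char> &Block,
                         std::size_t Off) {
  SourceLoc L;
  std::memcpy(&L, Block.data() + Off, sizeof(L));
  return L;
}
```

For "MyType * x[N+1]" the chain would be {Array, Pointer, Typedef}, so the
array's data sits at offset 0 and the typedef name location comes last.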

For function types:

void (*f1)(int *x, int *y); #1
void f1(int *x, int *y); #2

the type source info will contain the ParmVarDecls.
We can probably have the FunctionDecl created from #2 share this
ParmVarDecls array with the type source info.

Now, where should this information be stored?

One idea was having a special Type subclass that will also keep source  
info for a declarator and will be non-canonical (like TypedefType).
It would also enter the type system, in that types of parameters of a  
FunctionType will be this special Type (since a parameter comes out of  
a declarator).

This was not a good idea because of two issues:

1) Types start being created & uniqued unnecessarily, e.g.:

int *x;  #1
int **y = &x; #2

For #1 we create a new PointerType to keep source info, and for the  
"&x" expression we also create & unique another PointerType, instead  
of using the unique "pointer to pointer to int" type.

2) Types change because of semantic analysis: they decay, merge,
change because of attributes, etc. You can't really have source info
"hanging off" a Type.

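Issue 1 can be illustrated with a toy uniquing table. This is only a
sketch under the assumption that a source-info-carrying pointer type is
keyed on its location as well as its pointee; TypeContext, getPointer and
StarLoc are hypothetical names, not Clang's ASTContext API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Toy type node: either the builtin 'int' or a pointer to another type.
struct Type {
  const Type *Pointee;  // null for the builtin 'int'
  unsigned StarLoc;     // 0 when the type carries no source info
};

// Toy uniquing context (hypothetical; not Clang's ASTContext).
class TypeContext {
  std::vector<std::unique_ptr<Type>> Storage;
  std::map<std::pair<const Type *, unsigned>, const Type *> Pointers;
  Type IntTy{nullptr, 0};

public:
  const Type *getInt() const { return &IntTy; }

  // Returns the unique pointer type for a (pointee, location) pair,
  // creating it on first use.
  const Type *getPointer(const Type *Pointee, unsigned StarLoc = 0) {
    auto Key = std::make_pair(Pointee, StarLoc);
    auto It = Pointers.find(Key);
    if (It != Pointers.end())
      return It->second;
    Storage.push_back(std::unique_ptr<Type>(new Type{Pointee, StarLoc}));
    return Pointers[Key] = Storage.back().get();
  }

  std::size_t numPointerTypes() const { return Storage.size(); }
};
```

With this model, the "int *" written at #1 (with a location) differs from
the plain "int *", so taking the address of x yields a pointer to the
located type rather than the one canonical "int **", and the table ends
up holding extra entries.
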
What I propose instead is that type source info is decoupled from the  
actual Type that the declarator resolved to;
type source info should be stored into the Decls (FieldDecl, VarDecl,  
etc.) and Exprs that contain types (like SizeofAlignofExpr).

We call "type source info" "DeclaratorInfo" and we create a new Decl  
subclass "DeclaratorDecl", which contains a DeclaratorInfo* pointer.
It will enter the hierarchy like this:

ValueDecl -
         EnumConstantDecl
         DeclaratorDecl -
               FieldDecl -
               VarDecl -
               FunctionDecl -

That way, EnumConstantDecl (which has no use for DeclaratorInfo* at  
all) will not change size.
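
A minimal sketch of that hierarchy, assuming simplified stand-in classes
(the members DeclType, Value and DInfo are hypothetical and only there to
give the classes some size), shows why only the DeclaratorDecl branch
pays for the extra pointer:

```cpp
#include <cassert>
#include <type_traits>

struct DeclaratorInfo;  // opaque in this sketch

// Simplified stand-ins for the real Decl classes.
struct ValueDecl {
  const void *DeclType = nullptr;  // placeholder for the declared type
};

struct EnumConstantDecl : ValueDecl {
  long long Value = 0;  // placeholder for the enumerator value
};

// Only this branch stores the DeclaratorInfo pointer.
struct DeclaratorDecl : ValueDecl {
  DeclaratorInfo *DInfo = nullptr;
  DeclaratorInfo *getDeclaratorInfo() const { return DInfo; }
};

struct FieldDecl : DeclaratorDecl {};
struct VarDecl : DeclaratorDecl {};
struct FunctionDecl : DeclaratorDecl {};

static_assert(std::is_base_of<DeclaratorDecl, VarDecl>::value,
              "VarDecl carries a DeclaratorInfo pointer");
static_assert(!std::is_base_of<DeclaratorDecl, EnumConstantDecl>::value,
              "EnumConstantDecl does not carry a DeclaratorInfo pointer");
static_assert(sizeof(DeclaratorDecl) > sizeof(ValueDecl),
              "the pointer is paid for only below DeclaratorDecl");
```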

In order to read DeclaratorInfo, you will use "TypeLoc" wrappers to
get at the information, e.g.:

DeclaratorDecl *DD = cast<DeclaratorDecl>(ASTLoc.getDecl());
DeclaratorInfo *DInfo = DD->getDeclaratorInfo();
TypeLoc TL = DInfo->getTypeLoc();

if (FunctionLoc *FTL = dyn_cast<FunctionLoc>(&TL)) {
  // Print info about the function declarator
  FTL->getLParenLoc().print(OS, SrcMgr);
  FTL->getRParenLoc().print(OS, SrcMgr);
} else if (ArrayLoc *ATL = dyn_cast<ArrayLoc>(&TL)) {
  // Print info about the array declarator
  ATL->getLBracketLoc().print(OS, SrcMgr);
  ATL->getRBracketLoc().print(OS, SrcMgr);
}

Now, for a given declarator we have both a QualType and a  
DeclaratorInfo* and we want to pass them both to the Parser, while
the Parser operates on single pointers (e.g. will store the type  
pointer that it got from Sema into an annotation token).
For that we create a "special" Type subclass ("LocInfoType") that
keeps a DeclaratorInfo* and whose *only* purpose is to be passed back
and forth between Parser & Sema;
it will *not* participate in the type system semantics at all.

Currently, LocInfoType gets created by ASTContext and consumes memory,  
but since it is only "transient", intended for the Parser/Sema  
interaction,
we can do clever stuff like destroy/cache them when Sema knows that a  
declaration is finished so that they don't consume memory.
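
The round trip through a single pointer could look roughly like the
sketch below. It is only an illustration under simplified assumptions:
the Type and DeclaratorInfo structs are toy stand-ins, and OpaqueTypePtr,
toOpaque and unwrapInSema are hypothetical helper names, not the
interfaces introduced by the patches.

```cpp
#include <cassert>

// Toy stand-ins for the real classes.
struct DeclaratorInfo {
  unsigned NameLoc;
};

struct Type {
  enum Kind { Builtin, LocInfo } TheKind;
  explicit Type(Kind K) : TheKind(K) {}
};

// Transient wrapper: pairs the real type with its source info so both
// fit behind the single pointer that the parser carries around.
struct LocInfoType : Type {
  const Type *Underlying;
  DeclaratorInfo *DInfo;
  LocInfoType(const Type *T, DeclaratorInfo *DI)
      : Type(LocInfo), Underlying(T), DInfo(DI) {}
};

// What the parser stores, e.g. inside an annotation token.
using OpaqueTypePtr = const void *;

inline OpaqueTypePtr toOpaque(const Type *T) { return T; }

// Recovers the real type and, if one was attached, the source info.
inline const Type *unwrapInSema(OpaqueTypePtr P,
                                DeclaratorInfo **DInfoOut = nullptr) {
  const Type *T = static_cast<const Type *>(P);
  if (DInfoOut)
    *DInfoOut = nullptr;
  if (T->TheKind == Type::LocInfo) {
    const LocInfoType *LIT = static_cast<const LocInfoType *>(T);
    if (DInfoOut)
      *DInfoOut = LIT->DInfo;
    return LIT->Underlying;
  }
  return T;  // a plain type with no source info attached
}
```

Because the wrapper is always stripped before the type is used for
semantic checks, it never needs to be uniqued or canonicalized.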

OK, this is the high-level overview. I've attached incremental patches
of this implementation:

typeinfo1.patch : Introduce TypeLoc and DeclaratorInfo
typeinfo2.patch : Introduce DeclaratorDecl and pass DeclaratorInfo  
through the Decl & Sema interfaces.
typeinfo3.patch : Actually build the DeclaratorInfo out of a parsed  
declarator
typeinfo4.patch : Introduce LocInfoType
typeinfo5.patch : Pass type source info through the Parser using  
LocInfoType

Currently there is no flag to enable/disable type source info, but one
can be added easily. In general, I would strongly prefer that we always
keep type source info since, apart from getting a complete AST, we will
also be able to get rid of the "type specifier start location"
SourceLocation that Field/Var/Functions have, get rid of
ConstantArrayWithExprType, and maybe simplify other things that I'm
missing.

I'd really appreciate any feedback that you may have on the above;  
feel free to ask me any questions.

-Argiris