[cfe-dev] Cleaning up the representation of Decls in the AST

Fri Sep 12 08:29:37 PDT 2008

On Sep 12, 2008, at 5:45 AM, steve naroff wrote:

> That said, I'm not sure I understand your point that "Decls just  
> represent syntax". Since your argument seems to revolve around it, I  
> need some clarification.
>
> From my perspective, Decls don't just represent syntax. For example,  
> VarDecl has a method "getInit()" which returns an expression. The  
> initializer is an important part of the semantics of the program.  
> There are many other examples...

In truth the ASTs are a mix of both syntax and semantics, but in my  
mind syntax is the program itself and semantics is its meaning.  Under  
this interpretation, declarations, expressions, etc., which map  
directly to lexical/grammatical constructs in the program are the  
syntax.  This is what the parser understands.  The parser has a notion  
of expressions and declarations, but has no real notion of types.  The  
type checking, which is done as part of Sema, is part of the semantic  
interpretation of the program.  For example:

   x + y

From the parser's point of view, "x + y" is an opaque expression.  It  
has no notion of the type of this expression, nor of its meaning other  
than that it is an expression.  Our parser actually doesn't even know  
it's a binary expression, but it conceptually could know that it's an  
expression made up of a subexpression representing "x" and another  
subexpression representing "y."

Without type information, however, we do not know what the evaluation  
of this expression would produce.  "x" could be a C++ object that has  
operator+ defined, or a float, an int, etc.

I know this all sounds pedantic, so I'll try to be a little clearer  
about why this matters.  When I look at the structure of the ASTs,  
the class hierarchy of Decls and Stmts and so on, I can see the C/C++  
grammar encoded in those AST nodes.  The type information is affixed  
to individual AST nodes, but that's an implementation detail of how  
we represent a mapping from expressions to types.  In my mind the  
structure of the ASTs represents the syntax, and then we have some  
relations (I'm using the mathematical definition of "relation" here)  
that map from Expr* to Type*.  To me the type information is an  
"interpretation" of the meaning of the syntax (i.e., the semantics);  
it's an extremely important one that we always want to have around,  
but it's an interpretation nevertheless.

So, from this reasoning, in my mind it's not conceptually clean to have  
the interpretation of the syntax (i.e., the types) be responsible for  
owning (in the memory management sense) elements of the syntax (Exprs,  
Decls, and so on).