[cfe-dev] Should we build semantically invalid nodes?

Argiris Kirtzidis akyrtzi at gmail.com
Sat Oct 25 13:27:49 PDT 2008


Here are some follow-up thoughts:

Trying to find abstractions or some other AST appropriate for clients 
will not work without a concrete client that demands them; that 
discussion would be too "fuzzy". New Action modules can always be built 
in the future.
What we could do now is focus on whether it's possible (and worth it) 
to refactor Sema, gradually and non-disruptively, and "componentize" it 
for better maintainability.

Suppose that we want to "pluck" AST building out of Sema into a new 
component (say "ASTBuilder"). The semantics and the AST produced will be 
the same as now; Sema will still not allow invalid nodes.
Here's a non-disruptive way that may work. This is Sebastian's 
ActOnCXXCasts, slightly modified:

/// ActOnCXXCasts - Parse {dynamic,static,reinterpret,const}_cast's.
Action::ExprResult
Sema::ActOnCXXCasts(SourceLocation OpLoc, tok::TokenKind Kind,
                    SourceLocation LAngleBracketLoc, TypeTy *Ty,
                    SourceLocation RAngleBracketLoc,
                    SourceLocation LParenLoc, ExprTy *E,
                    SourceLocation RParenLoc) {
  CXXCastExpr::Opcode Op;
  Expr *Ex = (Expr*)E;
  QualType DestType = QualType::getFromOpaquePtr(Ty);

  switch (Kind) {
  default: assert(0 && "Unknown C++ cast!");
  case tok::kw_const_cast:
    Op = CXXCastExpr::ConstCast;
    if (CheckConstCast(OpLoc, Ex, DestType))
       return true;
    break;
  case tok::kw_dynamic_cast:
    Op = CXXCastExpr::DynamicCast;
    break;
  case tok::kw_reinterpret_cast:
    Op = CXXCastExpr::ReinterpretCast;
    if (CheckReinterpretCast(OpLoc, Ex, DestType))
       return true;
    break;
  case tok::kw_static_cast:
    Op = CXXCastExpr::StaticCast;
    break;
  }

  return ASTBuilder.ActOnCXXCasts(OpLoc, Kind, LAngleBracketLoc, Ty,
                                  RAngleBracketLoc, LParenLoc, E, RParenLoc);
}


Right now, just one ASTBuilder::ActOnCXXCasts method can be added to 
move the expression creation out of Sema, and it will not affect 
anything else.
Suppose that gradually more ASTBuilder methods are added and ASTBuilder 
becomes complete enough to work as an Action; then if the Parser used 
ASTBuilder directly it would create the cast expression node without the 
semantic checks.
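
To make that arrangement concrete, here is a toy, self-contained sketch. 
None of these classes are the real Clang ones: Action, ASTBuilder, Sema, 
Expr, CastExpr, ActOnCast, CheckCast and parseCast are all simplified 
stand-ins that I made up for illustration, and the "check" is just a 
placeholder rule. It only shows the shape of the idea: Sema checks and 
then delegates node creation, while the "parser" can be handed either 
object.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins only; these are NOT the real Clang classes.
enum class CastKind { Const, Dynamic, Reinterpret, Static };

struct Expr {
  std::string Text;
  virtual ~Expr() = default;
};

struct CastExpr : Expr {
  CastKind Kind = CastKind::Static;
  std::string DestType;
  std::unique_ptr<Expr> Operand;
};

// Hypothetical, heavily simplified callback interface used by the parser.
struct Action {
  virtual ~Action() = default;
  // Returns null when the cast is rejected.
  virtual std::unique_ptr<Expr> ActOnCast(CastKind Kind,
                                          const std::string &DestType,
                                          std::unique_ptr<Expr> Operand) = 0;
};

// Only builds nodes; performs no semantic checks.
struct ASTBuilder : Action {
  std::unique_ptr<Expr> ActOnCast(CastKind Kind, const std::string &DestType,
                                  std::unique_ptr<Expr> Operand) override {
    assert(Operand && "a cast needs an operand");
    auto Node = std::make_unique<CastExpr>();
    Node->Kind = Kind;
    Node->DestType = DestType;
    Node->Operand = std::move(Operand);
    return Node;
  }
};

// Runs the checks, then delegates node creation to ASTBuilder.
struct Sema : Action {
  ASTBuilder Builder;
  unsigned NumErrors = 0;

  // Placeholder rule (an assumption for this sketch): reject a
  // reinterpret_cast whose destination type is "void".
  // Returns true on error, like the Check* calls in the email.
  bool CheckCast(CastKind Kind, const std::string &DestType) {
    return Kind == CastKind::Reinterpret && DestType == "void";
  }

  std::unique_ptr<Expr> ActOnCast(CastKind Kind, const std::string &DestType,
                                  std::unique_ptr<Expr> Operand) override {
    if (CheckCast(Kind, DestType)) {
      ++NumErrors;
      return nullptr; // no invalid node is built
    }
    return Builder.ActOnCast(Kind, DestType, std::move(Operand));
  }
};

// Stand-in for the parser: it only knows about the Action interface.
std::unique_ptr<Expr> parseCast(Action &Actions, CastKind Kind,
                                const std::string &DestType) {
  auto Operand = std::make_unique<Expr>();
  Operand->Text = "x";
  return Actions.ActOnCast(Kind, DestType, std::move(Operand));
}
```

Calling parseCast with a Sema object rejects the placeholder-invalid cast 
and returns null, while calling it with an ASTBuilder object returns a 
CastExpr node for the same input, since no checks run on that path.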

Chris Lattner wrote:
> What sort of clients would benefit substantially from a broken and  
> partially formed AST?  If we really wanted this sort of thing, it  
> seems like it would be cleanest to do what Steve said: define a new  
> actions module that just builds an AST (which can even use the same or  
> an extended set of nodes as Sema) but doesn't do any real checks,  
> doesn't assign types, etc.  At this point, you have more parse tree  
> than an AST.  I could imagine that something like this would be  
> useful, but can't think of any specific clients.
>   

Yes, that would be ideal, but it's not clear how practical it is to build 
and maintain another Action module that deals with the C++ type system 
and/or produces mostly the same AST as Sema but with no checks. The AST 
building can be gradually factored out into an ASTBuilder that Sema uses, 
with these benefits:

-Better "componentization": the bulk of the semantic checks will be in 
Sema, while the AST creation code will be in ASTBuilder.
-No code duplication will be needed; trying to have multiple action 
modules dealing with templates might not be much fun.
-With an eventually complete ASTBuilder there will be a ready-to-go, 
fully program-describing, no-expression-left-behind, AST-producing 
action module, which will always be accurate and maintained since this 
will be what Sema uses too. It won't be in danger of bitrotting because 
of lack of attention.

Any thoughts about the above?

-Argiris


