[cfe-dev] Should we build semantically invalid nodes?

Sun Oct 26 12:16:57 PDT 2008

On Oct 26, 2008, at 2:39 PM, Chris Lattner wrote:
>
>>> If we really wanted this sort of thing, it seems like it would be
>>> cleanest to do what Steve said: define a new actions module that
>>> just builds an AST (which can even use the same or an extended
>>> set of nodes as Sema) but doesn't do any real checks, doesn't
>>> assign types, etc.  At this point, you have more parse tree than
>>> an AST.
>>
>> This will be a maintenance burden; I'm pretty sure such an action
>> module will eventually bitrot and become irrelevant since all the  
>> focus will be on the Sema AST.
>
> You're right, one example is the '-parse-print-callbacks' option  
> which was out of date almost as soon as it was started :).  However,  
> if there is a well maintained client, this wouldn't happen.
>
>> The current AST has lots of syntactic information (apart from the  
>> missing "TypeSpecifier" node), there's no need for another one.
>> If it's possible to combine an ASTBuilder action with the Sema
>> action like I suggest here:
>> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-October/003125.html
>> it will result in an ASTBuilder that produces the syntactic AST,  
>> and a Sema that uses it and emits the necessary diagnostics and  
>> possibly rejects invalid nodes. It may even help in the
>> maintainability department.
>
> I'm still struggling to figure out what problem you're trying to  
> solve.
>

I'm missing this as well; however, I can see why it is seductive to try
and reuse the ASTs in as many contexts as possible. I also sympathize
with the desire to simplify and modularize Sema. While
ActOnDeclarator() is not quite as hideous as GCC's grokDeclarator(),
it's still very complex. Note that the complexity might be hard to
avoid given C's brilliance of "declaration models use" :-).

I think any breakthrough in modularizing Sema (or not) will come from
developing new clients. In fact, although the Action model is simple,
it's rarely applied to compilers. As you know (but others may not),
this pattern was applied to parsing as a result of a nifty
precompiled header scheme developed @ NeXT in the early 90's (where we
benefitted from having a reusable parser). At this point in clang's
lifetime, it's unclear if we will develop other critical Action
modules. Fortunately, the layering doesn't cost us much (in terms of
performance) and we can hopefully benefit from this in the future
(when we decide to tackle more advanced forms of recompilation). If
not, and the ASTs end up solving all of our problems, we can
certainly remove the abstraction.
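
For anyone who hasn't looked at the layering, here is a rough sketch of
the idea: the parser only talks to an abstract Action interface, and
each client decides what to do with the callbacks. This is a toy under
stated assumptions, not clang's real code. Every name below (Action,
TraceAction, CheckingAction, parse, and the one-argument-pair
ActOnDeclarator) is hypothetical and far simpler than the actual
interfaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for the parser/Action split.
// None of these declarations match clang's actual API.
struct Action {
  virtual ~Action() = default;
  // Called by the parser each time it recognizes "<type> <name> ;".
  virtual void ActOnDeclarator(const std::string &type,
                               const std::string &name) = 0;
};

// Client 1: only records the callbacks it receives (a parse trace).
// It performs no checks and builds no nodes.
struct TraceAction : Action {
  std::vector<std::string> log;
  void ActOnDeclarator(const std::string &type,
                       const std::string &name) override {
    log.push_back("decl " + type + " " + name);
  }
};

// Client 2: builds declaration nodes and performs one "semantic" check,
// rejecting a redeclared name instead of building a node for it.
struct Decl {
  std::string type;
  std::string name;
};

struct CheckingAction : Action {
  std::vector<Decl> decls;
  std::vector<std::string> diagnostics;
  void ActOnDeclarator(const std::string &type,
                       const std::string &name) override {
    for (const Decl &d : decls) {
      if (d.name == name) {
        diagnostics.push_back("redefinition of '" + name + "'");
        return; // invalid declaration: no node is built
      }
    }
    decls.push_back({type, name});
  }
};

// Toy parser: accepts whitespace-separated "<type> <name> ;" triples.
// It has no idea which client is listening on the other side.
bool parse(const std::string &source, Action &actions) {
  std::istringstream in(source);
  std::string type, name, semi;
  while (in >> type) {
    if (!(in >> name >> semi) || semi != ";")
      return false; // syntax error
    actions.ActOnDeclarator(type, name);
  }
  return true;
}
```

Feeding the same input (say "int x ; float y ; int x ;") through
parse() with each client shows the point: the parser is reused
unchanged, the trace client sees every declarator, and the checking
client diagnoses the second 'x' and builds no node for it.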

snaroff

> -Chris