[cfe-dev] Understanding Clang parsing

Charles Davis cdavis at mymail.mines.edu
Wed Mar 3 20:41:33 PST 2010


On 3/3/10 8:07 PM, Salman Pervez wrote:
> This is something I would like to learn more about as well. For
> instance, the file lib/Parse/ParseObjc.cpp contains an entire list of
> tokens, e.g. 'kw_if', 'kw_new'. I am assuming the lexer reads these
> tokens and prepares them for the parser. Could someone point me to
> where these 'kw_*' enums are defined?
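Believe it or not, they're defined as part of the Basic library. See
include/clang/Basic/TokenKinds.def. That file is just a list of macro
invocations; include/clang/Basic/TokenKinds.h #includes it to generate
the tok::TokenKind enum, which is where names like tok::kw_if come from.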
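
In case the .def trick is unfamiliar, here is a minimal sketch of the
pattern. It is not the real file: the toy namespace, the TOY_* macros,
and classifyWord are made up for illustration, and the token list is
folded into one macro so the example fits in a single file (the real
list lives in its own header that gets #included repeatedly).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, heavily trimmed stand-in for a token list. Each entry is a
// macro call; what it expands to depends on how the caller defines the macros.
#define TOY_TOKEN_LIST(TOK, KEYWORD) \
  TOK(unknown)                       \
  TOK(identifier)                    \
  KEYWORD(if)                        \
  KEYWORD(new)

namespace toy {

// First expansion: build the enum. KEYWORD(if) becomes the enumerator kw_if.
#define TOY_TOK(X) X,
#define TOY_KW(X) kw_##X,
enum TokenKind { TOY_TOKEN_LIST(TOY_TOK, TOY_KW) NUM_TOKENS };
#undef TOY_TOK
#undef TOY_KW

// Second expansion: a table of names, kept in sync with the enum for free.
#define TOY_TOK(X) #X,
#define TOY_KW(X) #X,
static const char *const TokNames[] = { TOY_TOKEN_LIST(TOY_TOK, TOY_KW) };
#undef TOY_TOK
#undef TOY_KW

inline const char *getTokenName(TokenKind K) {
  assert(K < NUM_TOKENS && "token kind out of range");
  return TokNames[K];
}

// Third expansion: map a spelling to its keyword kind. Plain tokens expand
// to nothing here, so only the KEYWORD entries produce a comparison.
inline TokenKind classifyWord(const char *Word) {
#define TOY_TOK(X)
#define TOY_KW(X) if (std::strcmp(Word, #X) == 0) return kw_##X;
  TOY_TOKEN_LIST(TOY_TOK, TOY_KW)
#undef TOY_TOK
#undef TOY_KW
  return identifier;
}

} // namespace toy
```

With this, toy::classifyWord("if") yields toy::kw_if, and adding a keyword
means adding one line to the list rather than touching the enum, the name
table, and the lookup separately.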
> 
> What would be really helpful is if someone could give a brief overview  
> of how I would go about adding a new expression to C. So far what I've  
> learned is this...
> 
> - I would have to add the relevant token so the lexer can recognize it.
Only if the expression needs a new keyword or punctuator. If it does,
that goes in TokenKinds.def too.
> - I would have to add parser code in lib/Parse/Parser.cpp?
Not there, but to the relevant source file--probably
lib/Parse/ParseExpr.cpp.
> - I would have to construct the relevant AST for this expr.
Look at the AST library, particularly include/clang/AST/Expr.h and
lib/AST/Expr.cpp and friends. A new expression gets its own subclass of
Expr there.
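
To give a rough idea of the shape of such a node, here is a toy sketch.
Everything in it is an assumption for illustration: the toyast
namespace, the MinExpr node (a made-up "minimum of two operands"
expression), and the use of std::unique_ptr for ownership, which is
simpler than what the real AST library does. The part worth noticing is
the kind tag stored in the base class and the classof() helper, which
is the general style the AST classes follow.

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace toyast {

// Base class: every node records which concrete class it is.
class Expr {
public:
  enum ExprClass { IntegerLiteralClass, MinExprClass };
  virtual ~Expr() {}
  ExprClass getExprClass() const { return Kind; }

protected:
  explicit Expr(ExprClass K) : Kind(K) {}

private:
  ExprClass Kind;
};

class IntegerLiteral : public Expr {
  long Value;

public:
  explicit IntegerLiteral(long V) : Expr(IntegerLiteralClass), Value(V) {}
  long getValue() const { return Value; }
  static bool classof(const Expr *E) {
    return E->getExprClass() == IntegerLiteralClass;
  }
};

// The hypothetical new node: owns its two operands.
class MinExpr : public Expr {
  std::unique_ptr<Expr> LHS, RHS;

public:
  MinExpr(std::unique_ptr<Expr> L, std::unique_ptr<Expr> R)
      : Expr(MinExprClass), LHS(std::move(L)), RHS(std::move(R)) {}
  const Expr *getLHS() const { return LHS.get(); }
  const Expr *getRHS() const { return RHS.get(); }
  static bool classof(const Expr *E) {
    return E->getExprClass() == MinExprClass;
  }
};

// A consumer dispatches on the kind tag. Every switch like this one has to
// learn about the new node, which is why the change ripples through clients.
inline long evaluate(const Expr *E) {
  switch (E->getExprClass()) {
  case Expr::IntegerLiteralClass:
    return static_cast<const IntegerLiteral *>(E)->getValue();
  case Expr::MinExprClass: {
    const MinExpr *M = static_cast<const MinExpr *>(E);
    long L = evaluate(M->getLHS());
    long R = evaluate(M->getRHS());
    return L < R ? L : R;
  }
  }
  assert(false && "unhandled expression class");
  return 0;
}

} // namespace toyast
```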
> 
> If I could just get the names of the files/directories where these  
> changes would need to be made, that would be a great starting point.  
You'll also have to add a new action to the Action interface
(include/clang/Parse/Action.h) and modify Sema to implement it (if you
intend to use Sema; see lib/Sema/SemaExpr.cpp). If you want to generate
IR from the new expression, you may also have to modify CodeGen
(lib/CodeGen/CGExpr.cpp) to understand the new AST node. If you want to
do static analysis, you may need to modify the Analysis library as well.
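
To show how those layers relate, here is a small self-contained sketch.
It is a toy, not Clang code: the toyparse namespace, the '<?' minimum
operator and its lessquestion token, the ActOnMinExpr callback, the
integer ExprHandle, and ToySema are all invented for the example. The
point is only the division of labor: the parser recognizes syntax and
calls an abstract action, and the action's implementation decides what
to build.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace toyparse {

enum TokenKind { numeric_constant, lessquestion, eof };
struct Token { TokenKind Kind; long Value; };

typedef int ExprHandle;            // opaque to the parser
const ExprHandle InvalidExpr = -1; // signals a parse error

// The parser only ever sees this interface.
class Action {
public:
  virtual ~Action() {}
  virtual ExprHandle ActOnNumericConstant(long Value) = 0;
  virtual ExprHandle ActOnMinExpr(ExprHandle LHS, ExprHandle RHS) = 0;
};

// Stand-in for Sema: builds nodes. evaluate() stands in for a later
// consumer of those nodes, such as code generation.
class ToySema : public Action {
public:
  struct Node { bool IsMin; long Value; ExprHandle LHS, RHS; };
  std::vector<Node> Nodes;

  ExprHandle ActOnNumericConstant(long Value) override {
    Node N = {false, Value, InvalidExpr, InvalidExpr};
    Nodes.push_back(N);
    return static_cast<ExprHandle>(Nodes.size() - 1);
  }
  ExprHandle ActOnMinExpr(ExprHandle LHS, ExprHandle RHS) override {
    Node N = {true, 0, LHS, RHS};
    Nodes.push_back(N);
    return static_cast<ExprHandle>(Nodes.size() - 1);
  }
  long evaluate(ExprHandle H) const {
    const Node &N = Nodes[H];
    if (!N.IsMin)
      return N.Value;
    long L = evaluate(N.LHS), R = evaluate(N.RHS);
    return L < R ? L : R;
  }
};

class Parser {
  const std::vector<Token> &Toks;
  std::size_t Pos;
  Action &Actions;

  const Token &Tok() const { return Toks[Pos]; }

  ExprHandle ParsePrimary() {
    if (Tok().Kind != numeric_constant)
      return InvalidExpr;
    return Actions.ActOnNumericConstant(Toks[Pos++].Value);
  }

public:
  Parser(const std::vector<Token> &T, Action &A) : Toks(T), Pos(0), Actions(A) {
    assert(!T.empty() && T.back().Kind == eof && "token stream must end in eof");
  }

  // min-expr: numeric_constant ('<?' numeric_constant)*
  ExprHandle ParseMinExpr() {
    ExprHandle LHS = ParsePrimary();
    while (LHS != InvalidExpr && Tok().Kind == lessquestion) {
      ++Pos; // consume '<?'
      ExprHandle RHS = ParsePrimary();
      if (RHS == InvalidExpr)
        return InvalidExpr;
      LHS = Actions.ActOnMinExpr(LHS, RHS);
    }
    return LHS;
  }
};

} // namespace toyparse
```

Feeding it the tokens for "7 <? 3" makes the parser call
ActOnNumericConstant twice and ActOnMinExpr once. Because the parser
never touches a node directly, a different Action implementation could
be swapped in without changing any parsing code.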

Chip


