[cfe-dev] About AST rewriting / manipulation

Mon Feb 2 08:40:27 PST 2009

Thanks for the response.
I've chosen the simpler way: I create Clang nodes from scratch and use 
them to replace the old nodes in the syntax tree, or insert them next to 
the old ones. In the end it's not as complex as I thought, but now I have 
another problem. I want to write the modified syntax tree back out, but no 
matter what changes I make to the tree, when I use the TokenRewriter:

const LangOptions &LangOpts = Ctx.getLangOptions();
TokenRewriter Rewriter(Ctx.getSourceManager().getMainFileID(),
                       Ctx.getSourceManager(), LangOpts);

// Print out the output.
for (TokenRewriter::token_iterator I = Rewriter.token_begin(),
                                   E = Rewriter.token_end();
     I != E; ++I)
  out << pp.getSpelling(*I);

I still get the original code! How can I write a modified syntax tree 
back to a source file? Is there something in the API that I am missing? 
Should I use the Rewriter? If so, why doesn't the Rewriter provide an 
InsertStmt method? With the current API I can add text and replace 
statements, but what about inserting and removing statements?
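
To make the question concrete, here is a minimal, self-contained sketch of 
the text-level editing model I have in mind, where inserting or removing a 
statement is just inserting or removing text at offsets in the original 
source. Note that this is NOT Clang's Rewriter: TextEditBuffer, its methods 
and demo() are hypothetical names invented for this example, and it assumes 
the edits never overlap.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only: this is not a Clang class.
// Every edit is expressed against offsets in the ORIGINAL text,
// so one edit never shifts the position of another.
class TextEditBuffer {
public:
  explicit TextEditBuffer(std::string Original)
      : Original(std::move(Original)) {}

  // Insert Text before the character at Offset.
  void insertText(std::size_t Offset, const std::string &Text) {
    Edits.push_back({Offset, 0, Text});
  }

  // Remove Length characters starting at Offset.
  void removeText(std::size_t Offset, std::size_t Length) {
    Edits.push_back({Offset, Length, ""});
  }

  // Replace Length characters starting at Offset with Text.
  void replaceText(std::size_t Offset, std::size_t Length,
                   const std::string &Text) {
    Edits.push_back({Offset, Length, Text});
  }

  // Build the rewritten text. Assumes the edits do not overlap.
  std::string str() const {
    std::vector<Edit> Sorted = Edits;
    std::stable_sort(Sorted.begin(), Sorted.end(),
                     [](const Edit &A, const Edit &B) {
                       return A.Offset < B.Offset;
                     });
    std::string Out;
    std::size_t Pos = 0;
    for (const Edit &E : Sorted) {
      Out.append(Original, Pos, E.Offset - Pos); // untouched text
      Out += E.Text;                             // inserted/replacement text
      Pos = E.Offset + E.Length;                 // skip removed text
    }
    Out.append(Original, Pos, std::string::npos);
    return Out;
  }

private:
  struct Edit {
    std::size_t Offset;
    std::size_t Length;
    std::string Text;
  };
  std::string Original;
  std::vector<Edit> Edits;
};

// Replace one "statement", remove another, and insert a new one.
std::string demo() {
  const std::string Src = "f(a, b); old(); end();";
  TextEditBuffer Buf(Src);
  Buf.replaceText(Src.find("f(a, b)"), 7, "g(b, a, c)"); // replace
  Buf.removeText(Src.find("old();"), 7);                 // remove "old(); "
  Buf.insertText(Src.find("end();"), "added(); ");       // insert
  return Buf.str();
}
```

Here demo() turns "f(a, b); old(); end();" into 
"g(b, a, c); added(); end();". In this model the hard part is not the 
editing itself but finding the source range that each statement covers.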

Thanks, S. Pellegrini

Douglas Gregor wrote:
> Hello Simone,
>
> On Jan 23, 2009, at 8:42 AM, Simone Pellegrini wrote:
>> I am trying to use Clang as a source-to-source compiler. Through the API
>> I've found the way to rewrite back the syntax tree into source code, and
>> that's not difficult. However, before writing back the syntax tree I
>> would like to manipulate the syntax tree in order to apply some code
>> transformations.
>>
>> For example I would like to rewrite something like f(a,b) into g(b, 
>> a, c)
>
> Okay.
>
>> Now I guess I should create the AST nodes I need (building a new
>> CallExpr... object and so on...) and then substitute the old f(...) with
>> the new g(...). However, the Clang API for creating AST nodes is
>> quite complex to use; it's really too demanding.
>
> Interesting. I guess the demanding part of the API is that you need to 
> be careful to ensure that you build semantically-correct ASTs.
>
>> Now, I wonder whether it would be possible to write the
>> statement I want to substitute (or add) as a string and then use the
>> Clang parser to create the syntax tree of the piece of code I have
>> written, in such a way that it can easily be plugged into the old main
>> syntax tree (of course, the new instance of the parser should be invoked
>> taking the previous context into account). Is this kind of behavior possible?
>
> I believe it is possible to extend Clang to do this, but there is no 
> API to do so right now. The parser can be handed a set of tokens and 
> told to "go parse these" by calling into the appropriate parse 
> function; we do this to implement some C++ semantics, such as inline 
> definitions of member functions.
>
> However, the hard part---that nobody has even thought about how to 
> implement---is that you would need to be able to take an AST node and 
> instruct the parser *and semantic analysis* to set its internal state 
> to the point where that AST node was parsed. That means reconstructing 
> the scope stack, the information about which identifiers bind to which 
> declarations, and so on. Not all of this information is present in the 
> AST, so this is a major undertaking.
>
>     - Doug