[cfe-dev] AST transformations

Vassil Vassilev vasil.georgiev.vasilev at cern.ch
Thu Mar 10 11:32:00 PST 2011


Hi,
On 10.3.2011 at 17:37, Michael Boyer wrote:
> I am trying to use Clang to analyze and modify source code at the AST
> level. The class that seems most relevant for AST analysis is
> RecursiveASTVisitor. I have seen some comments on this list indicating
> that the AST is immutable once created. My understanding is that a
> RecursiveASTVisitor would be called _after_ AST creation, making it
> difficult or impossible to modify the AST using this interface.
>
> What classes should I be looking at instead? I know that the
> RewriteObjC class uses ASTConsumer; however, looking at the source
> it's not immediately clear to me whether it is transforming the AST or
> just inserting extra declarations/statements/etc. at the source level.
>
> Any suggestions/comments would be much appreciated.
>
I ran into the same problem when I was looking for such a tool. I think
RecursiveASTVisitor is meant for traversing the tree, not for
transforming or mutating it, so it is very difficult to use for this
kind of work. However, you can have a look at the clang::StmtVisitor,
clang::DeclVisitor and clang::TypeVisitor classes. They provide a pretty
powerful interface, and you can build your own transformation utility on
top of them.

Here is what I did: I derive from clang::StmtVisitor and
clang::DeclVisitor (both take a RetTy template parameter), which I set
to clang::Stmt* and clang::Decl* respectively. Then I traverse the
TranslationUnit top-down, and each visit method can return whatever AST
node I want. That way you can replace any specific kind of AST node you
want on the way back up. You can even hand the parent a more complex
structure (rather than a bare clang::Stmt* or clang::Decl*) that carries
extra context and hints to the parent how to deal with the returned
node.

Note that this is extremely dangerous, because you can break the
semantics of the AST. Keep in mind that if you plan to inject new,
artificially created nodes, they won't have a valid SourceLocation,
etc., which causes problems with the accuracy of diagnostics, for
example. Moreover, if you inject a new node, you have to make Sema think
that it actually came from the Parser, which is difficult because you
don't have any source for it, right?

Cheers,
Vassil


