[cfe-dev] AST transformations

Michael Boyer mwb7w at cs.virginia.edu
Thu Mar 10 07:37:44 PST 2011


I am trying to use Clang to analyze and modify source code at the AST
level. The class that seems most relevant for AST analysis is
RecursiveASTVisitor. I have seen some comments on this list indicating
that the AST is immutable once created. My understanding is that a
RecursiveASTVisitor would be called _after_ AST creation, making it
difficult or impossible to modify the AST using this interface.

What classes should I be looking at instead? I know that the
RewriteObjC class uses ASTConsumer; however, looking at the source
it's not immediately clear to me whether it is transforming the AST or
just inserting extra declarations/statements/etc. at the source level.

Any suggestions/comments would be much appreciated.



Here is more context if anyone is interested:

I am attempting to use Clang to analyze array access patterns. The
basic goal is to extract the array offset expression for each read from
or write to an array. This by itself seems pretty trivial (find each
ArraySubscriptExpr node in the AST and print it out). However, an
added complication is that the expressions can only contain certain
types of values that can easily be reasoned about. The specific
context I am interested in is OpenCL, so the types of allowable values
are things like built-in OpenCL functions (get_global_id, etc.),
kernel arguments, constants, etc.

Example (I have removed much of the OpenCL-specific syntax for simplicity):

void kernel(int *output, int width) {
  // get_global_id is a function that I can reason about
  int index = get_global_id(1) * width + get_global_id(0);
  output[index] = index;
}

For this kernel/function, the desired output would be something
similar to the following:
offset = get_global_id(1) * kernel_argument_1 + get_global_id(0);

The way I would solve this problem conceptually would be:
- Traverse the AST looking for assignments.
- When an assignment (LHS = RHS) is found, propagate the RHS down the
AST (in other words, replace any future uses of LHS with RHS).
- Continue until the children of all ArraySubscriptExpr nodes are from
a restricted class of values.
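
To make the steps above concrete, here is a minimal sketch of the
substitution idea on a toy expression tree. It deliberately does not
use any Clang classes: Expr, node, substitute, isRestricted, toString
and offsetForExampleKernel are all hypothetical names invented for this
illustration, and the kernel_argument_N naming is simply my assumption
about how kernel arguments might be labeled. It handles only
straight-line code (no loops or branches).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy expression tree. These are stand-ins, NOT Clang AST classes.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
    enum Kind { Var, Const, Call, BinOp };
    Kind kind;
    std::string text;           // variable name, literal, callee, or operator
    std::vector<ExprPtr> kids;  // call arguments, or the two BinOp operands
};

// Maps a variable name to the (already substituted) expression it holds.
using Env = std::map<std::string, ExprPtr>;

ExprPtr node(Expr::Kind k, std::string text, std::vector<ExprPtr> kids = {}) {
    return std::make_shared<Expr>(Expr{k, std::move(text), std::move(kids)});
}

// Rebuild `e`, replacing every variable that has a known definition in `env`.
ExprPtr substitute(const ExprPtr& e, const Env& env) {
    if (e->kind == Expr::Var) {
        auto it = env.find(e->text);
        return it == env.end() ? e : it->second;
    }
    std::vector<ExprPtr> kids;
    for (const ExprPtr& k : e->kids) kids.push_back(substitute(k, env));
    return node(e->kind, e->text, std::move(kids));
}

// True if `e` contains only constants, kernel arguments, and allowed builtins.
bool isRestricted(const ExprPtr& e, const std::set<std::string>& builtins) {
    if (e->kind == Expr::Const) return true;
    if (e->kind == Expr::Var) return e->text.rfind("kernel_argument_", 0) == 0;
    if (e->kind == Expr::Call && builtins.count(e->text) == 0) return false;
    for (const ExprPtr& k : e->kids)
        if (!isRestricted(k, builtins)) return false;
    return true;
}

std::string toString(const ExprPtr& e);

// Parenthesize nested binary operations so precedence survives substitution.
std::string operand(const ExprPtr& e) {
    return e->kind == Expr::BinOp ? "(" + toString(e) + ")" : toString(e);
}

std::string toString(const ExprPtr& e) {
    if (e->kind == Expr::Var || e->kind == Expr::Const) return e->text;
    if (e->kind == Expr::Call) {
        std::string s = e->text + "(";
        for (std::size_t i = 0; i < e->kids.size(); ++i)
            s += (i ? ", " : "") + toString(e->kids[i]);
        return s + ")";
    }
    assert(e->kids.size() == 2);  // BinOp always has exactly two operands
    return operand(e->kids[0]) + " " + e->text + " " + operand(e->kids[1]);
}

// Walks the example kernel by hand, one statement at a time.
std::string offsetForExampleKernel() {
    Env env;
    // Assumed convention: arguments are renamed by position.
    env["output"] = node(Expr::Var, "kernel_argument_0");
    env["width"] = node(Expr::Var, "kernel_argument_1");

    // int index = get_global_id(1) * width + get_global_id(0);
    ExprPtr rhs = node(Expr::BinOp, "+", {
        node(Expr::BinOp, "*", {
            node(Expr::Call, "get_global_id", {node(Expr::Const, "1")}),
            node(Expr::Var, "width")}),
        node(Expr::Call, "get_global_id", {node(Expr::Const, "0")})});
    env["index"] = substitute(rhs, env);

    // output[index] = index;  (the subscript is the variable `index`)
    ExprPtr offset = substitute(node(Expr::Var, "index"), env);
    const std::set<std::string> builtins = {"get_global_id"};
    assert(isRestricted(offset, builtins));
    return toString(offset);
}
```

Calling offsetForExampleKernel() from a main function and printing the
result should give the offset expression for the kernel above, with an
extra pair of parentheses around the multiplication. In a real tool the
tree would presumably be built from the Clang AST rather than by hand.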

Obviously there are some other complications, like how to deal with
loops, etc. For now I am just worried about supporting some simple
kernels.

Thanks,
Michael


