[cfe-dev] AST UnaryOperator subexpression

Chris Lattner clattner at apple.com
Tue Apr 22 20:18:39 PDT 2008


On Apr 22, 2008, at 3:40 PM, Petr Kostka wrote:
> I am trying to walk the AST tree and wrap all the dereferenced expressions
> "*expr" with macro, e.g. "*DEREF_EXPR(expr)", but I get strange behavior with
> following code snippet that should do the wrapping:

Ok!

> const UnaryOperator *pUnaryOperator = dyn_cast<const UnaryOperator>(pExpr);
> assert(pUnaryOperator);
> if (UnaryOperator::Deref == pUnaryOperator->getOpcode())
> {
>    const Expr *pSubExpr = pUnaryOperator->getSubExpr();
>    mRewriter.InsertCStrBefore(pSubExpr->getSourceRange().getBegin(), "DEREF_EXPR(");
>    mRewriter.InsertCStrAfter(pSubExpr->getSourceRange().getEnd(), ")");
> }

This is almost exactly right.

> When the subexpression of the dereference operator (*) is another compound
> expression, e.g. "(bar+1)" in following code sample
>
> void foo(void) {
>    unsigned *bar = 0;
>    *(bar+1) = 1;
> }
>
> I get the expected result
>
> void foo(void) {
>    unsigned *bar = 0;
>    *DEREF_EXPR((bar+1)) = 1;
> }

Actually, you're getting the wrong result here.  The new ")" is being
inserted *before* the old one, not after it.  Because the new and old
tokens are both ")", the output happens to render correctly.

> But when the subexpression is leaf token
>
> void foo(void) {
>    unsigned *bar = 0;
>    *bar = 1;
> }
>
> I get
>
> void foo(void) {
>    unsigned *bar = 0;
>    *DEREF_EXPR()bar = 1;
> }
>
> instead of expected
>
> void foo(void) {
>    unsigned *bar = 0;
>    *DEREF_EXPR(bar) = 1;
> }

Right.  Here the last token in the range is 'bar' rather than a ')', so
the misplaced insertion becomes visible.  In both cases, you're inserting
the ")" *before* the last token in the range.
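
To see why both outputs come from the same mistake, here is a toy model
of the situation.  It is not clang code: ToyRange and wrapNaive are
made-up names, and a "location" is just a byte offset into a string.  The
only thing it assumes from the real API is that the end of a range is the
offset where the last token *starts*.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a source range (hypothetical, not a clang type).
// `end` is the offset of the START of the last token in the range.
struct ToyRange {
    std::size_t begin;
    std::size_t end;
};

// Mimics the snippet above: insert ")" at `end` itself and
// "DEREF_EXPR(" at `begin`.
std::string wrapNaive(std::string src, ToyRange r) {
    // Insert at the later offset first so the earlier offset stays valid.
    src.insert(r.end, ")");
    src.insert(r.begin, "DEREF_EXPR(");
    return src;
}
```

With the source "*bar = 1;" the subexpression 'bar' is a single token
starting at offset 1, so the range is {1, 1} and the ")" lands in front of
'bar'.  With "*(bar+1) = 1;" the range is {1, 7}, the ")" lands in front of
the existing ")", and the result only looks right.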

> I would assume that pSubExpr->getSourceRange().getBegin() and
> pSubExpr->getSourceRange().getEnd() source locations should point BEFORE and
> AFTER the subexpression, but it does not work as expected. In case of leaf token
> subexpression both point to the same source location (before the token). What am
> I doing wrong?

Actually, they don't point before and after the subexpression.  The
trick here is that "end" points to the *start* of the last token in the
range.  This makes construction of the source ranges much cleaner and
simpler, but pushes some logic into the clients.  Basically, to insert
text after the end of the range, you have to add the length of the last
token to the end location.  Luckily, this is really easy to get :)
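
As a rough illustration of the "end plus length of the last token" rule,
here is a small self-contained sketch.  It is a toy model rather than clang
code: ToyRange, measureToken and wrapDeref are hypothetical names, offsets
into a std::string stand in for source locations, and measureToken is a
much simplified stand-in for what a real lexer does (it only knows
identifier/number runs and single-character punctuation).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Toy stand-in for a source range (hypothetical, not a clang type).
// `end` is the offset of the START of the last token in the range.
struct ToyRange {
    std::size_t begin;
    std::size_t end;
};

// Simplified token measurement: a run of [A-Za-z0-9_] is one token,
// anything else counts as a one-character token.
std::size_t measureToken(const std::string &src, std::size_t pos) {
    auto isWord = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    if (pos >= src.size()) return 0;
    if (!isWord(src[pos])) return 1;
    std::size_t n = 0;
    while (pos + n < src.size() && isWord(src[pos + n])) ++n;
    return n;
}

// Wrap the range in DEREF_EXPR(...), placing ")" after the last token.
std::string wrapDeref(std::string src, ToyRange r) {
    // Step past the last token before inserting the closing paren.
    std::size_t pastEnd = r.end + measureToken(src, r.end);
    // Insert at the later offset first so the earlier offset stays valid.
    src.insert(pastEnd, ")");
    src.insert(r.begin, "DEREF_EXPR(");
    return src;
}
```

For "*bar = 1;" with range {1, 1}, the last token 'bar' has length 3, so
the ")" goes in at offset 4 and the result is "*DEREF_EXPR(bar) = 1;".  For
"*(bar+1) = 1;" with range {1, 7}, the ")" now goes after the existing one.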

The HTML rewriter uses code like this:

   SourceManager &SM = ...
   SourceLocation E = Range.getEnd();

   // If E is a macro expansion, we want the instantiation location.
   // You should determine how you want to handle this.  There are many
   // possible strategies.
   E = SM.getLogicalLoc(E);

   // Add the size of the end token.
   E = E.getFileLocWithOffset(Lexer::MeasureTokenLength(E, SM));
   mRewriter.InsertCStrAfter(E, ")");

Please let me know if that doesn't work.

-Chris


