[cfe-dev] Expr::getSourceRange() doesn't return full range

Stephen Deken stephen.deken at gmail.com
Wed Mar 17 11:46:43 PDT 2010


I've got a subclass of clang::Action (actually a subclass of
clang::Sema, even though Sema is currently private) that provides an
implementation of ActOnExprStmt.  The Action::FullExprArg passed to
this function can be cast to a clang::Expr* (courtesy of Sema, I
believe).  My subclass holds a reference to a SourceManager in a
member variable called SM.

I have the following code for ActOnExprStmt (omitting error checking
for clarity):

    virtual Action::OwningStmtResult ActOnExprStmt( Action::FullExprArg expr )
    {
        // Recover the Expr* from the argument.
        Expr *E = reinterpret_cast<Expr*>( expr->get() );
        SourceRange sr = E->getSourceRange();
        // Map both ends of the range to pointers into the source buffer.
        const char* pBegin = SM.getCharacterData( sr.getBegin() );
        const char* pEnd = SM.getCharacterData( sr.getEnd() );
        std::cout << "got expr: " << std::string( pBegin, pEnd ) << std::endl;
        return Sema::ActOnExprStmt( expr );
    }

When I parse code like the following,

    void foo()
    {
        int a;
        a = (2 + 2) - 100;
    }

the assignment expression `a = (2 + 2) - 100` is passed into my
ActOnExprStmt function, so I would expect to see the following output:

    got expr: a = (2 + 2) - 100

Instead, I see this:

    got expr: a = (2 + 2) -

Note the trailing space after the minus sign.  Evidently,
SourceRange::getEnd() returns the location of the start of the last
token in the range, not the location just past its end.
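
To make the difference concrete, here is a minimal, self-contained
sketch that does not use Clang at all.  It assumes a range whose end
offset marks the first character of the last token, which is what I
appear to be seeing.  TokenRange, measureToken, naiveText and fullText
are hypothetical names invented for this illustration, and the toy
token measurer only understands identifiers, numbers and
single-character punctuation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a token-based source range: both offsets
// point at the FIRST character of a token.
struct TokenRange {
    std::size_t begin;  // start of the first token in the range
    std::size_t end;    // start of the last token in the range
};

// Hypothetical helper: length of the token starting at `pos`.
// Toy rules only: a run of [A-Za-z0-9_] is one token, anything else
// is treated as a one-character token.
std::size_t measureToken(const std::string& src, std::size_t pos) {
    assert(pos < src.size());
    auto isWord = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    if (!isWord(src[pos])) return 1;
    std::size_t n = 0;
    while (pos + n < src.size() && isWord(src[pos + n])) ++n;
    return n;
}

// What my code effectively does: treat `end` as one-past-the-end.
// This drops the last token.
std::string naiveText(const std::string& src, TokenRange r) {
    return src.substr(r.begin, r.end - r.begin);
}

// Extend `end` by the length of the last token before slicing.
std::string fullText(const std::string& src, TokenRange r) {
    std::size_t pastEnd = r.end + measureToken(src, r.end);
    return src.substr(r.begin, pastEnd - r.begin);
}
```

With the source string "a = (2 + 2) - 100;" and a range of {0, 14}
(offset 14 being the start of "100"), naiveText cuts the text off right
before "100" and leaves the trailing space, while fullText includes the
whole literal.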

I was able to get the intended behavior (for this specific case) by
adding a second SourceLocation instance to the IntegerLiteral class,
adding a constructor that takes an additional SourceLocation, and
changing the call within Sema::ActOnNumericConstant() to call the new
constructor using
Tok.getLocation().getFileLocWithOffset(Tok.getLength()) as the end
location.

The same thing happens with, for example, ActOnDeclarator().  The
Declarator& instance passed to this function has a getSourceRange()
method, but the end of the range is the beginning of the last token.
For example, I get only the "int " part of "int a", or everything up
to the last paren of "int (*b)( void )".

This makes me sad :(

Am I meowing down the wrong shrubbery here?  It doesn't seem like it
ought to be this hard to grab the full source text of an expression.

Options:

    1) Just don't do this ("Bad Stephen!  Bad!")
    2) Modify all of the Expr subclasses to both
        a) add a second SourceLocation (or just use a SourceRange instead)
        b) modify/add a constructor that populates that SourceLocation
    3) Something blindingly obvious that I'm blissfully unaware of

Any pointers?  Preferably non-null?

-- 
Stephen Deken
stephen.deken at gmail.com


