[cfe-dev] confusing getLocEnd() behavior

Sergejs Belajevs sergejs.belajevs at gmail.com
Wed Jul 11 21:45:19 PDT 2012


Hi,

I am working on a source-to-source transformation tool and want to get
the original source code for a statement, token by token. I am using
the statement's getLocStart/getLocEnd, SourceLocation's getLocWithOffset,
SourceManager's getCharacterData and Lexer::MeasureTokenLength. My
code worked fine until I ran into these DeclStmts:

1) struct A { int a; } s;
2) struct A { int a; };
3) union A { int a; };
4) union { int a; };

For 1) getLocEnd() works fine.
For 2) my code doesn't work because getLocEnd() compares less than
getLocStart(): the end location's getRawEncoding() returns 0, i.e. the
location is invalid. I found a workaround for this case by calling
getLocEnd() on the DeclStmt's getSingleDecl().
Case 3) has the same problem as 2), and the same workaround works fine.
Case 4) has the same problem as 2), but this time, after applying my
workaround, the resulting SourceLocation is the same as getLocStart(),
that is, it points to the token "union".

So I guess the questions are:
* Is this expected behavior? If yes, then what exactly does getLocEnd()
return?
* How can I get the end location for case 4)?
* Is there a better way to get a statement as token strings?
* As an alternative to the previous question, can I somehow find the
total character length of a statement in the original source code?
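
To make the last question concrete, here is a Clang-free sketch of the
arithmetic I have in mind. It assumes the end location points at the
first character of the last token, so the character length is the end
offset plus the length of that token, minus the start offset. Loc,
measureTokenLength and statementText are hypothetical stand-ins for
illustration only (they are not Clang APIs), and the toy lexer only
knows identifiers/numbers and single-character punctuation:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical model of a source location: an offset into the buffer
// plus a validity flag (an invalid one plays the role of raw encoding 0).
struct Loc {
    std::size_t offset;
    bool valid;
};

// Toy stand-in for Lexer::MeasureTokenLength: length of the token that
// starts at `pos`. Identifiers/numbers are scanned as a run; anything
// else is treated as a one-character token.
std::size_t measureTokenLength(const std::string &src, std::size_t pos) {
    if (pos >= src.size())
        return 0;
    auto isWord = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    if (!isWord(static_cast<unsigned char>(src[pos])))
        return 1;
    std::size_t end = pos;
    while (end < src.size() && isWord(static_cast<unsigned char>(src[end])))
        ++end;
    return end - pos;
}

// Copies the text from the first character of the first token through
// the last character of the last token into `out`. Returns false when
// either location is invalid or the range is reversed or out of bounds,
// which is the situation in cases 2) to 4).
bool statementText(const std::string &src, Loc begin, Loc end,
                   std::string &out) {
    if (!begin.valid || !end.valid)
        return false;
    if (end.offset < begin.offset || end.offset >= src.size())
        return false;
    // The end location marks the START of the last token, so extend it
    // by that token's length to get a character range.
    std::size_t stop = end.offset + measureTokenLength(src, end.offset);
    out = src.substr(begin.offset, stop - begin.offset);
    return true;
}
```

With a valid end location this gives the full statement text and hence
its length (out.size()); with an invalid one it fails up front instead
of producing a negative length.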


Thanks,
Sergejs


