[cfe-dev] confusing getLocEnd() behavior

Sergejs Belajevs sergejs.belajevs at gmail.com
Thu Jul 12 10:57:20 PDT 2012


Hi,

Here is a quick example for Visual Studio 2010: https://gist.github.com/3099625

Contents of a.cpp:

void foo()
{
   union { int a; };
}

When compiled against the libraries from clang 3.1, I get the following output:

top-level-decl: __builtin_va_list
top-level-decl: foo
(CompoundStmt 0x5ff8f8
  (DeclStmt 0x5ff8e8
    0x5ff690 "<anonymous union at a.cpp:3:4> =
      (CXXConstructExpr 0x5ff888 'union <anonymous at a.cpp:3:4>''void (void) throw()')"))
a.cpp:2:1 <=> a.cpp:4:1
(DeclStmt 0x5ff8e8
  0x5ff690 "<anonymous union at a.cpp:3:4> =
    (CXXConstructExpr 0x5ff888 'union <anonymous at a.cpp:3:4>''void (void) throw()')")
a.cpp:3:4 <=> <invalid loc>

As you can see, the range printed for the last DeclStmt is
"a.cpp:3:4 <=> <invalid loc>"; that is, getLocEnd() returned an invalid
location instead of the end of the statement, which is not what I expected.
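
To make the workaround from my first mail concrete, here is a minimal, self-contained sketch of the fallback logic. It is a model only, not Clang code: Loc, Range, measureToken, usableRange and tokens are hypothetical stand-ins for SourceLocation, SourceRange, Lexer::MeasureTokenLength and my own helpers. It assumes a single source buffer, 1-based raw offsets with 0 meaning "invalid" (as getRawEncoding() == 0 did above), and an end location that points at the first character of the last token.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang::SourceLocation: a 1-based offset into a
// single buffer, where 0 means "invalid".
struct Loc {
    unsigned raw;
    bool isValid() const { return raw != 0; }
};

// Hypothetical stand-in for a token range: both ends point at the first
// character of a token.
struct Range {
    Loc begin;
    Loc end;
};

static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Simplified stand-in for Lexer::MeasureTokenLength: identifiers and numbers
// are measured as a run, everything else as a single character.
std::size_t measureToken(const std::string& buf, std::size_t pos) {
    if (pos >= buf.size()) return 0;
    if (!isIdentChar(buf[pos])) return 1;
    std::size_t end = pos;
    while (end < buf.size() && isIdentChar(buf[end])) ++end;
    return end - pos;
}

static bool isUsable(const Range& r) {
    return r.begin.isValid() && r.end.isValid() && r.end.raw >= r.begin.raw;
}

// Prefer the statement's range; fall back to the declaration's range when the
// statement's end is invalid or lies before its begin.
bool usableRange(const Range& stmt, const Range& decl, Range& out) {
    if (isUsable(stmt)) { out = stmt; return true; }
    if (isUsable(decl)) { out = decl; return true; }
    return false;
}

// Collect every token that starts inside [begin, end], skipping whitespace.
// The token starting at `end` is included in full.
std::vector<std::string> tokens(const std::string& buf, const Range& r) {
    assert(isUsable(r));
    std::vector<std::string> out;
    std::size_t pos = r.begin.raw - 1;
    const std::size_t last = r.end.raw - 1;
    while (pos <= last && pos < buf.size()) {
        if (std::isspace(static_cast<unsigned char>(buf[pos]))) { ++pos; continue; }
        const std::size_t len = measureToken(buf, pos);
        out.push_back(buf.substr(pos, len));
        pos += len;
    }
    return out;
}
```

For case 2) from my first mail, the statement range would have an invalid end and the declaration range would end at the closing brace, so the fallback range covers the tokens from "struct" through "}". For case 4) the fallback end equals the start, so this sketch would still yield only "union"; it does not solve that case.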


Sergejs

On Thu, Jul 12, 2012 at 2:37 AM, Daniel Jasper <djasper at google.com> wrote:
> Could you post a bit more context (a minimal example) of what you are
> precisely looking at? I could not reproduce the error for the cases 1-4. If
> I put them outside of a method, they don't lead to a DeclStmt, but a
> CXXRecordDecl. If I put them into a method, e.g.:
>
> void f() {
>   union { int a; };
> }
>
> I get the DeclStmt but it seems to have the right code range (column 2 to
> 19).
>
> In general, the code location handling is quite inconsistent and we are
> currently looking into how to improve this.
>
>
> On Thu, Jul 12, 2012 at 6:45 AM, Sergejs Belajevs
> <sergejs.belajevs at gmail.com> wrote:
>>
>> Hi,
>>
>> I am working on a source-to-source transformation tool and want to get
>> the original source code for a statement token by token. I am using
>> statement's getLocStart/getLocEnd, SourceLocation's getLocWithOffset,
>> SourceManager's getCharacterData and Lexer::MeasureTokenLength. My
>> code worked fine until I ran into some DeclStmts:
>>
>> 1) struct A { int a; } s;
>> 2) struct A { int a; };
>> 3) union A { int a; };
>> 4) union { int a; };
>>
>> For 1) getLocEnd() works fine.
>> For 2) my code doesn't work because getLocEnd() is smaller than
>> getLocStart(). End's getRawEncoding() returns 0. I found a workaround
>> for this case by calling getLocEnd() of DeclStmt's getSingleDecl().
>> Case 3) has the same problem as 2), same workaround works fine.
>> Case 4) has the same problem as 2), but this time after applying my
>> workaround the resulting SourceLocation is the same as getLocStart(),
>> that is, it points to the token "union".
>>
>> So I guess the questions are:
>> * Is this expected behavior? If yes, then what exactly does getLocEnd()
>> return?
>> * How could I get the end location for case 4)?
>> * Is there a better way to get a statement as token strings?
>> * As an alternative to the previous question, can I somehow find the total
>> character length of a statement in the original source code?
>>
>>
>> Thanks,
>> Sergejs
>
>