[cfe-dev] Problem in locations

Wed Aug 12 09:24:04 PDT 2009

On Aug 12, 2009, at 8:04 AM, Abramo Bagnara wrote:

> If I try to compile the attached program I get:
>
> $ ~/llvm/Debug/bin/clang-cc -pedantic -std=c89 z.c
> z.c:2:9: warning: variable declaration in for loop is a C99-specific feature
>  for ( \
>        ^
> 1 diagnostic generated.
>
> The token start is reported not at the position of the "i" of "int"
> but on the previous line, and the token length is set to 5.
>
> Is this intentional, or is it a bug?
>
> IMHO, having a leading backslash-newline as part of the token confuses
> the diagnostic without any benefit.
> int p() {
>  for ( \
> int i = 0; i < 10; ++i)
>    ;
>  return 0;
> }

That is perhaps not the best quality of implementation for the
diagnostic, but it is intended.  You're hitting issues that are due to
the phases of translation in C.  The first two phases replace trigraphs
and remove escaped newlines (which, as a GNU extension, can have
horizontal whitespace between the backslash and the newline... urg).
Because the lexer fully integrates the various phases of translation,
the source location of a token is the first byte in the file that is
part of that token.  In this case, that byte is the backslash of the
escaped newline.
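
To make the paragraph above concrete, here is a rough sketch of how a lexer that folds escaped newlines into token scanning ends up with a 5-byte 'int' token starting on the previous line. This is not clang's code: RawToken, escapedNewlineLen and lexIdentifier are hypothetical names invented for this illustration, and trigraphs are left out to keep it short.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical type for this sketch: the raw extent of a token in the file.
struct RawToken {
    std::size_t start;   // offset of the first byte that belongs to the token
    std::size_t length;  // number of raw bytes, escaped newlines included
};

// Length of an escaped newline at s[pos], or 0 if there is none.
// Accepts backslash, optional spaces/tabs (GNU extension), then a newline.
std::size_t escapedNewlineLen(const std::string &s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '\\') return 0;
    std::size_t i = pos + 1;
    while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    if (i < s.size() && s[i] == '\n') return i + 1 - pos;
    return 0;
}

// Skip plain whitespace from pos, then scan one identifier.
// A backslash is not whitespace, so the token starts at the backslash.
RawToken lexIdentifier(const std::string &src, std::size_t pos) {
    while (pos < src.size() &&
           (src[pos] == ' ' || src[pos] == '\t' || src[pos] == '\n'))
        ++pos;
    std::size_t start = pos;
    std::size_t end = pos;  // one past the last byte known to be in the token
    while (pos < src.size()) {
        if (std::size_t skip = escapedNewlineLen(src, pos)) {
            pos += skip;  // the escaped newline is swallowed mid-token
            continue;
        }
        unsigned char c = static_cast<unsigned char>(src[pos]);
        if (!(std::isalnum(c) || c == '_')) break;
        ++pos;
        end = pos;
    }
    return RawToken{start, end - start};
}
```

Calling lexIdentifier(" for ( \\\nint i = 0;", 6), i.e. starting just after the '(', gives start 7 (the backslash) and length 5: two bytes of escaped newline plus the three bytes of "int".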

If you ask the preprocessor for the 'spelling' of the token, you'll
get 'int'.  If you want the location of the n'th actual character
("i", "n", "t") of the token, you'll need to use something like
StringLiteralParser::getOffsetOfStringByte (but without the 'escape'
processing) to advance over escaped newlines and trigraphs.
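
As a rough illustration of that kind of walk, the sketch below maps the n'th character of a token's spelling back to a byte offset in the raw token text, stepping over escaped newlines and counting a trigraph as one character. It is only a sketch under assumptions: it does not use clang's API, and trigraphAt, escapedNewlineLength and offsetOfSpellingChar are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <string>

// If a trigraph starts at s[pos], return the character it stands for, else 0.
char trigraphAt(const std::string &s, std::size_t pos) {
    if (pos + 2 >= s.size() || s[pos] != '?' || s[pos + 1] != '?') return 0;
    switch (s[pos + 2]) {
        case '=':  return '#';
        case '(':  return '[';
        case '/':  return '\\';
        case ')':  return ']';
        case '\'': return '^';
        case '<':  return '{';
        case '!':  return '|';
        case '>':  return '}';
        case '-':  return '~';
        default:   return 0;
    }
}

// Length of an escaped newline at s[pos], or 0 if there is none. The
// backslash may be literal or spelled as a trigraph, and may be followed
// by spaces/tabs (GNU extension) before "\n", "\r" or "\r\n".
std::size_t escapedNewlineLength(const std::string &s, std::size_t pos) {
    std::size_t i;
    if (pos < s.size() && s[pos] == '\\') i = pos + 1;
    else if (trigraphAt(s, pos) == '\\') i = pos + 3;
    else return 0;
    while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    if (i >= s.size()) return 0;
    if (s[i] == '\n') return i + 1 - pos;
    if (s[i] == '\r') {
        ++i;
        if (i < s.size() && s[i] == '\n') ++i;
        return i - pos;
    }
    return 0;
}

// Byte offset in `raw` of the n'th (0-based) spelling character,
// or std::string::npos if the spelling has fewer than n+1 characters.
std::size_t offsetOfSpellingChar(const std::string &raw, std::size_t n) {
    std::size_t pos = 0;
    while (pos < raw.size()) {
        if (std::size_t skip = escapedNewlineLength(raw, pos)) {
            pos += skip;  // escaped newlines contribute nothing to the spelling
            continue;
        }
        if (n == 0) return pos;
        --n;
        pos += trigraphAt(raw, pos) ? 3 : 1;  // a trigraph is one character
    }
    return std::string::npos;
}
```

For the token in this thread the raw text is backslash, newline, "int", so offsetOfSpellingChar("\\\nint", 0) is 2: adding 2 to the token's start offset lands on the "i" on the following line.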

-Chris