[cfe-dev] Bug: Lexer::getLocForEndOfToken() returns a position too far for a token that includes backslash-newline pairs

John McCall rjmccall at apple.com
Mon Apr 4 10:27:51 PDT 2011


On Apr 4, 2011, at 8:27 AM, Marcin Kowalczyk wrote:
> My setup is complicated but I think I nailed down the problem here.
> Hopefully you will be able to reproduce it.
> 
> Given a token with embedded backslash-newline pairs, such as:
>  "foo\
>  bar\
>  baz"
> Lexer::getLocForEndOfToken() returns a location that is too far by
> twice the number of backslash-newline pairs, as if these characters
> were counted twice. Looking at the implementation:
> 
> SourceLocation Lexer::getLocForEndOfToken(SourceLocation Loc, unsigned Offset,
>                                          const SourceManager &SM,
>                                          const LangOptions &Features) {
>  if (Loc.isInvalid() || !Loc.isFileID())
>    return SourceLocation();
> 
>  unsigned Len = Lexer::MeasureTokenLength(Loc, SM, Features);
>  printf("Token length: %u\n", Len);  // debug output; Len is unsigned
>  if (Len > Offset)
>    Len = Len - Offset;
>  else
>    return Loc;
> 
>  return AdvanceToTokenCharacter(Loc, Len, SM, Features);
> }
> 
> I guess that MeasureTokenLength() counts any backslash-newline
> pairs in the length, but AdvanceToTokenCharacter() skips over them
> without counting them, so each pair is accounted for twice. I'm not
> sure in which direction this should be fixed.

I *think* the right solution here is for getLocForEndOfToken to just use
getFileLocWithOffset instead of AdvanceToTokenCharacter.  Would
you mind writing that up and testing it?
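
To make the mismatch concrete, here is a toy model in plain C++. It does
not use Clang at all: the buffer is a std::string, locations are byte
offsets, and measureTokenLength, advanceToTokenCharacter, endByRawOffset
and overshoot are hypothetical stand-ins written for this sketch, not the
real Lexer functions. It assumes, as the report guesses, that the measured
length is a raw byte count while the advance step skips backslash-newline
pairs without counting them. The toy only handles a simple string literal
with no escaped quotes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in: raw length of a string-literal token starting at `start`,
// counting every byte up to and including the closing quote.
// Backslash-newline pairs are included in the count.
std::size_t measureTokenLength(const std::string& buf, std::size_t start) {
    std::size_t i = start + 1;                 // skip the opening quote
    while (i < buf.size() && buf[i] != '"') ++i;
    if (i < buf.size()) ++i;                   // include the closing quote
    return i - start;
}

// Toy stand-in: advance by `n` logical characters. A backslash-newline
// pair is stepped over but does not consume any of the `n` characters.
std::size_t advanceToTokenCharacter(const std::string& buf,
                                    std::size_t start, std::size_t n) {
    std::size_t pos = start;
    while (n > 0 && pos < buf.size()) {
        if (pos + 1 < buf.size() && buf[pos] == '\\' && buf[pos + 1] == '\n') {
            pos += 2;                          // skipped, not counted
            continue;
        }
        ++pos;
        --n;
    }
    return pos;
}

// Toy analogue of the suggested fix: add the raw length as a plain offset.
std::size_t endByRawOffset(const std::string& buf, std::size_t start) {
    return start + measureTokenLength(buf, start);
}

// How far the "measure raw, then advance logically" combination lands
// past the true end of the token.
std::size_t overshoot(const std::string& buf, std::size_t start) {
    std::size_t rawEnd = endByRawOffset(buf, start);
    std::size_t advanced =
        advanceToTokenCharacter(buf, start, measureTokenLength(buf, start));
    assert(advanced >= rawEnd);
    return advanced - rawEnd;
}
```

For the buffer "\"foo\\\nbar\\\nbaz\" + 1;\n" (the three-line literal from the
report followed by a little trailing text), overshoot(buf, 0) is 4, i.e.
2 bytes for each of the 2 backslash-newline pairs, while endByRawOffset
lands just past the closing quote. In this toy, mixing a raw length with
a logical advance is the whole problem; either both steps count raw bytes
or both count logical characters.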

John.


