[cfe-dev] Bug: Lexer::getLocForEndOfToken() returns a position too far for a token which includes backslash-newline pairs

Marcin Kowalczyk qrczak at google.com
Mon Apr 4 08:27:46 PDT 2011


My setup is complicated but I think I nailed down the problem here.
Hopefully you will be able to reproduce it.

Given a token with embedded backslash-newline pairs, such as:
  "foo\
  bar\
  baz"
Lexer::getLocForEndOfToken() returns a location which is too far by
the number of backslash-newline pairs times 2, as if these characters
were counted twice. Looking at the implementation:

SourceLocation Lexer::getLocForEndOfToken(SourceLocation Loc, unsigned Offset,
                                          const SourceManager &SM,
                                          const LangOptions &Features) {
  if (Loc.isInvalid() || !Loc.isFileID())
    return SourceLocation();

  unsigned Len = Lexer::MeasureTokenLength(Loc, SM, Features);
  printf("Token length: %u\n", Len);  // debugging output I added; Len is unsigned
  if (Len > Offset)
    Len = Len - Offset;
  else
    return Loc;

  return AdvanceToTokenCharacter(Loc, Len, SM, Features);
}

I guess that MeasureTokenLength() includes any backslash-newline
pairs, but AdvanceToTokenCharacter() skips them. I'm not sure in which
direction this should be fixed.
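
To make the suspected mismatch concrete, here is a small standalone sketch.
It is a toy model, not Clang's code: measureRawLength() and advanceLogical()
are hypothetical stand-ins for what I guess MeasureTokenLength() and
AdvanceToTokenCharacter() do, and it only handles a string literal that
starts with a double quote.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the measuring step: returns the raw byte
// length of a string-literal token starting at buf[start], counting
// every byte, including backslash-newline pairs.
std::size_t measureRawLength(const std::string &buf, std::size_t start) {
  std::size_t i = start + 1;  // step over the opening quote
  while (i < buf.size() && buf[i] != '"') {
    if (buf[i] == '\\' && i + 1 < buf.size())
      i += 2;  // an escape sequence or a backslash-newline pair
    else
      ++i;
  }
  // Include the closing quote if one was found.
  return (i < buf.size() ? i + 1 : buf.size()) - start;
}

// Hypothetical stand-in for the advancing step: moves forward by n
// logical characters, where backslash-newline pairs are skipped and
// do not count towards n. Returns the resulting raw offset.
std::size_t advanceLogical(const std::string &buf, std::size_t start,
                           std::size_t n) {
  std::size_t i = start;
  while (n > 0 && i < buf.size()) {
    // Skip any backslash-newline pairs in front of the next character.
    while (i + 1 < buf.size() && buf[i] == '\\' && buf[i + 1] == '\n')
      i += 2;
    if (i >= buf.size())
      break;
    ++i;  // consume one logical character
    --n;
  }
  return i;
}
```

With the buffer "\"foo\\\nbar\\\nbaz\" + 1;" the token occupies raw offsets
0 to 14, so the correct end offset is 15. In this model the raw length is 15,
but advancing by 15 logical characters lands on offset 19, which is 4 too far:
2 for each of the 2 backslash-newline pairs. Advancing by the 11 logical
characters instead lands on 15.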

I'm not subscribed to the list.

-- 
Marcin Kowalczyk


