[cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp test/Lexer/escape_newline.c

Wed Dec 21 20:45:01 PST 2011

On Dec 21, 2011, at 2:51 PM, David Blaikie wrote:

> I'll try this myself when I get a chance but I think if you adds other
> line at the end of your test file you'll see the crash still because
> the spelling loops will jump past their end pointer, becoming unbounded.

Good catch!
See r147138, I made a change to avoid consuming the second newline. This actually matches how the size of the comment token was determined in the first place.

-Argyrios

> 
> You could change those loops to use < rather than != and then perhaps
> if you move things in those loops around a bit you could just not add
> the char to the buffer if such a condition occurred. This would fix the
> spelling to be exact/correct too, though I think it would leave these
> functions with somewhat subtle semantics. Clear warning might suffice
> given there's only 3 callers or something - though I'd probably err
> towards a less surprising api
> From: Argyrios Kyrtzidis
> Sent: 12/21/2011 12:38 PM
> To: David Blaikie
> Cc: cfe-commits at cs.uiuc.edu
> Subject: Re: [cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp
> test/Lexer/escape_newline.c
> On Dec 21, 2011, at 2:19 PM, David Blaikie wrote:
> 
>> I suspect this isn't the right fix, but I'd have to test it further. If
>> the newline following the curious comment isn't the last character in
>> the file, this would still cause the original crash, wouldn't it?
>> (since this only detects the end of file case, but still might walk off
>> the end of the token we're currently spelling, which will cause thr
>> speller to jump past its end pointer as it was before)
> 
> If the new line is not the last character it won't walk off the end
> because getCharAndSizeSlow will continue returning the characters one
> by one.
> 
>> 
>> Also, even in the specific case this appears to remove the crash, it
>> does so by spelling the token incorrectly, doesn't it?
>> 
>> // \
>> x
>> 
>> Is spelled '// x' but
>> 
>> // \
>> 
>> 
>> Is spelled as '// \' which seems inconsistent.
> 
> That's true; how about if it returned a space instead of '\' ?
> Ideally spelling would halt before '\' but I think having it return a
> space is a reasonable and simpler fix.
> 
> -Argyrios
> 
>> 
>> I've thought about this bug a bi since I discovered it and haven't
>> found a 'nice' fix yet. Possibly using the size as an in/out parameter
>> (or adding an in parameter) to describe the size of the token being
>> read - and testing it after the call (so as not to add the returned
>> character if there was none available)... Or passing the pointer as
>> in/out and having this function increment it directly, then test != end
>> before appending
>> From: Argyrios Kyrtzidis
>> Sent: 12/21/2011 10:25 AM
>> To: cfe-commits at cs.uiuc.edu
>> Subject: [cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp
>> test/Lexer/escape_newline.c
>> Author: akirtzidis
>> Date: Wed Dec 21 14:19:55 2011
>> New Revision: 147091
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=147091&view=rev
>> Log:
>> In Lexer::getCharAndSizeSlow[NoWarn] make sure we don't go over the
>> end of the buffer
>> when the end of the buffer is immediately after an escaped newline.
>> 
>> Fixes http://llvm.org/PR10153.
>> 
>> Modified:
>>   cfe/trunk/lib/Lex/Lexer.cpp
>>   cfe/trunk/test/Lexer/escape_newline.c
>> 
>> Modified: cfe/trunk/lib/Lex/Lexer.cpp
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=147091&r1=147090&r2=147091&view=diff
>> ==============================================================================
>> --- cfe/trunk/lib/Lex/Lexer.cpp (original)
>> +++ cfe/trunk/lib/Lex/Lexer.cpp Wed Dec 21 14:19:55 2011
>> @@ -1171,6 +1171,10 @@
>>      // Found backslash<whitespace><newline>.  Parse the char after it.
>>      Size += EscapedNewLineSize;
>>      Ptr  += EscapedNewLineSize;
>> +
>> +      if (Ptr[0] == '\0')
>> +        return '\\';
>> +
>>      // Use slow version to accumulate a correct size field.
>>      return getCharAndSizeSlow(Ptr, Size, Tok);
>>    }
>> @@ -1222,6 +1226,9 @@
>>      Size += EscapedNewLineSize;
>>      Ptr  += EscapedNewLineSize;
>> 
>> +      if (Ptr[0] == '\0')
>> +        return '\\';
>> +
>>      // Use slow version to accumulate a correct size field.
>>      return getCharAndSizeSlowNoWarn(Ptr, Size, Features);
>>    }
>> 
>> Modified: cfe/trunk/test/Lexer/escape_newline.c
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/escape_newline.c?rev=147091&r1=147090&r2=147091&view=diff
>> ==============================================================================
>> --- cfe/trunk/test/Lexer/escape_newline.c (original)
>> +++ cfe/trunk/test/Lexer/escape_newline.c Wed Dec 21 14:19:55 2011
>> @@ -1,7 +1,10 @@
>> // RUN: %clang_cc1 -E -trigraphs %s | grep -- ' ->'
>> // RUN: %clang_cc1 -E -trigraphs %s 2>&1 | grep 'backslash and
>> newline separated by space'
>> // RUN: %clang_cc1 -E -trigraphs %s 2>&1 | grep 'trigraph converted'
>> +// RUN: %clang_cc1 -E -CC -trigraphs %s
>> 
>> // This is an ugly way to spell a -> token.
>> -??/
>>> 
>> +
>> +// \
>> 
>> 
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits