[cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp test/Lexer/escape_newline.c

Wed Dec 21 14:51:21 PST 2011

I'll try this myself when I get a chance but I think if you adds other
line at the end of your test file you'll see the crash still because
the spelling loops will jump past their end pointer, becoming unbounded.

You could change those loops to use < rather than != and then perhaps
if you move things in those loops around a bit you could just not add
the char to the buffer if such a condition occurred. This would fix the
spelling to be exact/correct too, though I think it would leave these
functions with somewhat subtle semantics. Clear warning might suffice
given there's only 3 callers or something - though I'd probably err
towards a less surprising api
From: Argyrios Kyrtzidis
Sent: 12/21/2011 12:38 PM
To: David Blaikie
Cc: cfe-commits at cs.uiuc.edu
Subject: Re: [cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp
test/Lexer/escape_newline.c
On Dec 21, 2011, at 2:19 PM, David Blaikie wrote:

> I suspect this isn't the right fix, but I'd have to test it further. If
> the newline following the curious comment isn't the last character in
> the file, this would still cause the original crash, wouldn't it?
> (since this only detects the end of file case, but still might walk off
> the end of the token we're currently spelling, which will cause thr
> speller to jump past its end pointer as it was before)

If the new line is not the last character it won't walk off the end
because getCharAndSizeSlow will continue returning the characters one
by one.

>
> Also, even in the specific case this appears to remove the crash, it
> does so by spelling the token incorrectly, doesn't it?
>
> // \
> x
>
> Is spelled '// x' but
>
> // \
>
>
> Is spelled as '// \' which seems inconsistent.

That's true; how about if it returned a space instead of '\' ?
Ideally spelling would halt before '\' but I think having it return a
space is a reasonable and simpler fix.

-Argyrios

>
> I've thought about this bug a bi since I discovered it and haven't
> found a 'nice' fix yet. Possibly using the size as an in/out parameter
> (or adding an in parameter) to describe the size of the token being
> read - and testing it after the call (so as not to add the returned
> character if there was none available)... Or passing the pointer as
> in/out and having this function increment it directly, then test != end
> before appending
> From: Argyrios Kyrtzidis
> Sent: 12/21/2011 10:25 AM
> To: cfe-commits at cs.uiuc.edu
> Subject: [cfe-commits] r147091 - in /cfe/trunk: lib/Lex/Lexer.cpp
> test/Lexer/escape_newline.c
> Author: akirtzidis
> Date: Wed Dec 21 14:19:55 2011
> New Revision: 147091
>
> URL: http://llvm.org/viewvc/llvm-project?rev=147091&view=rev
> Log:
> In Lexer::getCharAndSizeSlow[NoWarn] make sure we don't go over the
> end of the buffer
> when the end of the buffer is immediately after an escaped newline.
>
> Fixes http://llvm.org/PR10153.
>
> Modified:
>    cfe/trunk/lib/Lex/Lexer.cpp
>    cfe/trunk/test/Lexer/escape_newline.c
>
> Modified: cfe/trunk/lib/Lex/Lexer.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=147091&r1=147090&r2=147091&view=diff
> ==============================================================================
> --- cfe/trunk/lib/Lex/Lexer.cpp (original)
> +++ cfe/trunk/lib/Lex/Lexer.cpp Wed Dec 21 14:19:55 2011
> @@ -1171,6 +1171,10 @@
>       // Found backslash<whitespace><newline>.  Parse the char after it.
>       Size += EscapedNewLineSize;
>       Ptr  += EscapedNewLineSize;
> +
> +      if (Ptr[0] == '\0')
> +        return '\\';
> +
>       // Use slow version to accumulate a correct size field.
>       return getCharAndSizeSlow(Ptr, Size, Tok);
>     }
> @@ -1222,6 +1226,9 @@
>       Size += EscapedNewLineSize;
>       Ptr  += EscapedNewLineSize;
>
> +      if (Ptr[0] == '\0')
> +        return '\\';
> +
>       // Use slow version to accumulate a correct size field.
>       return getCharAndSizeSlowNoWarn(Ptr, Size, Features);
>     }
>
> Modified: cfe/trunk/test/Lexer/escape_newline.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/escape_newline.c?rev=147091&r1=147090&r2=147091&view=diff
> ==============================================================================
> --- cfe/trunk/test/Lexer/escape_newline.c (original)
> +++ cfe/trunk/test/Lexer/escape_newline.c Wed Dec 21 14:19:55 2011
> @@ -1,7 +1,10 @@
> // RUN: %clang_cc1 -E -trigraphs %s | grep -- ' ->'
> // RUN: %clang_cc1 -E -trigraphs %s 2>&1 | grep 'backslash and
> newline separated by space'
> // RUN: %clang_cc1 -E -trigraphs %s 2>&1 | grep 'trigraph converted'
> +// RUN: %clang_cc1 -E -CC -trigraphs %s
>
> // This is an ugly way to spell a -> token.
>  -??/
>>
> +
> +// \
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits