[cfe-commits] r90860 - in /cfe/trunk: lib/Lex/Lexer.cpp test/Lexer/msdos-cpm-eof.c

Chris Lattner clattner at apple.com
Tue Dec 8 10:35:25 PST 2009


On Dec 8, 2009, at 8:38 AM, Steve Naroff wrote:

> Author: snaroff
> Date: Tue Dec  8 10:38:12 2009
> New Revision: 90860
>
> URL: http://llvm.org/viewvc/llvm-project?rev=90860&view=rev
> Log:
> Integrate the following from the 'objective-rewrite' branch:
>
> http://llvm.org/viewvc/llvm-project?view=rev&revision=80043

Hi Steve,

This patch isn't great for two reasons: 1) it makes the CharInfo
array writable instead of read-only, and 2) it gives us mutable
global state that isn't thread-safe (if we have two files compiling
on separate threads, badness will happen).

I think a better approach would be to make a little class like this:

class LexerCharInfo {
   // Points at either the shared const table or a private copy.
   const unsigned char *CharData;

public:
   LexerCharInfo(const LangOptions &);

   bool isIdentifierBody(unsigned char c) const { ... }
   bool isHorizontalWhitespace(unsigned char c) const { ... }
};

and have the Lexer object have one of these embedded into it.  At  
initialization time, if the Features.Microsoft bit isn't set, the  
pointer can point to the const array.  If it is set, the LexerCharInfo  
constructor can new[] an array, memcpy from the constant buffer, then  
chop it up as it sees fit.  Seem reasonable?
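
To make that concrete, here is a rough, self-contained sketch of the idea. It is an illustration, not the real Lexer code: the LangOptions struct, the CHAR_* flag values, and the DefaultCharInfo table built by buildDefaultTable() are simplified stand-ins I made up for the example, and it uses std::unique_ptr to own the copy rather than a raw new[]/delete[] pair.

```cpp
#include <array>
#include <cstring>
#include <memory>

// Hypothetical stand-in for clang's LangOptions; only the bit we need.
struct LangOptions { bool Microsoft = false; };

// Hypothetical flag values; the real ones live in Lexer.cpp.
enum : unsigned char {
  CHAR_HORZ_WS = 0x01,
  CHAR_LETTER  = 0x04,
  CHAR_NUMBER  = 0x08,
  CHAR_UNDER   = 0x10
};

// Builds a simplified ASCII classification table (assumption: the real
// table is a hand-written static initializer with more categories).
static std::array<unsigned char, 256> buildDefaultTable() {
  std::array<unsigned char, 256> T{};
  T[' '] = T['\t'] = T['\f'] = T['\v'] = CHAR_HORZ_WS;
  for (unsigned c = 'a'; c <= 'z'; ++c) T[c] = CHAR_LETTER;
  for (unsigned c = 'A'; c <= 'Z'; ++c) T[c] = CHAR_LETTER;
  for (unsigned c = '0'; c <= '9'; ++c) T[c] = CHAR_NUMBER;
  T['_'] = CHAR_UNDER;
  return T;
}

// Shared table: never written after initialization, so safe to read
// from any number of threads.
static const std::array<unsigned char, 256> DefaultCharInfo =
    buildDefaultTable();

class LexerCharInfo {
  std::unique_ptr<unsigned char[]> Owned; // non-null only for a private copy
  const unsigned char *CharData;          // shared table or Owned.get()

public:
  explicit LexerCharInfo(const LangOptions &Features)
      : CharData(DefaultCharInfo.data()) {
    if (Features.Microsoft) {
      // Copy the shared table and patch the copy; the shared one stays const.
      Owned.reset(new unsigned char[256]);
      std::memcpy(Owned.get(), DefaultCharInfo.data(), 256);
      Owned[26 /*sub*/] = CHAR_HORZ_WS; // DOS & CP/M EOF (^Z)
      CharData = Owned.get();
    }
  }

  bool isIdentifierBody(unsigned char c) const {
    return (CharData[c] & (CHAR_LETTER | CHAR_NUMBER | CHAR_UNDER)) != 0;
  }
  bool isHorizontalWhitespace(unsigned char c) const {
    return (CharData[c] & CHAR_HORZ_WS) != 0;
  }
};
```

With something along these lines, each Lexer would carry its own LexerCharInfo, so a Microsoft-mode lexer and a standard-mode lexer can run side by side without touching each other's tables.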

Also, a more minor issue, InitCharacterInfo should take LangOptions by  
const reference.

-Chris

>
>
> Added:
>    cfe/trunk/test/Lexer/msdos-cpm-eof.c
> Modified:
>    cfe/trunk/lib/Lex/Lexer.cpp
>
> Modified: cfe/trunk/lib/Lex/Lexer.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=90860&r1=90859&r2=90860&view=diff
>
> ==============================================================================
> --- cfe/trunk/lib/Lex/Lexer.cpp (original)
> +++ cfe/trunk/lib/Lex/Lexer.cpp Tue Dec  8 10:38:12 2009
> @@ -33,7 +33,7 @@
> #include <cctype>
> using namespace clang;
>
> -static void InitCharacterInfo();
> +static void InitCharacterInfo(LangOptions);
>
> //===----------------------------------------------------------------------===//
> // Token Class Implementation
> @@ -59,7 +59,7 @@
>
> void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
>                       const char *BufEnd) {
> -  InitCharacterInfo();
> +  InitCharacterInfo(Features);
>
>   BufferStart = BufStart;
>   BufferPtr = BufPtr;
> @@ -253,7 +253,7 @@
>
> // Statically initialize CharInfo table based on ASCII character set
> // Reference: FreeBSD 7.2 /usr/share/misc/ascii
> -static const unsigned char CharInfo[256] =
> +static unsigned char CharInfo[256] =
> {
> // 0 NUL         1 SOH         2 STX         3 ETX
> // 4 EOT         5 ENQ         6 ACK         7 BEL
> @@ -321,7 +321,7 @@
>    0           , 0           , 0           , 0
> };
>
> -static void InitCharacterInfo() {
> +static void InitCharacterInfo(LangOptions Features) {
>   static bool isInited = false;
>   if (isInited) return;
>   // check the statically-initialized CharInfo table
> @@ -339,6 +339,11 @@
>   }
>   for (unsigned i = '0'; i <= '9'; ++i)
>     assert(CHAR_NUMBER == CharInfo[i]);
> +
> +  if (Features.Microsoft)
> +    // Hack to treat DOS & CP/M EOF (^Z) as horizontal whitespace.
> +    CharInfo[26/*sub*/] = CHAR_HORZ_WS;
> +
>   isInited = true;
> }
>
>
> Added: cfe/trunk/test/Lexer/msdos-cpm-eof.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/msdos-cpm-eof.c?rev=90860&view=auto
>
> ==============================================================================
> --- cfe/trunk/test/Lexer/msdos-cpm-eof.c (added)
> +++ cfe/trunk/test/Lexer/msdos-cpm-eof.c Tue Dec  8 10:38:12 2009
> @@ -0,0 +1,5 @@
> +// RUN: clang-cc -fsyntax-only -verify -fms-extensions %s
> +
> +int a;
> +
> +
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits




