[cfe-dev] Problem with enums as bitfields with MSVC compiler

Ted Kremenek kremenek at apple.com
Fri Feb 22 17:06:13 PST 2008


Patch applied!  In the future, please keep lines limited to 80  
columns.  Thanks!

On Feb 23, 2008, at 12:58 AM, Argiris Kirtzidis wrote:

> Hi all,
>
> There's an issue with classes that contain enums as bitfields, for  
> example:
>
> enum TokenKind {
> tk_some_identifier
> };
>
> class Test
> {
> TokenKind Kind : 8;
>
> public:
> TokenKind getKind() const { return Kind; }
> void setKind(TokenKind K) { Kind = K; }
> };
>
>
> Test t;
> t.setKind(tk_some_identifier);
> TokenKind result = t.getKind();
>
>
> MSVC treats enums as signed types by default, so in the example
> above it treats the "Kind" field as a signed char.
> Everything works as long as tk_some_identifier < 128. For
> tk_some_identifier >= 128, the 'result' variable will be negative.
>
> Enums that are used as bitfields usually have few values, so
> there's no problem. However, a big mess is created with the
> tok::TokenKind enum (in "clang/include/clang/Lex/Token.h"), which
> has far more than 128 values. This enum is included in class
> clang::Token and
> class clang::IdentifierInfo as an 8-bit field. The result is that, when
> compiling with MSVC, the token identifiers with values of 128 or more
> are not recognized at all,
> because when the parser checks the TokenKind field that the Lexer
> returned, it gets a negative value!
>
>
> In MSVC this would solve it:
>
> enum TokenKind : unsigned char {
> tk_some_identifier
> };
>
>
> But to avoid an ugly #define for this, you could change
> clang::Token and clang::IdentifierInfo along these lines:
>
> class Test
> {
> unsigned Kind : 8;    // not TokenKind
>
> public:
> TokenKind getKind() const { return (TokenKind)Kind; }
> void setKind(TokenKind K) { Kind = K; }
> };
>
>
> I've attached a patch that does this.
> <enum-tokenkind-fix.zip>
> Index: include/clang/Basic/IdentifierTable.h
> ===================================================================
> --- include/clang/Basic/IdentifierTable.h	(revision 47480)
> +++ include/clang/Basic/IdentifierTable.h	(working copy)
> @@ -36,7 +36,8 @@
> /// variable or function name).  The preprocessor keeps this information in a
> /// set, and all tok::identifier tokens have a pointer to one of these.
> class IdentifierInfo {
> -  tok::TokenKind TokenID      : 8; // Front-end token ID or tok::identifier.
> +  // DON'T make TokenID a 'tok::TokenKind'; MSVC will treat it as a signed char and TokenKinds > 127 won't be handled correctly.
> +  unsigned TokenID            : 8; // Front-end token ID or tok::identifier.
>   unsigned BuiltinID          : 9; // ID if this is a builtin (__builtin_inf).
>   tok::ObjCKeywordKind ObjCID : 5; // ID for objc @ keyword like @'protocol'.
>   bool HasMacro               : 1; // True if there is a #define for this.
> @@ -79,7 +80,7 @@
>   /// get/setTokenID - If this is a source-language token (e.g. 'for'), this API
>   /// can be used to cause the lexer to map identifiers to source-language
>   /// tokens.
> -  tok::TokenKind getTokenID() const { return TokenID; }
> +  tok::TokenKind getTokenID() const { return (tok::TokenKind)TokenID; }
>   void setTokenID(tok::TokenKind ID) { TokenID = ID; }
>
>   /// getPPKeywordID - Return the preprocessor keyword ID for this identifier.
> Index: include/clang/Lex/Token.h
> ===================================================================
> --- include/clang/Lex/Token.h	(revision 47480)
> +++ include/clang/Lex/Token.h	(working copy)
> @@ -36,7 +36,7 @@
>
>   /// Kind - The actual flavor of token this is.
>   ///
> -  tok::TokenKind Kind : 8;
> +  unsigned Kind : 8;  // DON'T make Kind a 'tok::TokenKind'; MSVC will treat it as a signed char and TokenKinds > 127 won't be handled correctly.
>
>   /// Flags - Bits we track about this token, members of the TokenFlags enum.
>   unsigned Flags : 8;
> @@ -50,7 +50,7 @@
>     NeedsCleaning = 0x08   // Contained an escaped newline or trigraph.
>   };
>
> -  tok::TokenKind getKind() const { return Kind; }
> +  tok::TokenKind getKind() const { return (tok::TokenKind)Kind; }
>   void setKind(tok::TokenKind K) { Kind = K; }
>
>   /// is/isNot - Predicates to check if this token is a specific kind, as in
> @@ -66,7 +66,7 @@
>   void setLocation(SourceLocation L) { Loc = L; }
>   void setLength(unsigned Len) { Length = Len; }
>
> -  const char *getName() const { return tok::getTokenName(Kind); }
> +  const char *getName() const { return tok::getTokenName((tok::TokenKind)Kind); }
>
>   /// startToken - Reset all flags to cleared.
>   ///
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
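
The sign-extension problem and the workaround described in the quoted
message can be reproduced without MSVC. The sketch below is only an
illustration, not code from the patch: the enumerator values, the class
names SignedField and UnsignedField, and the two readBack helpers are
hypothetical. SignedField uses a 'signed int' bit-field to model what the
message says MSVC does with 'TokenKind Kind : 8'; UnsignedField follows
the pattern used in the patch (unsigned storage, cast in the getter).

```cpp
// Hypothetical enum with one enumerator above 127.
enum TokenKind {
  tk_first = 0,
  tk_small = 100,
  tk_large = 200
};

// Models a compiler that treats the 8-bit enum field as signed.
class SignedField {
  signed int Kind : 8;  // stand-in for 'TokenKind Kind : 8' under MSVC

public:
  // Returns the raw stored value so the sign is visible to the caller.
  int getRaw() const { return Kind; }
  // Values above 127 do not fit in a signed 8-bit field; before C++20 the
  // stored result is implementation-defined (typically value - 256).
  void setKind(TokenKind K) { Kind = static_cast<int>(K); }
};

// Pattern from the patch: store as unsigned, cast back in the getter.
class UnsignedField {
  unsigned Kind : 8;  // not TokenKind

public:
  TokenKind getKind() const { return static_cast<TokenKind>(Kind); }
  void setKind(TokenKind K) { Kind = K; }
};

// Store K in the signed field and read the raw value back.
int readBackSigned(TokenKind K) {
  SignedField F;
  F.setKind(K);
  return F.getRaw();
}

// Store K in the unsigned field and read it back as a TokenKind.
TokenKind readBackUnsigned(TokenKind K) {
  UnsignedField F;
  F.setKind(K);
  return F.getKind();
}
```

On the usual two's-complement compilers, readBackSigned(tk_large) should
come back negative, which is the failure described for token kinds of 128
and above, while readBackUnsigned(tk_large) returns tk_large. Values that
fit in 7 bits, such as tk_small, survive either way.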



