[cfe-dev] [PATCH] C++0x unicode string and character literals now with test cases
Chris Lattner
clattner at apple.com
Mon Jul 25 23:14:03 PDT 2011
On Jul 25, 2011, at 10:52 PM, Craig Topper wrote:
> Doh! I accidentally warned on UCNs larger than 0xffff in UTF-32
> character literals. This patch fixes that.
Hi Craig,
Doug is definitely the best one to review the semantics of this. Some minor comments:
IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
IdentifierInfo &II = get(Name);
+ assert(TokenCode < 512 && "TokenCode too large");
II.TokenID = TokenCode;
return II;
I would suggest instead:
IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
IdentifierInfo &II = get(Name);
II.TokenID = TokenCode;
+ assert(II.TokenID == TokenCode && "TokenCode too large");
return II;
to avoid tying the '9' bit bitfield size to the magic 512 constant.
class StringLiteral : public Expr {
...
unsigned ByteLength;
- bool IsWide;
+ StringKind Kind;
bool IsPascal;
unsigned NumConcatenated;
sizeof(StringKind) is likely to be 4, wasting space. I'd suggest making it an 8 bit bitfield or something.
In SemaDeclAttr etc:
- if (Str == 0 || Str->isWide()) {
+ if (Str == 0 || Str->getKind() != StringLiteral::Ascii) {
S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
<< "weakref" << 1;
I'd suggest introducing the proper helper methods and using:
- if (Str == 0 || Str->isWide()) {
+ if (Str == 0 || !Str->isAscii()) {
S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
<< "weakref" << 1;
In Lexer.cpp:
+ // treat U like the start of an identifier.
+ goto StartIdentifier;
Instead of doing this, I'd replace the gotos with just "return LexIdentifier(Result, CurPtr);" since "MIOpt.ReadToken" has already been done.
I would suggest fusing IsIdentifierL and IsIdentifierUTFStringPrefix into one function.
Otherwise, LGTM!
-Chris
More information about the cfe-dev
mailing list