[cfe-dev] [PATCH] C++0x unicode string and character literals now with test cases
Douglas Gregor
dgregor at apple.com
Wed Jul 27 08:22:49 PDT 2011
On Jul 26, 2011, at 11:07 PM, Craig Topper wrote:
> Here's the patch to update the support page.
r136216, thanks!
- Doug
> On Tue, Jul 26, 2011 at 10:41 PM, Douglas Gregor <dgregor at apple.com> wrote:
>>
>> On Jul 26, 2011, at 8:39 AM, Craig Topper wrote:
>>
>>> Here's a new patch addressing Chris' suggestions.
>>
>> Great! I've committed this as r136210, thanks so much!
>>
>> - Doug
>>
>>>
>>> On Mon, Jul 25, 2011 at 11:14 PM, Chris Lattner <clattner at apple.com> wrote:
>>>>
>>>> On Jul 25, 2011, at 10:52 PM, Craig Topper wrote:
>>>>
>>>>> Doh! I accidentally warned on UCNs larger than 0xffff in UTF-32
>>>>> character literals. This patch fixes that.
>>>>
>>>> Hi Craig,
>>>>
>>>> Doug is definitely the best one to review the semantics of this. Some minor comments:
>>>>
>>>> IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
>>>> IdentifierInfo &II = get(Name);
>>>> + assert(TokenCode < 512 && "TokenCode too large");
>>>> II.TokenID = TokenCode;
>>>> return II;
>>>>
>>>> I would suggest instead:
>>>>
>>>> IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
>>>> IdentifierInfo &II = get(Name);
>>>> II.TokenID = TokenCode;
>>>> + assert(II.TokenID == TokenCode && "TokenCode too large");
>>>> return II;
>>>>
>>>> to avoid tying the '9' bit bitfield size to the magic 512 constant.
>>>>
>>>>
>>>> class StringLiteral : public Expr {
>>>> ...
>>>> unsigned ByteLength;
>>>> - bool IsWide;
>>>> + StringKind Kind;
>>>> bool IsPascal;
>>>> unsigned NumConcatenated;
>>>>
>>>> sizeof(StringKind) is likely to be 4, wasting space. I'd suggest making it an 8 bit bitfield or something.
>>>>
>>>>
>>>> In SemaDeclAttr etc:
>>>>
>>>> - if (Str == 0 || Str->isWide()) {
>>>> + if (Str == 0 || Str->getKind() != StringLiteral::Ascii) {
>>>> S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
>>>> << "weakref" << 1;
>>>>
>>>> I'd suggest introducing the proper helper methods and using:
>>>>
>>>> - if (Str == 0 || Str->isWide()) {
>>>> + if (Str == 0 || !Str->isAscii()) {
>>>> S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
>>>> << "weakref" << 1;
>>>>
>>>> In Lexer.cpp:
>>>>
>>>> + // treat U like the start of an identifier.
>>>> + goto StartIdentifier;
>>>>
>>>> Instead of doing this, I'd replace the gotos with just "return LexIdentifier(Result, CurPtr);" since "MIOpt.ReadToken" has already been done.
>>>>
>>>> I would suggest fusing IsIdentifierL and IsIdentifierUTFStringPrefix into one function.
>>>>
>>>> Otherwise, LGTM!
>>>>
>>>> -Chris
>>>>
>>>>
>>>>
>>> <unicode_strings.patch>
>>
>>
>
>
>
> --
> ~Craig
> <unicode_support.patch>
More information about the cfe-dev
mailing list