[cfe-dev] [PATCH] C++0x unicode string and character literals now with test cases

Tue Jul 26 23:07:49 PDT 2011

Here's the patch to update the support page.

On Tue, Jul 26, 2011 at 10:41 PM, Douglas Gregor <dgregor at apple.com> wrote:
>
> On Jul 26, 2011, at 8:39 AM, Craig Topper wrote:
>
>> Here's a new patch addressing Chris' suggestions.
>
> Great! I've committed this as r136210, thanks so much!
>
>        - Doug
>
>>
>> On Mon, Jul 25, 2011 at 11:14 PM, Chris Lattner <clattner at apple.com> wrote:
>>>
>>> On Jul 25, 2011, at 10:52 PM, Craig Topper wrote:
>>>
>>>> Doh! I accidentally warned on UCNs larger than 0xffff in UTF-32
>>>> character literals. This patch fixes that.
>>>
>>> Hi Craig,
>>>
>>> Doug is definitely the best one to review the semantics of this.  Some minor comments:
>>>
>>>   IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
>>>     IdentifierInfo &II = get(Name);
>>> +    assert(TokenCode < 512 && "TokenCode too large");
>>>     II.TokenID = TokenCode;
>>>     return II;
>>>
>>> I would suggest instead:
>>>
>>>   IdentifierInfo &get(StringRef Name, tok::TokenKind TokenCode) {
>>>     IdentifierInfo &II = get(Name);
>>>     II.TokenID = TokenCode;
>>> +    assert(II.TokenID == TokenCode && "TokenCode too large");
>>>     return II;
>>>
>>> to avoid tying the '9' bit bitfield size to the magic 512 constant.
>>>
>>>
>>>  class StringLiteral : public Expr {
>>> ...
>>>   unsigned ByteLength;
>>> -  bool IsWide;
>>> +  StringKind Kind;
>>>   bool IsPascal;
>>>   unsigned NumConcatenated;
>>>
>>> sizeof(StringKind) is likely to be 4, wasting space.  I'd suggest making it an 8 bit bitfield or something.
>>>
>>>
>>> In SemaDeclAttr etc:
>>>
>>> -    if (Str == 0 || Str->isWide()) {
>>> +    if (Str == 0 || Str->getKind() != StringLiteral::Ascii) {
>>>       S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
>>>           << "weakref" << 1;
>>>
>>> I'd suggest introducing the proper helper methods and using:
>>>
>>> -    if (Str == 0 || Str->isWide()) {
>>> +    if (Str == 0 || !Str->isAscii()) {
>>>       S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
>>>           << "weakref" << 1;
>>>
>>> In Lexer.cpp:
>>>
>>> +    // treat U like the start of an identifier.
>>> +    goto StartIdentifier;
>>>
>>> Instead of doing this, I'd replace the gotos with just "return LexIdentifier(Result, CurPtr);" since "MIOpt.ReadToken" has already been done.
>>>
>>> I would suggest fusing IsIdentifierL and IsIdentifierUTFStringPrefix into one function.
>>>
>>> Otherwise, LGTM!
>>>
>>> -Chris
>>>
>>>
>>>
>> <unicode_strings.patch>
>
>

-- 
~Craig
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicode_support.patch
Type: application/octet-stream
Size: 471 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20110726/a8a5f959/attachment.obj>