[cfe-commits] r68076 - in /cfe/trunk: include/clang/Basic/DiagnosticLexKinds.td lib/Lex/LiteralSupport.cpp test/Sema/ucn-cstring.c

steve naroff snaroff at apple.com
Tue Mar 31 03:33:15 PDT 2009


Eli,

Thanks for your quick review!

I made the changes in http://llvm.org/viewvc/llvm-project?rev=68107&view=rev 
.

Only two minor comments below...

On Mar 30, 2009, at 8:09 PM, Eli Friedman wrote:

> On Mon, Mar 30, 2009 at 4:46 PM, Steve Naroff <snaroff at apple.com>  
> wrote:
>> +  // Check UCN constraints (C99 6.4.3p2)
>> +  if ((UcnVal < 0xa0 &&
>> +      (UcnVal != 0x24 && UcnVal != 0x40 && UcnVal != 0x60 )) // $,  
>> @, `
>> +      || (UcnVal >= 0xD800 && UcnVal <= 0xDFFF)) {
>> +    PP.Diag(Loc, diag::err_ucn_escape_invalid);
>> +    HadError = 1;
>> +    return;
>> +  }
>
> This should also check that UcnVal is less that 0x110000.
>
>> +  // Now that we've parsed/checked the UCN, we convert from UTF32- 
>> >UTF8.
>> +  // The conversion below was inspired by:
>> +  //   http://www.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c
>> +  // First, we determine how many bytes the result will require.
>> +  typedef unsigned char UTF8;
>> +
>> +  unsigned short bytesToWrite = 0;
>> +  if (UcnVal < (UTF32)0x80)
>> +    bytesToWrite = 1;
>> +  else if (UcnVal < (UTF32)0x800)
>> +    bytesToWrite = 2;
>> +  else if (UcnVal < (UTF32)0x10000)
>> +    bytesToWrite = 3;
>> +  else
>> +    bytesToWrite = 4;
>> +
>> +  // If the buffer isn't big enough, bail.
>> +  if ((ResultBuf + bytesToWrite) >= ResultBufEnd) {
>> +    PP.Diag(Loc, diag::err_ucn_escape_too_big);
>> +    HadError = 1;
>> +    return;
>> +  }
>
> This should be impossible: the largest UTF-8 character is 4 bytes, and
> the smallest escape is 6 bytes.
>
>> +  static const UTF8 firstByteMark[7] = {
>> +    0x00, 0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC
>> +  };
>
> The last 3 elements of this array are unused.

I believe only the last 2 elements are unused.

>
>
>> +  printf("%s (%d)\n", "hello \u2192 \u2603 \u2190 world",  
>> sizeof("hello \u2192 \u2603 \u2190 world"));
>> +  printf("%s (%d)\n", "\U00010400\U0001D12B",  
>> sizeof("\U00010400\U0001D12B"));
>
> It would be nice to add a compile-time test, like "int a[sizeof("hello
> \u2192 \u2603 \u2190 world") == 24 ? 1 : -1];".
>
> I'm assuming you're aware this doesn't handle wide strings  
> correctly...

Yes. I added a FIXME for now...

snaroff

>
>
> -Eli

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20090331/d5276344/attachment.html>


More information about the cfe-commits mailing list