[cfe-commits] r68076 - in /cfe/trunk:	include/clang/Basic/DiagnosticLexKinds.td	lib/Lex/LiteralSupport.cpp test/Sema/ucn-cstring.c
    Eli Friedman 
    eli.friedman at gmail.com
       
    Mon Mar 30 17:09:39 PDT 2009
    
    
  
On Mon, Mar 30, 2009 at 4:46 PM, Steve Naroff <snaroff at apple.com> wrote:
> +  // Check UCN constraints (C99 6.4.3p2)
> +  if ((UcnVal < 0xa0 &&
> +      (UcnVal != 0x24 && UcnVal != 0x40 && UcnVal != 0x60 )) // $, @, `
> +      || (UcnVal >= 0xD800 && UcnVal <= 0xDFFF)) {
> +    PP.Diag(Loc, diag::err_ucn_escape_invalid);
> +    HadError = 1;
> +    return;
> +  }
This should also check that UcnVal is less that 0x110000.
> +  // Now that we've parsed/checked the UCN, we convert from UTF32->UTF8.
> +  // The conversion below was inspired by:
> +  //   http://www.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c
> +  // First, we determine how many bytes the result will require.
> +  typedef unsigned char UTF8;
> +
> +  unsigned short bytesToWrite = 0;
> +  if (UcnVal < (UTF32)0x80)
> +    bytesToWrite = 1;
> +  else if (UcnVal < (UTF32)0x800)
> +    bytesToWrite = 2;
> +  else if (UcnVal < (UTF32)0x10000)
> +    bytesToWrite = 3;
> +  else
> +    bytesToWrite = 4;
> +
> +  // If the buffer isn't big enough, bail.
> +  if ((ResultBuf + bytesToWrite) >= ResultBufEnd) {
> +    PP.Diag(Loc, diag::err_ucn_escape_too_big);
> +    HadError = 1;
> +    return;
> +  }
This should be impossible: the largest UTF-8 character is 4 bytes, and
the smallest escape is 6 bytes.
> +  static const UTF8 firstByteMark[7] = {
> +    0x00, 0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC
> +  };
The last 3 elements of this array are unused.
> +  printf("%s (%d)\n", "hello \u2192 \u2603 \u2190 world", sizeof("hello \u2192 \u2603 \u2190 world"));
> +  printf("%s (%d)\n", "\U00010400\U0001D12B", sizeof("\U00010400\U0001D12B"));
It would be nice to add a compile-time test, like "int a[sizeof("hello
\u2192 \u2603 \u2190 world") == 24 ? 1 : -1];".
I'm assuming you're aware this doesn't handle wide strings correctly...
-Eli
    
    
More information about the cfe-commits
mailing list