[cfe-commits] r115743 - in /cfe/trunk: include/clang/Basic/DiagnosticLexKinds.td lib/Lex/LiteralSupport.cpp test/CodeGen/string-literal-short-wstring.c test/CodeGen/string-literal.c test/Lexer/c90.c test/Lexer/wchar.c
Nico Weber
thakis at chromium.org
Thu Oct 7 10:51:39 PDT 2010
Hi,
On Thu, Oct 7, 2010 at 8:41 AM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
> Good evening, Nico.
>
> I think something like the following would be better, e.g.:
>
> // RUN: %clang_cc1 -triple i686-pc-linux -emit-llvm %s -o - | FileCheck %s -check-prefix=W32
> // RUN: %clang_cc1 -triple i686-pc-win32 -emit-llvm %s -o - | FileCheck %s -check-prefix=W16
>
> // W32: private constant [12 x i8] c"A\00\00\00B\00\00\00\00\00\00\00"
> // W16: private constant [6 x i8] c"A\00B\00\00\00"
>
> Please see also http://llvm.org/docs/TestingGuide.html#FileCheck
I will look into this, thanks for the pointer.
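
For reference, the two expected strings in the suggested W32/W16 checks differ only in how many bytes each code unit occupies. Here is a minimal C++ sketch of that layout, assuming little-endian output and a terminating null unit; wideLiteralBytes is a hypothetical helper for illustration, not anything in clang:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not clang's API): serialize wide-string code units
// little-endian at the given wchar_t width, followed by a null terminator.
std::string wideLiteralBytes(const std::vector<uint32_t> &units,
                             unsigned width) {
  assert((width == 2 || width == 4) && "only 16- and 32-bit wchar_t handled");
  std::string out;
  auto emit = [&](uint32_t unit) {
    // Lowest byte first, matching the little-endian constants in the tests.
    for (unsigned i = 0; i < width; ++i)
      out.push_back(static_cast<char>((unit >> (8 * i)) & 0xFF));
  };
  for (uint32_t unit : units)
    emit(unit);
  emit(0); // terminating null code unit
  return out;
}
```

With the code units of L"AB" (0x41, 0x42), a width of 4 yields the 12 bytes in the W32 line and a width of 2 yields the 6 bytes in the W16 line.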
> It would be best if clang had an option to set the wchar_t size, rather than depending on the triple.
>
> P.S. Encoding both UTF-16 (\u) and UTF-32 (\U) in an "L" literal
> seems buggy for the W16 environment.
\u UCNs are always <= 0xffff, so their UTF-16 encoding is a single code
unit equal to the character itself. What does MSVC do for UCNs > 0xffff in
L literal strings? gcc converts to UTF-16 if -fshort-wchar is used, so I'm
emulating this for now. If you think it's better to do something else
on Windows, let me know what.
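
To make the conversion concrete, here is a minimal standalone C++ sketch of the UTF-16 arithmetic involved. encodeUtf16 is a hypothetical name for illustration, not a function in clang, and the sketch does not reject code points in the surrogate range:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not clang's API): encode one Unicode code point as
// UTF-16 code units.
std::vector<uint16_t> encodeUtf16(uint32_t codePoint) {
  assert(codePoint <= 0x10FFFF && "outside the Unicode range");
  // Values up to 0xFFFF fit in one code unit equal to the value itself.
  if (codePoint <= 0xFFFF)
    return {static_cast<uint16_t>(codePoint)};
  // Larger values leave 20 bits after subtracting 0x10000; the high 10 bits
  // go into the lead surrogate and the low 10 bits into the trail surrogate.
  codePoint -= 0x10000;
  return {static_cast<uint16_t>(0xD800 + (codePoint >> 10)),
          static_cast<uint16_t>(0xDC00 + (codePoint & 0x3FF))};
}
```

For \U00102030 this gives the pair 0xDBC8, 0xDC30, which written little-endian is the \C8\DB0\DC sequence expected by string-literal-short-wstring.c.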
Thanks,
Nico
>
>
> ...Takumi
>
> 2010/10/7 Nico Weber <thakis at chromium.org>:
>> On Thu, Oct 7, 2010 at 6:42 AM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
>>> Francois,
>>>
>>> I guess it is because wchar_t is 16 bits on win32.
>>> I saw the same result on mingw.
>>
>> That'll be my guess, too. D'oh!
>>
>>> It passes if -triple i686-linux is added to the cc1 invocation.
>>
>> What's the usual strategy to test platform-dependent codegen stuff? Is
>> adding a triple to the RUN line ok?
>>
>> Nico
>>
>>>
>>> ...Takumi
>>>
>>> 2010/10/7 Francois Pichet <pichet2000 at gmail.com>:
>>>> I don't know why, but the test fails on Windows.
>>>> Any idea?
>>>>
>>>> 1>Command 1: "FileCheck"
>>>> "D:\Dev\llvm\llvm_trunk\tools\clang\test\CodeGen\string-literal.c"
>>>> 1>Command 1 Result: 1
>>>> 1>Command 1 Output:
>>>> 1>Command 1 Stderr:
>>>> 1>D:\Dev\llvm\llvm_trunk\tools\clang\test\CodeGen\string-literal.c:11:12:
>>>> error: expected string not found in input
>>>> 1> // CHECK: private constant [12 x i8] c"A\00\00\00B\00\00\00\00\00\00\00"
>>>> 1> ^
>>>> 1><stdin>:7:1: note: scanning from here
>>>> 1>@.str = private constant [6 x i8] c"A\00B\00\00\00"
>>>> 1>^
>>>> 1><stdin>:8:10: note: possible intended match here
>>>> 1>@.str1 = private constant [10 x i8] c"4\12\00\00\0B\F0\10\00\00\00"
>>>> 1> ^
>>>> 1>--
>>>>
>>>> On Wed, Oct 6, 2010 at 12:57 AM, Nico Weber <nicolasweber at gmx.de> wrote:
>>>>> Author: nico
>>>>> Date: Tue Oct 5 23:57:26 2010
>>>>> New Revision: 115743
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=115743&view=rev
>>>>> Log:
>>>>> Add support for 4-byte UCNs like \U12345678. Warn about UCNs in c90 mode.
>>>>>
>>>>> Added:
>>>>> cfe/trunk/test/CodeGen/string-literal-short-wstring.c
>>>>> cfe/trunk/test/Lexer/wchar.c
>>>>> Modified:
>>>>> cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td
>>>>> cfe/trunk/lib/Lex/LiteralSupport.cpp
>>>>> cfe/trunk/test/CodeGen/string-literal.c
>>>>> cfe/trunk/test/Lexer/c90.c
>>>>>
>>>>> Modified: cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td?rev=115743&r1=115742&r2=115743&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td (original)
>>>>> +++ cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td Tue Oct 5 23:57:26 2010
>>>>> @@ -98,6 +98,10 @@
>>>>> def ext_string_too_long : Extension<"string literal of length %0 exceeds "
>>>>> "maximum length %1 that %select{C90|ISO C99|C++}2 compilers are required to "
>>>>> "support">, InGroup<OverlengthStrings>;
>>>>> +def warn_ucn_escape_too_large : ExtWarn<
>>>>> + "character unicode escape sequence too long for its type">;
>>>>> +def warn_ucn_not_valid_in_c89 : ExtWarn<
>>>>> + "unicode escape sequences are only valid in C99 or C++">;
>>>>>
>>>>> //===----------------------------------------------------------------------===//
>>>>> // PTH Diagnostics
>>>>>
>>>>> Modified: cfe/trunk/lib/Lex/LiteralSupport.cpp
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/LiteralSupport.cpp?rev=115743&r1=115742&r2=115743&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/lib/Lex/LiteralSupport.cpp (original)
>>>>> +++ cfe/trunk/lib/Lex/LiteralSupport.cpp Tue Oct 5 23:57:26 2010
>>>>> @@ -172,8 +172,8 @@
>>>>> SourceLocation Loc, Preprocessor &PP,
>>>>> bool wide,
>>>>> bool Complain) {
>>>>> - // FIXME: Add a warning - UCN's are only valid in C++ & C99.
>>>>> - // FIXME: Handle wide strings.
>>>>> + if (!PP.getLangOptions().CPlusPlus && !PP.getLangOptions().C99)
>>>>> + PP.Diag(Loc, diag::warn_ucn_not_valid_in_c89);
>>>>>
>>>>> // Save the beginning of the string (for error diagnostics).
>>>>> const char *ThisTokBegin = ThisTokBuf;
>>>>> @@ -218,13 +218,34 @@
>>>>> }
>>>>> if (wide) {
>>>>> (void)UcnLenSave;
>>>>> - assert(UcnLenSave == 4 &&
>>>>> - "ProcessUCNEscape - only ucn length of 4 supported");
>>>>> - // little endian assumed.
>>>>> - *ResultBuf++ = (UcnVal & 0x000000FF);
>>>>> - *ResultBuf++ = (UcnVal & 0x0000FF00) >> 8;
>>>>> - *ResultBuf++ = (UcnVal & 0x00FF0000) >> 16;
>>>>> - *ResultBuf++ = (UcnVal & 0xFF000000) >> 24;
>>>>> + assert((UcnLenSave == 4 || UcnLenSave == 8) &&
>>>>> + "ProcessUCNEscape - only ucn length of 4 or 8 supported");
>>>>> +
>>>>> + if (!PP.getLangOptions().ShortWChar) {
>>>>> + // Note: our internal rep of wide char tokens is always little-endian.
>>>>> + *ResultBuf++ = (UcnVal & 0x000000FF);
>>>>> + *ResultBuf++ = (UcnVal & 0x0000FF00) >> 8;
>>>>> + *ResultBuf++ = (UcnVal & 0x00FF0000) >> 16;
>>>>> + *ResultBuf++ = (UcnVal & 0xFF000000) >> 24;
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + // Convert to UTF16.
>>>>> + if (UcnVal <= (UTF32)0xFFFF) {
>>>>> + *ResultBuf++ = (UcnVal & 0x000000FF);
>>>>> + *ResultBuf++ = (UcnVal & 0x0000FF00) >> 8;
>>>>> + return;
>>>>> + }
>>>>> + PP.Diag(Loc, diag::warn_ucn_escape_too_large);
>>>>> +
>>>>> + typedef uint16_t UTF16;
>>>>> + UcnVal -= 0x10000;
>>>>> + UTF16 surrogate1 = 0xD800 + (UcnVal >> 10);
>>>>> + UTF16 surrogate2 = 0xDC00 + (UcnVal & 0x3FF);
>>>>> + *ResultBuf++ = (surrogate1 & 0x000000FF);
>>>>> + *ResultBuf++ = (surrogate1 & 0x0000FF00) >> 8;
>>>>> + *ResultBuf++ = (surrogate2 & 0x000000FF);
>>>>> + *ResultBuf++ = (surrogate2 & 0x0000FF00) >> 8;
>>>>> return;
>>>>> }
>>>>> // Now that we've parsed/checked the UCN, we convert from UTF32->UTF8.
>>>>>
>>>>> Added: cfe/trunk/test/CodeGen/string-literal-short-wstring.c
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/string-literal-short-wstring.c?rev=115743&view=auto
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/CodeGen/string-literal-short-wstring.c (added)
>>>>> +++ cfe/trunk/test/CodeGen/string-literal-short-wstring.c Tue Oct 5 23:57:26 2010
>>>>> @@ -0,0 +1,14 @@
>>>>> +// RUN: %clang_cc1 -emit-llvm -fshort-wchar %s -o - | FileCheck %s
>>>>> +
>>>>> +int main() {
>>>>> + // This should convert to utf8.
>>>>> + // CHECK: internal constant [10 x i8] c"\E1\84\A0\C8\A0\F4\82\80\B0\00", align 1
>>>>> + char b[10] = "\u1120\u0220\U00102030";
>>>>> +
>>>>> + // CHECK: private constant [6 x i8] c"A\00B\00\00\00"
>>>>> + void *foo = L"AB";
>>>>> +
>>>>> + // This should convert to utf16.
>>>>> + // CHECK: private constant [10 x i8] c" \11 \02\C8\DB0\DC\00\00"
>>>>> + void *bar = L"\u1120\u0220\U00102030";
>>>>> +}
>>>>>
>>>>> Modified: cfe/trunk/test/CodeGen/string-literal.c
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/string-literal.c?rev=115743&r1=115742&r2=115743&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/CodeGen/string-literal.c (original)
>>>>> +++ cfe/trunk/test/CodeGen/string-literal.c Tue Oct 5 23:57:26 2010
>>>>> @@ -1,7 +1,16 @@
>>>>> -// RUN: %clang_cc1 -emit-llvm %s -o -
>>>>> +// RUN: %clang_cc1 -emit-llvm %s -o - | FileCheck %s
>>>>>
>>>>> int main() {
>>>>> + // CHECK: internal constant [10 x i8] c"abc\00\00\00\00\00\00\00", align 1
>>>>> char a[10] = "abc";
>>>>>
>>>>> + // This should convert to utf8.
>>>>> + // CHECK: internal constant [10 x i8] c"\E1\84\A0\C8\A0\F4\82\80\B0\00", align 1
>>>>> + char b[10] = "\u1120\u0220\U00102030";
>>>>> +
>>>>> + // CHECK: private constant [12 x i8] c"A\00\00\00B\00\00\00\00\00\00\00"
>>>>> void *foo = L"AB";
>>>>> +
>>>>> + // CHECK: private constant [12 x i8] c"4\12\00\00\0B\F0\10\00\00\00\00\00"
>>>>> + void *bar = L"\u1234\U0010F00B";
>>>>> }
>>>>>
>>>>> Modified: cfe/trunk/test/Lexer/c90.c
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/c90.c?rev=115743&r1=115742&r2=115743&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/Lexer/c90.c (original)
>>>>> +++ cfe/trunk/test/Lexer/c90.c Tue Oct 5 23:57:26 2010
>>>>> @@ -27,3 +27,7 @@
>>>>> "sdjflksdjf lksdjf skldfjsdkljflksdjf kldsjflkdsj fldks jflsdkjfds"
>>>>> "sdjflksdjf lksdjf skldfjsdkljflksdjf kldsjflkdsj fldks jflsdkjfds";
>>>>> }
>>>>> +
>>>>> +void test3() {
>>>>> + (void)L"\u1234"; // expected-error {{unicode escape sequences are only valid in C99 or C++}}
>>>>> +}
>>>>>
>>>>> Added: cfe/trunk/test/Lexer/wchar.c
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/wchar.c?rev=115743&view=auto
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/Lexer/wchar.c (added)
>>>>> +++ cfe/trunk/test/Lexer/wchar.c Tue Oct 5 23:57:26 2010
>>>>> @@ -0,0 +1,6 @@
>>>>> +// RUN: %clang_cc1 -fsyntax-only -fshort-wchar -verify %s
>>>>> +
>>>>> +void f() {
>>>>> + (void)L"\U00010000"; // expected-warning {{character unicode escape sequence too long for its type}}
>>>>> +}
>>>>> +
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-commits mailing list
>>>>> cfe-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>