[cfe-commits] [patch] Unicode character literals for UTF-8 source encoding

Seth Cantrell seth.cantrell at gmail.com
Wed Jan 11 20:35:29 PST 2012


Alright, character literals now reject any character whose encoding can't be represented as a single value of the literal's type.

So now '\u2031' is not allowed (not even in C, where the literal has type int, which could represent the value), and L'\U00010000' is not allowed either. Replacing these UCNs with the actual characters results in exactly the same behavior.
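
To make the rule concrete, here is a rough sketch of the check as I'd describe it. It is not the code from the patch. The helper names are made up for illustration, and the 16-bit case assumes a target where wchar_t is a 16-bit UTF-16 code unit, which is an assumption on my part rather than something the patch states.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true when cp is a Unicode
// scalar value, i.e. at most U+10FFFF and not a surrogate.
constexpr bool isScalarValue(std::uint32_t cp) {
  return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical helper (not from the patch): true when cp encodes to exactly
// one code unit of the given width (8 -> UTF-8, 16 -> UTF-16, 32 -> UTF-32).
// Any other width is treated as unsupported and yields false.
constexpr bool fitsInOneCodeUnit(std::uint32_t cp, unsigned unitBits) {
  return isScalarValue(cp) &&
         (unitBits == 8    ? cp <= 0x7F     // UTF-8: only ASCII is one byte
          : unitBits == 16 ? cp <= 0xFFFF   // UTF-16: BMP only, no surrogate pairs
                           : unitBits == 32); // UTF-32: every scalar value fits
}

// '\u2031' needs three UTF-8 bytes, so it does not fit in a single char.
static_assert(!fitsInOneCodeUnit(0x2031, 8), "U+2031 is multi-byte in UTF-8");
// U+10000 needs a surrogate pair in UTF-16 (assumed 16-bit wchar_t).
static_assert(!fitsInOneCodeUnit(0x10000, 16), "U+10000 needs a surrogate pair");
// The same code point fits in one 32-bit unit.
static_assert(fitsInOneCodeUnit(0x10000, 32), "U+10000 fits in UTF-32");
```

The point of the sketch is that the decision depends on the encoded form, not on whether the literal's type could hold the raw code point value, which is why '\u2031' is rejected even in C.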

- Seth


On Jan 10, 2012, at 3:59 PM, Eli Friedman wrote:

> On Tue, Jan 10, 2012 at 4:05 AM, Seth Cantrell <seth.cantrell at gmail.com> wrote:
>> whoops, that should be "anything that indicates '\U0010FFFD' isn't perfectly valid"
>> 
>> Accepting larger Unicode escapes is not new with this patch. I tried the clang installed with Xcode 4.2, Apple clang version 3.0 (tags/Apple/clang-211.12) (based on LLVM 3.0svn), and `int i = '\U001F306';` gives i the value 0x001F306. Although I don't have a use case or anything, my preference is to allow the larger Unicode escapes.
>> 
>> If you want them excluded just let me know the ranges.
> 
> Accepting it and doing something different from gcc seems likely to
> cause issues if someone is accidentally depending on gcc's behavior.
> I think we should either reject it or do the same thing as gcc.
> 
> -Eli
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Improves-support-for-Unicode-in-character-literals.patch
Type: application/octet-stream
Size: 13345 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120111/21318f73/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fix-char-literal-types-in-C.patch
Type: application/octet-stream
Size: 1501 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120111/21318f73/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-stop-claiming-unicode-escape-sequences-are-too-long-.patch
Type: application/octet-stream
Size: 862 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120111/21318f73/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Add-and-update-tests-for-character-literals.patch
Type: application/octet-stream
Size: 7493 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120111/21318f73/attachment-0003.obj>

