[cfe-dev] C99/C++ UCN (Universal Character Name) Support

steve naroff snaroff at apple.com
Fri Mar 27 18:16:02 PDT 2009


On Mar 27, 2009, at 9:02 PM, Eli Friedman wrote:

> On Fri, Mar 27, 2009 at 5:45 PM, steve naroff <snaroff at apple.com> wrote:
>> Part of implementing this is converting UTF-16 (\u) and UTF-32 (\U) to
>> UTF-8 (for insertion into a C-string, say).
>
> It's not very hard; one version of the formula is available at
> http://en.wikipedia.org/wiki/UTF-8.  And UTF-16 isn't really relevant
> here; \u denotes a Unicode code point, not a UTF-16 code unit.

>> Unfortunately, Unix doesn't appear to have any standard support for
>> this type of conversion (which surprised me).
>
> You could use iconv, although that's overkill here...
>

I agree that iconv is overkill here, though I believe it is what GCC uses.

One of the Unicode guys within Apple pointed me to...

http://www.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c

...which looks good to me.
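
For reference, the code-point-to-UTF-8 formula Eli mentions can be sketched
roughly as below. This is only an illustration, not what ConvertUTF.c or
clang actually does: the name encode_utf8 and its signature are made up
here, and it assumes the \u or \U escape has already been parsed into a
single 32-bit code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the ConvertUTF.c API): encode one Unicode code
 * point as UTF-8. Returns the number of bytes written to out (1 to 4), or
 * 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not code points to encode */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* outside the Unicode range */
}
```

With this sketch, \u00E9 would come out as the two bytes C3 A9 and \u20AC
as the three bytes E2 82 AC. The C99/C++ restrictions on which UCNs are
allowed in which contexts would still need to be checked separately.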

snaroff

> -Eli



