[cfe-dev] Wide strings and clang::StringLiteral.
Gordon Henriksen
gordonhenriksen at me.com
Fri Dec 5 11:00:35 PST 2008
On Dec 5, 2008, at 13:48, Eli Friedman wrote:
> On Fri, Dec 5, 2008 at 4:41 AM, Neil Booth <neil at daikokuya.co.uk>
> wrote:
>> so why not just require ASCII supersets like
>> the standard does (for ASCII hosts)? Then your caret diagnostics
>> keep working too, and special-casing the extra characters is
>> straightforward, even for SJIS.
>
> The issue with SJIS in particular is that sometimes ASCII bytes don't
> actually represent ASCII. Although, looking at the character set more
> carefully, it looks like that doesn't actually affect the lexer unless
> we allow Japanese characters in identifiers... that's kind of nice.
>
> I don't see where the standard requires an ASCII superset; it
> certainly requires a lot of characters from ASCII, but EBCDIC, for
> example, appears to be a legal source character set. Oddly, though,
> UTF-16 appears to be an illegal source character set... that seems
> slightly strange to me, since nothing really depends on the source
> character set.
UTF-16 sources are not unheard of in Windows-only code. cl handles
them transparently, so it might be unwise to make any design decisions
which preclude them, regardless of what the standard says on the matter.
cl supports encoding sniffing to accomplish this, since system headers
are in ASCII.
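
For illustration, here is a rough sketch of what per-file sniffing could
look like, assuming detection is done purely from a leading byte order
mark. That is an assumption on my part, not a description of what cl
actually does, and the names SourceEncoding and sniffEncoding are
hypothetical rather than anything from clang or cl. UTF-32 and BOM-less
UTF-16 are deliberately not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type; not part of clang or cl.
enum class SourceEncoding {
  Utf8WithBom,
  Utf16LittleEndian,
  Utf16BigEndian,
  AssumedAsciiSuperset // no BOM found: treat as ASCII-compatible bytes
};

// Inspect the first bytes of a source buffer for a byte order mark.
// Assumption: a BOM is the only signal used; files without one fall
// through to the ASCII-superset case, which covers plain-ASCII headers.
SourceEncoding sniffEncoding(const std::string &bytes) {
  auto at = [&](std::size_t i) {
    return static_cast<unsigned char>(bytes[i]);
  };
  if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
    return SourceEncoding::Utf8WithBom;
  if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
    return SourceEncoding::Utf16LittleEndian;
  if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
    return SourceEncoding::Utf16BigEndian;
  return SourceEncoding::AssumedAsciiSuperset;
}
```

Because the check runs per file, a UTF-16 source could still include
ASCII system headers: each buffer would be sniffed separately before it
reaches the lexer.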
— Gordon