[cfe-dev] Almost there...

Eli Friedman eli.friedman at gmail.com
Sun Jun 7 02:04:26 PDT 2009


On Sun, Jun 7, 2009 at 1:18 AM, Neil Booth <neil at daikokuya.co.uk> wrote:
>> > Something else to think about: how you track source locations if you
>> > iconv the whole file upfront.
>>
>> Source locations ought to just point into the converted buffer, I
>> think; we don't need to know the byte offsets in the original file.
>
> If you're going to quote the source then you'll need to convert
> back again - someone using an ISO-8859 terminal or Japanese terminal
> won't want mangled UTF-8 diagnostics.

We need to do this anyway: our localized Japanese diagnostics
(assuming we get some at some point) will most likely be stored in
UTF-8.  Also, the terminal doesn't necessarily use the same charset as
the source file; the source charset can be overridden with
-finput-charset (or at least, that's the intention).
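To make the two conversions concrete, here is a rough sketch, not
actual clang code. The function names are hypothetical, and Python's
built-in codecs stand in for whatever conversion layer would really be
used. The input charset is applied once on the way in, and the
terminal charset is applied independently on the way out:

```python
def to_internal(raw, input_charset="utf-8"):
    """Decode source bytes using the (possibly overridden) input charset."""
    return raw.decode(input_charset)


def render_for_terminal(message, terminal_charset):
    """Encode a diagnostic for the terminal.

    Characters the terminal charset cannot represent are replaced with
    '?' instead of raising, so a diagnostic is always printable.
    """
    return message.encode(terminal_charset, errors="replace")


if __name__ == "__main__":
    source = to_internal(b"caf\xe9", "iso-8859-1")
    print(render_for_terminal(source, "utf-8"))
```

The point of the sketch is only that the two charsets are independent
parameters; replacing unrepresentable characters with '?' is one
possible policy, assumed here for illustration.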

> Charset conversion is not
> reversible in general, whether that's a practical issue is not
> clear.

I don't think that's an issue in practice; any reasonable charset can
be mapped to Unicode.
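If someone wants to check this for a particular file, a minimal sketch
along these lines would do. `round_trips` is a made-up helper, and
Python's codecs are only a stand-in for the real conversion library:

```python
def round_trips(raw, charset):
    """Return True if raw decodes in charset and re-encodes to the same bytes."""
    try:
        text = raw.decode(charset)
        return text.encode(charset) == raw
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Bytes that are invalid in the charset cannot round-trip at all.
        return False


if __name__ == "__main__":
    print(round_trips(bytes(range(256)), "iso-8859-1"))
    print(round_trips(b"\xff", "utf-8"))
```

A False result would flag exactly the files where quoting the original
source from the converted buffer could differ from what is on disk.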

> Apple's "interesting" decision to encode their headers in neither
> ASCII nor UTF-8 will have implications too.

We probably won't bother to warn for invalid UTF-8 sequences in
comments; are there other places where there are issues?
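For reference, locating invalid sequences is cheap. This is a sketch
only: `invalid_utf8_offsets` is a hypothetical name, and deciding
whether an offset falls inside a comment would need the lexer, which
is not shown here:

```python
def invalid_utf8_offsets(buf):
    """Return the byte offsets where invalid UTF-8 sequences start."""
    offsets = []
    pos = 0
    while pos < len(buf):
        try:
            buf[pos:].decode("utf-8")
            break  # the rest of the buffer is valid
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            offsets.append(pos + err.start)
            pos += err.end
    return offsets


if __name__ == "__main__":
    print(invalid_utf8_offsets(b"int x; /* \xff */"))
```

A caller could then drop any offsets that land inside comment tokens
before emitting warnings.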

-Eli


