[cfe-commits] [patch] Unicode character literals for UTF-8 source encoding
Eli Friedman
eli.friedman at gmail.com
Mon Jan 9 20:56:57 PST 2012
On Mon, Jan 9, 2012 at 8:05 PM, Seth Cantrell <seth.cantrell at gmail.com> wrote:
> Updated patches. There's an extra one for the change to ActOnCharacterConstant.
>
+ // FIXME: unify the logic for determining the type of the char literal
+ // instead of repeating it here and in ActOnCharacterConstant
+ int available_bits;
+ if (tok::wide_char_constant == Kind)
+   available_bits = PP.getTargetInfo().getWCharWidth();
+ else if (tok::utf16_char_constant == Kind)
+   available_bits = PP.getTargetInfo().getChar16Width();
+ else if (tok::utf32_char_constant == Kind)
+   available_bits = PP.getTargetInfo().getChar32Width();
+ else if (!PP.getLangOptions().CPlusPlus || isMultiChar())
+   available_bits = PP.getTargetInfo().getIntWidth();
+ else
+   available_bits = PP.getTargetInfo().getCharWidth();
Thinking about it a bit more, I'm still not sure this is what we
want to do: do we really want to allow '\U0010FFFD' in C? Strictly
speaking, the value of such a literal is implementation-defined, but I
don't think there's any precedent for the value we use with this
patch.
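
To make the concern concrete, here is a minimal standalone sketch (not
Clang code) of the width selection in the hunk above. The names
LiteralKind, TargetWidths, availableBits and fits are hypothetical, and
the widths are assumed values for a target with 8-bit char, 32-bit int
and 32-bit wchar_t; in the patch they come from TargetInfo.

```cpp
#include <cstdint>

// Hypothetical stand-in for the tok::*_char_constant token kinds.
enum class LiteralKind { Plain, Wide, Utf16, Utf32 };

// Assumed widths in bits; the real values are target-dependent.
struct TargetWidths {
  unsigned CharWidth = 8;
  unsigned IntWidth = 32;
  unsigned WCharWidth = 32;
  unsigned Char16Width = 16;
  unsigned Char32Width = 32;
};

// Mirrors the if/else chain in the quoted hunk: prefixed literals use
// the width of their character type; a plain literal uses int in C or
// when it is multi-character, and char otherwise.
unsigned availableBits(LiteralKind Kind, bool CPlusPlus, bool MultiChar,
                       const TargetWidths &T) {
  switch (Kind) {
  case LiteralKind::Wide:
    return T.WCharWidth;
  case LiteralKind::Utf16:
    return T.Char16Width;
  case LiteralKind::Utf32:
    return T.Char32Width;
  case LiteralKind::Plain:
    break;
  }
  return (!CPlusPlus || MultiChar) ? T.IntWidth : T.CharWidth;
}

// True if CodePoint is representable in Bits bits (unsigned check only).
bool fits(std::uint32_t CodePoint, unsigned Bits) {
  return Bits >= 32 || (CodePoint >> Bits) == 0;
}
```

Under these assumed widths, a plain '\U0010FFFD' gets 32 available bits
in C, so 0x10FFFD fits and the literal is accepted; in C++ the same
single-character literal gets 8 bits and does not fit.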
-Eli