[LLVMbugs] [Bug 24342] New: std::char_traits<char16_t>::eof() returns valid code unit
bugzilla-daemon at llvm.org
Mon Aug 3 08:16:26 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=24342
Bug ID: 24342
Summary: std::char_traits<char16_t>::eof() returns valid code unit
Product: libc++
Version: 3.6
Hardware: Macintosh
OS: All
Status: NEW
Severity: normal
Priority: P
Component: All Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: david_work at me.com
CC: llvmbugs at cs.uiuc.edu, mclow.lists at gmail.com
Classification: Unclassified
[char.traits.specializations.char16_t] §21.2.3.2/3 says,
"The member eof() shall return an implementation-defined constant that cannot
appear as a valid UTF-16 code unit."
In libc++ it returns 0xDFFF, which is a valid second half of a surrogate pair.
Surrogate pairs are only needed outside the basic multilingual plane, so it
won't often be seen, but characters like U+123FF are valid and are encoded with
0xDFFF as the trailing code unit (U+123FF is 0xD808 0xDFFF in UTF-16).
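The surrogate arithmetic behind that example can be sketched as follows. This is only an illustration: `SurrogatePair` and `to_surrogates` are hypothetical names, not part of libc++ or the standard library.

```cpp
#include <cassert>

// Hypothetical type for illustration: one UTF-16 surrogate pair.
struct SurrogatePair {
    char16_t high;  // leading code unit, 0xD800..0xDBFF
    char16_t low;   // trailing code unit, 0xDC00..0xDFFF
};

// Hypothetical helper: split a supplementary-plane code point
// (U+10000..U+10FFFF) into its UTF-16 surrogate pair.
inline SurrogatePair to_surrogates(char32_t cp) {
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    const char32_t v = cp - 0x10000;  // 20-bit offset into the supplementary planes
    SurrogatePair p;
    p.high = static_cast<char16_t>(0xD800 + (v >> 10));    // top 10 bits
    p.low  = static_cast<char16_t>(0xDC00 + (v & 0x3FF));  // bottom 10 bits
    return p;
}
```

For U+123FF the offset is 0x23FF, so the pair is 0xD808 followed by 0xDFFF; any code point whose low ten offset bits are all ones ends in 0xDFFF.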
On the other hand, U+FFFF is a "noncharacter," "intended for process-internal
uses," as is the preceding code point U+FFFE, the byte-swapped form of the byte
order mark U+FEFF. (http://unicode.org/charts/PDF/UFFF0.pdf) U+FFFF is used by most
other environments: it is the value returned under libstdc++, and it coincides with
WEOF when wchar_t is UTF-16.
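A minimal sketch of how the collision could be observed from user code, using only the standard `std::char_traits<char16_t>` members `eof()`, `to_int_type()` and `eq_int_type()`. The helper names (`is_low_surrogate`, `looks_like_eof`, `checked_to_int`) are assumptions made for illustration, and the result for 0xDFFF depends on which standard library is in use.

```cpp
#include <cassert>
#include <string>

using traits16 = std::char_traits<char16_t>;

// Hypothetical helper: true if v is a trailing (low) surrogate code unit.
constexpr bool is_low_surrogate(traits16::int_type v) {
    return v >= 0xDC00 && v <= 0xDFFF;
}

// Hypothetical helper: true if the code unit c, once converted to int_type,
// compares equal to eof(). With eof() == 0xDFFF this is true for the
// trailing unit of U+123FF; with eof() == 0xFFFF it is true only for the
// noncharacter U+FFFF.
inline bool looks_like_eof(char16_t c) {
    return traits16::eq_int_type(traits16::to_int_type(c), traits16::eof());
}

// Hypothetical wrapper: convert a code unit, asserting it cannot be
// mistaken for end-of-file by code that compares against eof().
inline traits16::int_type checked_to_int(char16_t c) {
    assert(!looks_like_eof(c) && "code unit collides with eof()");
    return traits16::to_int_type(c);
}
```

Calling `looks_like_eof(u'\xDFFF')` and `is_low_surrogate(traits16::eof())` on a given implementation shows whether that implementation's `eof()` can be confused with real text.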