[cfe-dev] [libc++] using std::wstring_convert and std::codecvt_utf16 to convert const char16_t* to wstring
Howard Hinnant
hhinnant at apple.com
Fri Dec 16 16:13:41 PST 2011
On Dec 16, 2011, at 6:09 PM, Ryan Ericson wrote:
> Hi,
>
> I'm learning about unicode support in C++ and I'm trying to convert
> const char16_t* (UTF-16) to wstring (UCS4). Reading the standard (and
> if I understand it correctly), it can be done through
> std::codecvt_utf16:
>
> "For the facet codecvt_utf16:
> — The facet shall convert between UTF-16 multibyte sequences and UCS2
> or UCS4 (depending on the size of Elem) within the program."
>
> So I tried to use std::wstring_convert to do the conversion by doing
> the following:
>
> #include <iostream>
> #include <locale>
> #include <codecvt>
> #include <string>
>
> using namespace std;
>
> int main()
> {
>     u16string s;
>
>     s.push_back('h');
>     s.push_back('e');
>     s.push_back('l');
>     s.push_back('l');
>     s.push_back('o');
>
>     wstring_convert<codecvt_utf16<wchar_t>, wchar_t> conv;
>     wstring ws = conv.from_bytes(reinterpret_cast<const char*>(s.c_str()));
>
>     wcout << ws << endl;
>
>     return 0;
> }
>
> Note: the explicit push_backs are there to get around the fact that my
> version of clang (Xcode 4.2) doesn't have unicode string literals.
>
> When the code is run, I get a terminate exception from from_bytes. Am I
> misunderstanding something and doing something illegal here? I was
> thinking it should work because the const char* that I passed to
> wstring_convert is UTF-16 encoded. I have also considered endianness
> being the issue, but I have checked that it's not the case.
> If that is indeed not going to work, what would be the best approach
> to convert UTF-16 to UCS4 using standard C++11?
Cubbi beat me to figuring this out by 58 minutes. ;-)
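
For reference, here is a minimal sketch of one way to make the quoted
program work. It is not necessarily the fix Cubbi posted. It rests on two
facts about the standard library: from_bytes(const char*) treats its
argument as a null-terminated byte string, so it stops at the first zero
byte of the UTF-16 data, and codecvt_utf16 reads big-endian input unless
std::little_endian is requested. The helper name utf16_to_wide is
hypothetical, and the sketch assumes the code units are stored in host
byte order and that wchar_t is wide enough for the result (32 bits for
UCS4, as on Mac OS X and Linux).

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper, not from the thread: converts UTF-16 code units
// stored in host byte order into a wide string.
std::wstring utf16_to_wide(const std::u16string& s)
{
    // Detect the host byte order by inspecting the first byte of a
    // known 16-bit value.
    const char16_t probe = 1;
    const bool host_is_little =
        *reinterpret_cast<const char*>(&probe) == 1;

    // Pass an explicit byte range. The single-pointer overload of
    // from_bytes would stop at the first zero byte instead.
    const char* first = reinterpret_cast<const char*>(s.data());
    const char* last = first + s.size() * sizeof(char16_t);

    if (host_is_little) {
        // codecvt_utf16 defaults to big-endian input, so ask for
        // little-endian explicitly.
        std::wstring_convert<
            std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>,
            wchar_t> conv;
        return conv.from_bytes(first, last);
    }

    std::wstring_convert<std::codecvt_utf16<wchar_t>, wchar_t> conv;
    return conv.from_bytes(first, last);
}
```

In the quoted main(), the from_bytes line would become
wstring ws = utf16_to_wide(s);. Like the original call, this throws
std::range_error if the input is not valid UTF-16.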
Howard