[cfe-dev] [libc++] using std::wstring_convert and std::codecvt_utf16 to convert const char16_t* to wstring
Ryan Ericson
ryan.ericson at gmail.com
Fri Dec 16 15:09:42 PST 2011
Hi,
I'm learning about Unicode support in C++ and I'm trying to convert
a const char16_t* (UTF-16) to a wstring (UCS4). If I'm reading the
standard correctly, this can be done through
std::codecvt_utf16:
"For the facet codecvt_utf16:
— The facet shall convert between UTF-16 multibyte sequences and UCS2
or UCS4 (depending on the size of Elem) within the program."
So I tried to use std::wstring_convert to do the conversion by doing
the following:
#include <iostream>
#include <locale>
#include <codecvt>
#include <string>
using namespace std;
int main()
{
    u16string s;
    s.push_back('h');
    s.push_back('e');
    s.push_back('l');
    s.push_back('l');
    s.push_back('o');
    wstring_convert<codecvt_utf16<wchar_t>, wchar_t> conv;
    wstring ws = conv.from_bytes(reinterpret_cast<const char*> (s.c_str()));
    wcout << ws << endl;
    return 0;
}
Note: the explicit push_backs are there to get around the fact that my
version of clang (Xcode 4.2) doesn't support Unicode string literals.
When I run the code, from_bytes throws and the program terminates with
an uncaught exception. Am I
misunderstanding something and doing something illegal here? I was
thinking it should work because the const char* that I passed to
wstring_convert is UTF-16 encoded. I have also considered endianness
being the issue, but I have checked that it's not the case.
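Two details of the library may be worth ruling out here. The
from_bytes(const char*) overload treats its argument as a null-terminated
byte string, and every ASCII character encoded as UTF-16 contains a zero
byte, so the facet may see a truncated sequence. Also, codecvt_utf16 reads
big-endian input unless std::little_endian is given as its mode. The
following is only a sketch of a variant that avoids both questions by
building the byte string explicitly; to_utf16be_bytes and utf16_to_wstring
are hypothetical helper names, not part of the standard library.

```cpp
#include <codecvt>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: serialize UTF-16 code units as big-endian bytes,
// the byte order codecvt_utf16 assumes when no mode flag is passed.
std::string to_utf16be_bytes(const std::u16string& in)
{
    std::string bytes;
    bytes.reserve(in.size() * 2);
    for (std::size_t i = 0; i < in.size(); ++i) {
        bytes.push_back(static_cast<char>((in[i] >> 8) & 0xFF)); // high byte
        bytes.push_back(static_cast<char>(in[i] & 0xFF));        // low byte
    }
    return bytes;
}

// Hypothetical helper: convert UTF-16 to wstring through wstring_convert.
std::wstring utf16_to_wstring(const std::u16string& in)
{
    std::wstring_convert<std::codecvt_utf16<wchar_t>, wchar_t> conv;
    // The std::string overload uses the string's own length, so embedded
    // zero bytes do not cut the input short.
    return conv.from_bytes(to_utf16be_bytes(in));
}
```

An alternative that keeps the reinterpret_cast would be to pass
std::little_endian as the facet's third template argument on a
little-endian machine and call the two-pointer overload
from_bytes(first, last), so that the length is explicit.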
If that is indeed not going to work, what would be the best approach
to convert UTF-16 to UCS4 using standard C++11?
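For comparison, one fallback that does not depend on any codecvt facet is
to decode surrogate pairs by hand. This is a minimal sketch, not a
recommendation from the standard: utf16_to_ucs4 is a hypothetical name,
and replacing unpaired surrogates with U+FFFD is an assumption made for
illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode UTF-16 code units into UCS4 code points.
// Unpaired surrogates become U+FFFD (an assumption of this sketch).
std::u32string utf16_to_ucs4(const std::u16string& in)
{
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        if (c >= 0xD800 && c <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 < in.size() &&
                in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                char32_t lo = in[i + 1];
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                ++i; // consume the low surrogate as well
            } else {
                c = 0xFFFD; // high surrogate with no partner
            }
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            c = 0xFFFD; // stray low surrogate
        }
        out.push_back(c);
    }
    return out;
}
```

On platforms where wchar_t is 32 bits wide, the result can be copied into
a wstring with std::wstring(r.begin(), r.end()), where r is the returned
u32string.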
Thanks,
Ryan