[LLVMdev] request for windows unicode support

Jochen Wilhelmy j.wilhelmy at arcor.de
Fri Nov 26 01:00:15 PST 2010


On 25.11.2010 23:56, Michael Spencer wrote:
> On Nov 25, 2010, at 5:01 PM, Jochen Wilhelmy <j.wilhelmy at arcor.de> wrote:
>
>> Hi!
>>
>> Of course nobody wants to implement unicode support for windows
>> because windows should support an utf8-locale and windows is obsolete
>> anyway ;-)
>>
>> But there is a simple solution: use boost::filesystem::path everywhere
>> you use file names and paths, for example in clang::FileManager::getFile.
>> With version 3 opening a file is easy: std::fstream file(path.c_str()).
>> Internally boost::filesystem::path uses the native encoding which is
>> utf16 for windows but you won't notice it since it recodes 8 bit strings
>> automatically (which is no-op on unix and macos).
>>
>> If you don't want to become dependent on boost, I suggest reimplementing
>> the most important features always using 8 bit strings and then have
>> something like this:
>>
>> #ifdef HAVE_BOOST
>> namespace fs = boost::filesystem;
>> #else
>> // simple implementation here
>> #endif
>>
>> -Jochen
>
> This happens to be very close to the code I'm working on now (I assume 
> this post was prompted by my patches). I'll be adding unicode support 
> to the Windows implementation, however, paths will remain utf-8 
> encoded outside of System.

No, this post was prompted by my switch to boost::filesystem version 3
in my own code; llvm/clang 2.8 was the only lib with no Unicode support
on Windows.

Will your code be API-compatible with boost::filesystem? I ask because
boost::filesystem may become part of the standard, and because it is
possible to imbue() a locale on boost::filesystem. While this feature is
not needed on Unix/Mac OS, it gives you global control over whether you
use ANSI or Unicode on Windows.

If you implement your own code that always uses UTF-8, this may break
compatibility with the Windows ANSI encoding if you don't take care.
And why reinvent the wheel? Maybe you could even copy/paste the boost
implementation and use the #ifdef HAVE_BOOST approach.
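
To make the #ifdef HAVE_BOOST idea concrete, here is a minimal sketch, not
a finished implementation. It assumes a C++11 compiler (for char16_t and
std::u16string). The fallback fs::path class and the utf8_to_utf16 helper
are hypothetical names invented for illustration; they are not part of
boost or LLVM. The fallback always stores 8 bit UTF-8 strings, and the
helper produces the UTF-16 form that the wide Windows file APIs expect.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#ifdef HAVE_BOOST
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
#else
// Hypothetical stand-in: only the small subset needed to pass paths around.
namespace fs {
class path {
public:
    path() = default;
    path(const std::string& utf8) : utf8_(utf8) {}
    path(const char* utf8) : utf8_(utf8) {}
    const std::string& string() const { return utf8_; }  // always UTF-8 here
    const char* c_str() const { return utf8_.c_str(); }
private:
    std::string utf8_;
};
}  // namespace fs
#endif

// Hypothetical helper: decode UTF-8 and re-encode as UTF-16.
// Simplified: it rejects bad lead/continuation bytes and truncated input,
// but does not reject overlong encodings or encoded surrogates.
inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows the converted string would then be handed to a wide-character
API; on Unix and Mac OS the UTF-8 bytes can be passed through unchanged.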

-Jochen
