[LLVMdev] [cfe-dev] Unicode path handling on Windows

Nikola Smiljanic popizdeh at gmail.com
Thu Sep 1 13:17:33 PDT 2011


AFAIK Clang internals do assume UTF-8, and llvm::sys::path converts strings
to UTF-16 on Windows and calls the W (wide-character) API functions.

I'd appreciate it if somebody could take a look at my changes and comment on
them. Here's a brief explanation of what I did:

- Convert argv to UTF-8 using the current system locale on Win32 (this is done
as early as possible, inside ExpandArgv). This makes the driver happy, since
calls to llvm::sys::path::exists now succeed.
- Change the calls to ::open (inside FileSystemStatCache and MemoryBuffer) to
::_wopen on Win32, converting the path to UTF-16 first.
- To do the conversions I had to expose two functions: one was already
there but wasn't visible, and I added the other.
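
For readers without the patches at hand, here is a minimal sketch of the
second step (UTF-8 path to UTF-16, then ::_wopen). It is not the code from the
attached patches: the names utf8_to_utf16 and open_utf8_path are hypothetical,
and the non-Windows branch simply assumes that ::open accepts the path bytes
unchanged.

```cpp
#include <fcntl.h>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <io.h>

// Hypothetical helper: convert a UTF-8 string to UTF-16 with the Win32 API.
// Returns false if the input is not valid UTF-8.
static bool utf8_to_utf16(const std::string &utf8, std::wstring &utf16) {
  utf16.clear();
  if (utf8.empty())
    return true;
  const int inLen = static_cast<int>(utf8.size());
  // First call: ask how many UTF-16 code units are needed.
  int len = ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8.data(),
                                  inLen, nullptr, 0);
  if (len <= 0)
    return false;
  utf16.resize(static_cast<size_t>(len));
  // Second call: perform the actual conversion into the sized buffer.
  len = ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8.data(),
                              inLen, &utf16[0], len);
  return len > 0;
}
#else
#include <unistd.h>
#endif

// Hypothetical wrapper: open a file read-only given a UTF-8 encoded path.
// Returns a file descriptor, or -1 on failure.
int open_utf8_path(const std::string &path) {
#ifdef _WIN32
  std::wstring wide;
  if (!utf8_to_utf16(path, wide))
    return -1;
  return ::_wopen(wide.c_str(), _O_RDONLY | _O_BINARY);
#else
  // Assumption: elsewhere the path bytes are handed to ::open as-is.
  return ::open(path.c_str(), O_RDONLY);
#endif
}
```

A caller would use open_utf8_path wherever ::open was called with a path
taken from the (already UTF-8) argv, and close the descriptor as usual.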

Known issues:

- I should probably use LLVM_ON_WIN32 instead of WIN32, but that macro isn't
defined inside FileSystemStatCache and MemoryBuffer for some reason. Both of
these files have an #ifdef section that deals with O_BINARY, so maybe these
two sections should be consolidated?
- The functions convert_multibyte_to_utf8 and convert_utf8_to_utf16 are
defined only on Windows, so every other platform is currently broken.
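
One possible way to address the second issue is to define the function on
every platform and make it a pass-through outside Windows. The sketch below
illustrates that for convert_multibyte_to_utf8. The signature is an
assumption (the declaration in the patch may differ), and the pass-through
behaviour is only a suggestion, not something the patch implements.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Assumed signature; the declaration in the actual patch may differ.
// Converts a string in the current system code page to UTF-8.
bool convert_multibyte_to_utf8(const std::string &in, std::string &out) {
#ifdef _WIN32
  out.clear();
  if (in.empty())
    return true;
  const int inLen = static_cast<int>(in.size());

  // Step 1: system ANSI code page (CP_ACP) -> UTF-16.
  int wlen = ::MultiByteToWideChar(CP_ACP, 0, in.data(), inLen, nullptr, 0);
  if (wlen <= 0)
    return false;
  std::wstring wide(static_cast<size_t>(wlen), L'\0');
  if (::MultiByteToWideChar(CP_ACP, 0, in.data(), inLen, &wide[0], wlen) <= 0)
    return false;

  // Step 2: UTF-16 -> UTF-8.
  int len = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, nullptr, 0,
                                  nullptr, nullptr);
  if (len <= 0)
    return false;
  out.resize(static_cast<size_t>(len));
  return ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, &out[0], len,
                               nullptr, nullptr) > 0;
#else
  // Assumption: on other platforms argv is passed through unchanged.
  out = in;
  return true;
#endif
}
```

With a definition like this available everywhere, ExpandArgv could call the
function unconditionally instead of guarding the call site with an #ifdef.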

On Thu, Sep 1, 2011 at 5:44 PM, Ruben Van Boxem <vanboxem.ruben at gmail.com> wrote:

> Isn't it more straightforward to use utf-8 internally and use the
> conversion functions provided by the win32 API when calling other win32 API
> functions, and always call the wide versions of the win32 functions. Full
> compatibility guaranteed, and one encoding internally.
>
> Ruben
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicode_path_clang.patch
Type: application/octet-stream
Size: 1811 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110901/b724988b/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicode_path_llvm.patch
Type: application/octet-stream
Size: 2973 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110901/b724988b/attachment-0001.obj>