[LLVMdev] [cfe-dev] Unicode path handling on Windows

Nikola Smiljanic popizdeh at gmail.com
Tue Sep 6 23:28:21 PDT 2011


The problem is not in the functions that return multibyte strings (the
multibyte string comes from argv) but in the functions that can't handle
UTF-8 input on Windows, such as ::open and ::stat.

The llvm::sys::fs module assumes UTF-8 input, and I don't think that
assumption holds on Windows. One solution would be to make the module work
with multibyte strings, as I've done. The other would be to convert
everything to UTF-8, in which case a lot of code would have to change,
because we'd have to convert from UTF-8 to UTF-16 whenever we call Windows
API functions. Note also that ::wstat takes a different argument type than
::stat, and this structure is passed all around.
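
To make the cost of the second option concrete, here is a minimal, portable
sketch of the UTF-8 to UTF-16 step that would have to run before each wide
API call. The name utf8_to_utf16 is hypothetical and is not an existing LLVM
function; on Windows itself the same conversion could be done with
MultiByteToWideChar and CP_UTF8. The sketch checks lead and continuation
bytes but, as an assumption to keep it short, does not reject overlong
encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of LLVM): decode a UTF-8 byte string into
// UTF-16 code units, throwing std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string &in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Fold in the low six bits of each continuation byte.
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of Unicode range");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the resulting code units could
then be handed to a wide function such as CreateFileW. The point is that
every call site that currently passes a char path straight through would
need a conversion like this.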

On Wed, Sep 7, 2011 at 2:22 AM, Bryce Cogswell <bryceco at yahoo.com> wrote:

> As was mentioned once before, the correct solution is to never use
> multibyte anywhere. Any Windows functions that currently return multibyte
> strings should be converted to their wide-string (unicode) equivalent, with
> the result converted to UTF-8.
>
>
> > From: Nikola Smiljanic <popizdeh at gmail.com>
> >
> > I think I got it this time. I realized that ::open and ::stat work just
> > fine with multibyte paths on Windows, so there's no need to change this
> > code. The only problem is the llvm::sys::fs module, which falsely assumes
> > that input strings are UTF-8 encoded when they are in fact multibyte
> > strings.
> >
> > Now I really hope I haven't broken anything, because
> > llvm::sys::fs::exists is called in a number of places, but I'm guessing
> > that none of the paths that are passed to it are really UTF-8?
> >
> > I think the entire llvm::sys::fs module should be changed to use
> > MultibyteToUTF16 instead of UTF8ToUTF16 before calling Windows API
> > functions (unless somebody knows that we actually have UTF-8 paths on
> > Windows somewhere in the code)?
>
>