[cfe-dev] Unicode path handling on Windows

Eli Friedman eli.friedman at gmail.com
Wed Aug 31 11:17:07 PDT 2011


On Wed, Aug 31, 2011 at 10:58 AM, Nikola Smiljanic <popizdeh at gmail.com> wrote:
> _wopen expects wchar_t*, and the only visible function for conversion to
> utf16 is ConvertUTF8toUTF16, which converts to unsigned shorts.

If you're in #ifdef WIN32 code, just use ConvertUTF8toUTF16 and
reinterpret_cast from unsigned short* to wchar_t*.
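
A minimal sketch of that approach, under some assumptions: to keep it
self-contained, a small hand-written decoder (utf8ToUtf16, a hypothetical
name) stands in for ConvertUTF8toUTF16, and openUtf8Path is likewise a
made-up wrapper, not an existing LLVM or Clang function. The cast is only
meaningful where wchar_t is 16 bits wide, which is the case on Windows.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for ConvertUTF8toUTF16: decodes strict UTF-8 into
// UTF-16 code units. Returns false on malformed input, e.g. a path that is
// still encoded in the local Windows code page.
bool utf8ToUtf16(const std::string &in, std::vector<unsigned short> &out) {
  out.clear();
  std::size_t i = 0, n = in.size();
  while (i < n) {
    unsigned char b0 = static_cast<unsigned char>(in[i]);
    std::uint32_t cp;
    std::size_t len;
    if (b0 < 0x80) { cp = b0; len = 1; }
    else if (b0 >= 0xC2 && b0 <= 0xDF) { cp = b0 & 0x1F; len = 2; }
    else if (b0 >= 0xE0 && b0 <= 0xEF) { cp = b0 & 0x0F; len = 3; }
    else if (b0 >= 0xF0 && b0 <= 0xF4) { cp = b0 & 0x07; len = 4; }
    else return false;                       // invalid lead byte
    if (i + len > n) return false;           // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      unsigned char b = static_cast<unsigned char>(in[i + k]);
      if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogate code points and out-of-range values.
    if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000) ||
        (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
      return false;
    if (cp < 0x10000) {
      out.push_back(static_cast<unsigned short>(cp));
    } else {                                 // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<unsigned short>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<unsigned short>(0xDC00 + (cp & 0x3FF)));
    }
    i += len;
  }
  return true;
}

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>

// Hypothetical wrapper: opens a UTF-8 encoded path read-only via _wopen.
// Returns -1 if the path is not valid UTF-8 or the open fails.
int openUtf8Path(const std::string &path) {
  std::vector<unsigned short> wide;
  if (!utf8ToUtf16(path, wide))
    return -1;
  wide.push_back(0);  // _wopen expects a null-terminated string
  static_assert(sizeof(wchar_t) == sizeof(unsigned short),
                "the cast below assumes a 16-bit wchar_t");
  return _wopen(reinterpret_cast<const wchar_t *>(wide.data()),
                _O_RDONLY | _O_BINARY);
}
#endif
```

The decoder rejects byte sequences that are not well-formed UTF-8, so a
path that was never converted from the local code page fails up front
instead of opening the wrong file.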

-Eli

> There is a
> function that does exactly what I need called UTF8ToUTF16, but it's inside
> an anonymous namespace inside the Windows version of PathV2.inc.
> I could solve this in a number of ways, but I'm not sure which one is
> preferred in the Clang codebase.
> On Thu, Aug 25, 2011 at 1:25 AM, Ruben Van Boxem <vanboxem.ruben at gmail.com>
> wrote:
>>
>> On 25 Aug 2011 00:08, "Nikola Smiljanic" <popizdeh at gmail.com> wrote:
>>
>> >
>> > I'm trying to fix unicode file handling on
>> > Windows: http://llvm.org/bugs/show_bug.cgi?id=10348. This currently doesn't
>> > work because argv is encoded as a multibyte string (the clang project is
>> > configured this way).
>> >
>> > Michael suggested converting the command line to utf8, and this indeed
>> > solves the error that the driver emits. But there is another check in
>> > CompilerInstance that fails because FileSystemStatCache::get calls ::open,
>> > and I'm guessing that this function is not smart enough to handle a utf8
>> > path on Windows. Any ideas?
>>
>> It's not smart enough, no, but you can use _wfopen instead. Note that all
>> of its arguments are wchar_t*.
>>
>> Ruben
>>
>> >
>> > I have one more question. I added a MultibyteToUTF8 function to PathV2.inc
>> > (Windows version) and now I'd like to call it from ExpandArgv (driver.cpp),
>> > but this code is platform specific and isn't visible (the function is
>> > inside an anonymous namespace). I could create a wrapper function that
>> > calls this function on Windows and does nothing on other platforms. Is
>> > this the way to go, and where should I put it (llvm::sys::fs,
>> > llvm::sys::path or somewhere else)?
>> >
>> > ---------- Forwarded message ----------
>> > From: Michael Spencer <bigcheesegs at gmail.com>
>> > Date: Sat, Jul 16, 2011 at 12:32 AM
>> > Subject: Re: Question regarding Clang path handling on Windows
>> > To: Nikola Smiljanic <popizdeh at gmail.com>
>> >
>> >
>> > On Thu, Jul 14, 2011 at 8:25 AM, Nikola Smiljanic <popizdeh at gmail.com>
>> > wrote:
>> > > Hi Michael, I'd like to fix this bug if I can:
>> > > http://llvm.org/bugs/show_bug.cgi?id=10348. I started looking around
>> > > and I think I know where the problem is (your name showed up in the
>> > > svn log for PathV2.inc), but I'm not sure what the right way to solve
>> > > it is. Namely, the check in Driver::BuildActions (line 772) fails. It
>> > > seems that the function llvm::sys::fs::exists tries to convert the
>> > > input string from utf8 to utf16, but clang.exe is compiled with
>> > > Multibyte Character Sets. This means that the conversion will succeed
>> > > when you pass an ANSI string that is also a valid utf8 string. But if
>> > > you try to pass in some Chinese characters, you'll get a single byte
>> > > character string that is interpreted using the current Windows
>> > > locale, and in this case the conversion from utf8 to utf16 will fail
>> > > (character values are negative). So my question is whether this
>> > > function should do the conversion at all. Maybe there are other
>> > > places in the code that call it with utf8 input obtained from some
>> > > Windows function? In this particular case, the conversion should be
>> > > done using the current locale. I'd like to hear what you think.
>> >
>> > This was an oversight on my part. I assumed the command line would be
>> > in utf8 for some reason. Clang internals currently assume utf8, so the
>> > correct fix is to convert the command line to utf8 first. I'll look
>> > into adding this.
>> >
>> > - Michael Spencer
>> >
>> >
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> >
>
>