[LLVMdev] [cfe-dev] Unicode path handling on Windows

NAKAMURA Takumi geek4civic at gmail.com
Thu Sep 1 20:21:24 PDT 2011


2011/9/2 Ruben Van Boxem <vanboxem.ruben at gmail.com>:
>> In principle, IMHHHO;
>>
>>  - argv should be treated as a "blackbox" byte stream.
>>  - Don't assume "wmain(int argc, wchar_t **argv)"; mingw does not have
>> one, so argv arrives encoded in the default codepage.
>
> Correction: I believe MinGW-w64 has a Unicode startup and thus support for
> wmain (but of course it would be better to shift this to strict API
> functions)

Good to hear. Frankly speaking, though, I have little knowledge of the
wmain() scheme...

>> We should do in llvm;
>>
>>  - Treat a pathstring in argv as a blackbox. Never parse a
>> (char*)pathstring without knowing its encoding.
>>  - UTF8 would be useless on win32. Win32 does not handle utf8
>> implicitly anywhere.
>>  - The Path API should hold a pathstring in API-native form (bytestream on
>> unix, UCS2 wchar_t on win32).
>>  - Paths should be manipulated in API-native form as far as possible.
>
> Isn't it more straightforward to use utf-8 internally, use the conversion
> functions provided by the win32 API when calling other win32 API functions,
> and always call the wide versions of the win32 functions? That guarantees
> full compatibility, with one encoding internally.

I could propose that if win32 supported ansi->utf8 conversion.
Having thought it over again, holding utf8 internally might be an option.
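
For illustration, here is a minimal sketch of that scheme in C++. It is not
code from LLVM, and the helper names utf8_to_utf16 and utf8_to_wide are
hypothetical. Paths stay utf8 internally and are converted to UTF-16 only at
the boundary, just before a wide ("W") win32 call. On win32 the conversion is
MultiByteToWideChar with CP_UTF8; the portable decoder only shows what that
conversion does and can be tried on any platform. As far as I know there is
no single-call ansi->utf8 conversion, so legacy ANSI input would go through
UTF-16 first (MultiByteToWideChar with CP_ACP, then WideCharToMultiByte with
CP_UTF8).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: decode UTF-8 into UTF-16 code units.
// Returns false on malformed input (truncated, overlong, surrogate, > U+10FFFF).
bool utf8_to_utf16(const std::string &in, std::u16string &out) {
  // Smallest code point that legitimately needs a sequence of each length.
  static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
  out.clear();
  for (std::size_t i = 0; i < in.size();) {
    unsigned char b = static_cast<unsigned char>(in[i]);
    std::uint32_t cp;
    std::size_t len;
    if (b < 0x80)                { cp = b;        len = 1; }
    else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
    else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
    else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
    else return false;                      // stray continuation or invalid lead byte
    if (i + len > in.size()) return false;  // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      unsigned char c = static_cast<unsigned char>(in[i + k]);
      if ((c & 0xC0) != 0x80) return false; // not a continuation byte
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      return false;
    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {                                // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    i += len;
  }
  return true;
}

#ifdef _WIN32
// Hypothetical helper: the same conversion done by the win32 API itself.
// MB_ERR_INVALID_CHARS asks the API to fail on malformed UTF-8.
bool utf8_to_wide(const std::string &in, std::wstring &out) {
  out.clear();
  if (in.empty()) return true;
  const int size = static_cast<int>(in.size());
  int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                              in.data(), size, nullptr, 0);
  if (n <= 0) return false;
  out.resize(n);
  return MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                             in.data(), size, &out[0], n) == n;
}
#endif
```

On win32 the std::wstring from utf8_to_wide would then be handed to a wide
function, e.g. CreateFileW(wide.c_str(), ...), so the "A" functions and the
default codepage never come into play.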

...Takumi



