[PATCH] Add writeFileWithSystemEncoding to LibLLVMSupport

Rafael Auler rafaelauler at gmail.com
Fri Aug 15 12:43:11 PDT 2014


Hi Rafael and Reid,

Thanks for sharing your opinion; I appreciate it. I put together a table of the tests I ran on my Windows system. I saved a response file containing international characters in several encodings and passed it to different tools. Here are my findings:

| Tool | UTF8-no-BOM | UTF8-BOM | UTF16-BOM | Current code page (ISO-8859-1 on my system) |
| GCC 4.8.1 MinGW | Fail | Fail | Fail | Works |
| LD 2.24 MinGW | Fail | Fail | Fail | Works |
| GCC 4.8.3 Cygwin | Works | Fail | Fail | Fail |
| LD 2.24.51 Cygwin | Works | Fail | Fail | Fail |

For the Cygwin tools I used bash; for the MinGW tools, the Windows command prompt.

This led me to believe that:

* GNU tools on Cygwin or any UNIX system accept plain UTF8 without a BOM. A BOM confuses the tool, and no other encoding is understood.
* GNU tools on MinGW accept only the current code page of the system. Any other encoding, with or without a BOM, is not understood.
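
To make the BOM distinction in the table concrete, here is a minimal sketch of how the leading bytes of a response file could be classified. It is not part of the patch: the names `Bom`, `detectBom` and `detectBomExamples` are hypothetical, and the check is simplified (a UTF-32 little-endian BOM also starts with FF FE and would be reported as UTF-16 here).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from the patch: report which byte-order
// mark, if any, starts the raw contents of a response file.
enum class Bom { None, UTF8, UTF16LE, UTF16BE };

Bom detectBom(const std::string &Bytes) {
  // Read bytes as unsigned so comparisons against 0xEF etc. are well defined.
  auto At = [&](std::size_t I) {
    return static_cast<unsigned char>(Bytes[I]);
  };
  if (Bytes.size() >= 3 && At(0) == 0xEF && At(1) == 0xBB && At(2) == 0xBF)
    return Bom::UTF8;    // EF BB BF
  if (Bytes.size() >= 2 && At(0) == 0xFF && At(1) == 0xFE)
    return Bom::UTF16LE; // FF FE
  if (Bytes.size() >= 2 && At(0) == 0xFE && At(1) == 0xFF)
    return Bom::UTF16BE; // FE FF
  return Bom::None;
}

// Usage: a file saved as "UTF8-BOM" starts with the three-byte mark, while
// "UTF8-no-BOM" starts directly with the first argument.
void detectBomExamples() {
  assert(detectBom("\xEF\xBB\xBF-o out.exe") == Bom::UTF8);
  assert(detectBom("-o out.exe") == Bom::None);
}
```

Going by the table, the Cygwin tools only worked with input for which a check like this returns `Bom::None` and whose bytes are UTF8.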

That's why I designed my patch the way it is. On native Windows or MinGW, it uses the current code page, or UTF16 with BOM for MSVC tools. On UNIX (including Cygwin), it always uses UTF8 without BOM.

I assumed that all GNU tools behave this way and marked every GNU-related Clang Tool object accordingly. That is what the enum member ResponseFileSupport::FullWithoutUTF16 means on all GNU tools: no UTF16, so the response file is written in UTF8 on UNIX and in the current code page on Windows.
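
The selection rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: only the member name `FullWithoutUTF16` comes from this thread, while the `Full` member, the `Encoding` enum, `chooseEncoding` and the `HostIsWindows` parameter are hypothetical names chosen for the example.

```cpp
#include <cassert>

// Assumed shape of the enum: only FullWithoutUTF16 is named in this thread;
// Full is a hypothetical member standing for tools that also read UTF16.
enum class ResponseFileSupport { Full, FullWithoutUTF16 };

// Hypothetical enum naming the three encodings discussed in this thread.
enum class Encoding { UTF8NoBom, UTF16WithBom, CurrentCodePage };

// HostIsWindows is assumed to be true for native Windows and MinGW builds
// and false for UNIX hosts, including Cygwin.
Encoding chooseEncoding(ResponseFileSupport Support, bool HostIsWindows) {
  if (!HostIsWindows)
    return Encoding::UTF8NoBom;       // UNIX and Cygwin: always UTF8, no BOM
  if (Support == ResponseFileSupport::FullWithoutUTF16)
    return Encoding::CurrentCodePage; // GNU tools on MinGW
  return Encoding::UTF16WithBom;      // MSVC tools
}

// Usage: a GNU tool is tagged FullWithoutUTF16, so the host alone decides
// between the current code page and UTF8.
void chooseEncodingExamples() {
  assert(chooseEncoding(ResponseFileSupport::FullWithoutUTF16, true) ==
         Encoding::CurrentCodePage);
  assert(chooseEncoding(ResponseFileSupport::FullWithoutUTF16, false) ==
         Encoding::UTF8NoBom);
}
```

Taking the host as a parameter rather than testing a preprocessor macro keeps the sketch checkable on any machine; a real implementation would more likely decide this at compile time.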

I will update the comments in this patch to make this clear. I will also file a bug against binutils asking for UTF8/UTF16 response file support on Windows/MinGW.

Best regards,
Rafael Auler

http://reviews.llvm.org/D4896





