[PATCH] D45550: Use GetArgumentVector to retrieve the utf-8 encoded arguments on all platforms

Rui Ueyama via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 11 21:30:12 PDT 2018


ruiu added a comment.

That website recommends handling all strings as UTF-8, both externally and internally, and converting any string that is not encoded in UTF-8 to UTF-8 first. I agree with that principle.

But the point I'm raising is different. If your Windows code page setting is ISO-8859-1, for example, I think the command line arguments are supposed to be encoded in ISO-8859-1. Passing a UTF-8 string as an argument is not valid in that case. Look at lld/test/ELF/format-binary-non-ascii.s. It contains a raw UTF-8 string as a command line argument. That test doesn't make sense unless your code page is UTF-8.
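
To make the mismatch concrete, here is a minimal Python sketch of what happens when the bytes of an argument and the assumed code page disagree. The file name is hypothetical and is not the one used in the test. The snippet only illustrates the encoding issue and does not model how Windows or lld actually pass arguments.

```python
# Illustrative only: a hypothetical non-ASCII file name, not the one in the test.
name = "caf\u00e9.bin"

utf8_bytes = name.encode("utf-8")         # 'é' becomes two bytes: 0xC3 0xA9
latin1_bytes = name.encode("iso-8859-1")  # 'é' becomes one byte: 0xE9

# The same name has different byte sequences under the two encodings.
assert utf8_bytes != latin1_bytes

# A program that assumes the ISO-8859-1 code page misreads UTF-8 bytes:
# each byte is decoded as a separate character.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # prints "cafÃ©.bin"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Going the other way, the ISO-8859-1 bytes are not valid UTF-8 at all.
print(is_valid_utf8(utf8_bytes), is_valid_utf8(latin1_bytes))
```

So a raw UTF-8 argument only round-trips when the receiving side also treats the bytes as UTF-8, which is the assumption the test seems to rely on.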


Repository:
  rLLD LLVM Linker

https://reviews.llvm.org/D45550




