[PATCH] D34793: [lit] Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.

Zachary Turner via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 28 23:16:49 PDT 2017


I'm curious why you think file names shouldn't be Unicode. Seems pretty
reasonable to me?

On Wed, Jun 28, 2017 at 11:11 PM David L. Jones via Phabricator <
reviews at reviews.llvm.org> wrote:

> dlj marked an inline comment as done.
> dlj added a comment.
>
> In https://reviews.llvm.org/D34793#794883, @chapuni wrote:
>
> > @dlj Great, thanks!
> >
> > Seems it also fixes https://reviews.llvm.org/D34464.
>
>
> Interesting... to_string now has to fall back to str(bytes) in Python3
> when there is an invalid input. In that case, the resulting string looks
> more like the output of repr(), which is not what one would want for a
> filename.
>
> It's not clear to me why Python's behaviour of treating *filenames* as
> unicode is actually the right choice.
>
> Strictly speaking, I think the only well-defined filename encoding that
> covers all platforms targeted by Clang is the one defined by the POSIX spec:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_282
>
> (But of course, our supported OSes do support broader character sets.)
>
> I'll think more about what to_string should do, but I'll also leave a
> comment on the other review thread.
>
>
> Repository:
>   rL LLVM
>
> https://reviews.llvm.org/D34793
>
>
>
>
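
For anyone following along, the fallback described in the quoted comment can
be sketched roughly as below. This is only an illustration under assumptions:
the function body is hypothetical and is not copied from lit, the choice of
UTF-8 is assumed, and the sample byte strings are made up. It shows why the
result of str(bytes) in Python 3 looks like repr() output rather than a
usable filename.

```python
def to_string(data):
    """Hypothetical sketch of a bytes-to-str helper (not lit's actual code)."""
    if isinstance(data, str):
        return data
    try:
        # Assumption: the helper tries UTF-8 first.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback for undecodable input. In Python 3, str() of a bytes
        # object returns its repr-style text, including the b'' wrapper.
        return str(data)


valid = to_string(b"caf\xc3\xa9.txt")   # well-formed UTF-8
invalid = to_string(b"caf\xe9.txt")     # Latin-1 byte, not valid UTF-8

print(valid)
print(invalid)
```

The second result keeps the b'' wrapper and the escaped byte, so it no longer
names the original file.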