[PATCH] D56754: Add Support for Creating and Deleting Unicode Files and Directories in Lit
serge via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 24 02:43:12 PST 2019
serge-sans-paille added inline comments.
================
Comment at: utils/lit/lit/util.py:444
+ # convert it to ascii
+ return text if isinstance(text, bytes) else text.encode('utf-8')
----------------
jmittert wrote:
> serge-sans-paille wrote:
> > Bad news: I've been discussing with a Python Core Dev and he acknowledge this function to be ultra dangerous. We should find a way to have everything right without this tricky call.
> >
> > My take would be to first have the whole script work in unicode mode (i.e. in Python3) and then do the minimal work to have it work on Python2. Does that makes sense to you?
> Let me check that I understand correctly. Right now, lit uses strings on python3 which are unicode aware, and strings/bytes on python2 which are not unicode aware which causes issues when utf8 characters are passed to os.path on Windows.
>
> What about alternatively adding a to_unicode a la the to_string and to_bytes in util.py which would return str on python3, and unicode on python2?
>
> Then adding something like
> ```
> s = some_string
> if windows:
> s = to_unicode(s)
> os.path.some_func(s)
> ```
> before each call to os.path.some_func?
Your first statement is correct. However I don't understand why there's anything windows specific in there. Let think about it this way: in Python2, strings where used to represent two different kind of objects : raw bytes, and unicode string. They are different types in Python3. So my advice would be:
1. have the code work in Python3, inserting bytes to string conversion whenever needed. These conversion should not be platform specific.
2. try to run the same code under Python2 and see where it fails.
Does that look like a reasonable path to you?
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56754/new/
https://reviews.llvm.org/D56754
More information about the llvm-commits
mailing list