[PATCH] D56754: Add Support for Creating and Deleting Unicode Files and Directories in Lit

serge via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 24 02:43:12 PST 2019


serge-sans-paille added inline comments.


================
Comment at: utils/lit/lit/util.py:444
+    # convert it to ascii
+    return text if isinstance(text, bytes) else text.encode('utf-8')
----------------
jmittert wrote:
> serge-sans-paille wrote:
> > Bad news: I've been discussing with a Python Core Dev and he acknowledge this function to be ultra dangerous. We should find a way to have everything right without this tricky call.
> > 
> > My take would be to first have the whole script work in unicode mode (i.e. in Python3) and then do the minimal work to have it work on Python2. Does that makes sense to you?
> Let me check that I understand correctly. Right now, lit uses strings on python3 which are unicode aware, and strings/bytes on python2 which are not unicode aware which causes issues when utf8 characters are passed to os.path on Windows.
> 
> What about alternatively adding a to_unicode a la the to_string and to_bytes in util.py which would return str on python3, and unicode on python2?
> 
> Then adding something like
> ```
> s = some_string
> if windows:
>     s = to_unicode(s)
> os.path.some_func(s)
> ```
> before each call to os.path.some_func?
Your first statement is correct. However I don't understand why there's anything windows specific in there. Let think about it this way: in Python2, strings where used to represent two different kind of objects : raw bytes, and unicode string. They are different types in Python3. So my advice would be:

1. have the code work in Python3, inserting bytes to string conversion whenever needed. These conversion should not be platform specific.

2. try to run the same code under Python2 and see where it fails.

Does that look like a reasonable path to you?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56754/new/

https://reviews.llvm.org/D56754





More information about the llvm-commits mailing list