[PATCH] D56754: Add Support for Creating and Deleting Unicode Files and Directories in Lit

Jason Mittertreiner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 22 10:20:50 PST 2019


jmittert marked an inline comment as done.
jmittert added inline comments.


================
Comment at: utils/lit/lit/util.py:440
+            # converted to UTF-16
+            return text.decode('utf-8') if isinstance(text, bytes) else text
+
----------------
efriedma wrote:
> The inner "if" is redundant; on Python2, `bytes` is an alias for `str`, so both sides have the same meaning.
> 
> This function seems dangerous in the sense that it isn't clear what type of data it's expecting as input. Probably you should have separate functions for each of the possible cases: input is `bytes`, input is `str`, or input is Python2 `unicode`/Python3 `str`.  (Not sure which of those you actually need, but those are the reasonable possibilities, I think.)
I'm fairly certain the if isn't redundant: on python 2, if we get `bytes`(/`str`) we want to decode that to a `unicode`. In python 2, `.decode` returns a `unicode`, not a `str` like in python 3.

Part of the reason it has confusing inputs is because what input it gets depends on they python version. It gets bytes/str on python 2 and str on python 3. I figured it was safer to handle all text cases (python 2 `bytes`/`str`/`unicode`, python 3 `str`/`bytes`). At least that way it's idempotent. I do agree it's dangerous in that the type it outputs is different based on the platform and version, but I've tried to limit use of it to only right at the edge before os calls.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56754/new/

https://reviews.llvm.org/D56754





More information about the llvm-commits mailing list