[PATCH] D30773: Make git-clang-format python 3 compatible

Eric Fiselier via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Mar 11 02:06:32 PST 2017


EricWF added inline comments.


================
Comment at: tools/clang-format/git-clang-format:306
+    try:
+        return to_string(bytes.decode('utf-8'))
+    except AttributeError: # 'str' object has no attribute 'decode'.
----------------
mgorny wrote:
> EricWF wrote:
> > mgorny wrote:
> > > This logic looks really weird to me. What is the purpose of having both `to_string()` and `convert_string()`? Why do `to_bytes()` and `to_string()` use `isinstance()` to recognize types, and here you rely on exceptions? Why is `to_string()` called after decoding?
> > `to_string` is called after decoding because in python2 the result of decoding is a `unicode` type, and we need to encode it a `str` type. Hense to_string.
> No offense intended but this sounds really horrible. In modern Python, everything is either `bytes` or `unicode`. The difference basically is that `str` in py2 was pretty much `bytes` (except that `bytes` explicitly removes some operations that are unsuitable for bytestrings), and that `str` in py3 is equivalent to `unicode` before.
> 
> So if you are specifically converting to `str`, it means that you want to have two distinct types in py2/py3. Which really sounds like you're doing something wrong.
No offence taken. I had to do way to much work to answer your questions accurately which means the code is way too complicated or non-obvious.

However there are a couple of things you got wrong. In python2 everything is normally `bytes`, `str`, or `unicode`. The `to_string` method converts both `unicode` and `bytes` to the type `str`. 

> So if you are specifically converting to str, it means that you want to have two distinct types in py2/py3. Which really sounds like you're doing something wrong.

I don't think having the Python library return different types in py2/py3 means something is wrong. In fact that's exactly what's happening and  that's exactly what `convert_string` is trying to fix. In python 2 `stdout, stderr, and stdin` return strings, but in python 3 they return `bytes`. `convert_string` is meant to transform these different types into `str`.

Regardless I'll remove the call to `to_bytes` from inside `to_string` because it's too confusing.


https://reviews.llvm.org/D30773





More information about the llvm-commits mailing list