[PATCH] D26082: Support for Python 3 in libclang python bindings

Jonathan B Coe via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 2 13:29:04 PDT 2016

jbcoe added inline comments.

Comment at: bindings/python/clang/cindex.py:77
+# Python 3 strings are unicode, translate them to/from utf8 for C-interop
+if type(u"") == str:
+    class c_string_p(c_char_p):
mgorny wrote:
> What is the Python3 version you're aiming to support? If I recall correctly, `u""` is allowed (again) since 3.4. And even then, the condition looks weird and makes me think a while before I figure out what it's supposed to mean.
Replaced with a simple version check

Comment at: bindings/python/clang/cindex.py:518
-        for i in xrange(0, count):
+        for i in range(0, count):
             token = Token()
mgorny wrote:
> compnerd wrote:
> > IIRC, `range` and `xrange` did have some performance difference.  This would slow down the bindings on python 2.  The difference is obviously not immediately visible unless count is very high.  I wonder if we can do anything here to detect the python version and dispatch to `xrange` in python 2.
> The difference is that `range()` used to construct a complete list in Python 2. In Python 3, `xrange()` (that uses iterator) got renamed to `range()`.
> If you want to avoid performance impact (not sure if it's really measurable here), you can do something alike C for loop:
>     i = 0
>     while i < count:
>         #...
>         i += 1
> It's not really idiomatic Python though. OTOH, it won't take more lines than the conditional.
I've defined `xrange` to be `range` for python3. A bit confusing but then we can use `xrange` uniformly.

Comment at: bindings/python/clang/cindex.py:623
         """Return all CursorKind enumeration instances."""
-        return filter(None, CursorKind._kinds)
+        return [x for x in CursorKind._kinds if x]
mgorny wrote:
> Why are you changing this? The old version seems to be correct for Python3.
`filter` has changed in Python 3 and the replaced code does not behave the same as this simple list comprehension. I've not delved deeper into why but see test failures without this change.

Comment at: bindings/python/clang/cindex.py:2573
         if len(args) > 0:
-            args_array = (c_char_p * len(args))(* args)
+            args_array = (c_string_p * len(args))()
+            for i,a in enumerate(args):
mgorny wrote:
> I may be wrong but I think you could use a list comprehension here.
>     args_array = (c_string_p * len(args))([c_string_p(x) for x in args])
> You can also try without `[]` to avoid the overhead of constructing list, if the function can take an iterator.
I can't get this to work with a comprension or generator, I agree either would be an improvement.

Comment at: bindings/python/tests/cindex/test_translation_unit.py:62
 def test_unsaved_files_2():
-    import StringIO
+    try:
+        from StringIO import StringIO
mgorny wrote:
> Could you try inverting this? Python 2.7 already has `io.StringIO`, so that branch is much more likely to work.
Python 2.7's `io.StringIO` needs a unicode string.


More information about the cfe-commits mailing list