[LLVMdev] Python bindings in tree

Mon Mar 19 10:44:41 PDT 2012

On Sun, Mar 18, 2012 at 09:52:12PM -0700, Gregory Szorc wrote:
> The automatic generation of the Python ctypes interfaces using the Clang
> Python bindings is pretty friggin cool!

A nice side effect is that everything is added to the interface. So it
is easy to add a small proxy over the lib that shows which parts of
the llvm-c API are exercised by the tests. (I have that in my
bindings.)
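
A rough sketch of such a proxy, to show the idea. This is not the code
from my bindings: UsageProxy and FakeLib are made-up names, and FakeLib
only stands in for the ctypes library handle so the example runs without
LLVM installed.

```python
class UsageProxy:
    """Wrap a library object and record which functions get looked up."""

    def __init__(self, lib):
        self._lib = lib
        self.used = set()

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself, i.e. for
        # every function that is looked up on the wrapped library.
        attr = getattr(self._lib, name)
        self.used.add(name)
        return attr

    def unused(self, declared):
        """Return the declared names that were never looked up."""
        return sorted(set(declared) - self.used)


class FakeLib:
    """Hypothetical stand-in for a ctypes.CDLL handle to the llvm-c library."""

    def LLVMModuleCreateWithName(self, name):
        return ("module", name)

    def LLVMDisposeModule(self, module):
        return None

    def LLVMDumpModule(self, module):
        return None


lib = UsageProxy(FakeLib())
mod = lib.LLVMModuleCreateWithName(b"foo")
lib.LLVMDisposeModule(mod)

declared = ["LLVMModuleCreateWithName", "LLVMDisposeModule", "LLVMDumpModule"]
print(lib.unused(declared))  # ['LLVMDumpModule']
```

With auto generated declarations the "declared" list comes for free, so
after a test run the unused list is exactly the untested part of the API.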

> > * 0004-Add-LLVMPrintModule-to-llvm-c.patch
> >   Adds a new LLVMPrintModule function which is similar to
> >   LLVMDumpModule but dumps to a string instead of stdout.
> > 
> > * 0005-Add-LLVMCreateMemoryBufferFromData-to-llvm-c.patch
> >   Adds LLVMCreateMemoryBufferFromData function.
> 
> These are desperately needed by the C API. Can you please submit them?

Will do!

> FWIW, all my work is at
> https://github.com/indygreg/llvm/tree/python_bindings/bindings/python.

Excellent! I'll try to see if I can adapt my bindings to yours to fill
in the gaps.

There does indeed seem to be much overlap in our bindings, but there are
a few places where the design differs. If we are going to combine our
work I guess it would be a good idea to discuss these differences, to
make sure we work towards a common goal.

I think the main differences between our bindings are:

* Auto generated vs manual ctypes declarations.

  From your comment above I assume you would prefer auto generated too.

* Types inheriting from c_void_p vs having a ptr attribute.

  My bindings have, for example, Module (indirectly) inheriting from
  c_void_p. That way no "from_param" methods are needed, and there is
  no extra attribute holding the actual pointer.

  I'm not sure this is better. I might have gone with a separate pointer
  attribute, as you have, if I were starting from scratch today.

* Use of constructor vs "new" static methods.

  When using my bindings one never calls a class constructor directly.
  Instead a "factory" method is used:

  mymod = Module.from_file(...)
  mymod = Module.from_data(...)
  mymod = Module.new("foo")
  ity = Type.int(32)

  instead of

  mymod = Module(file=...)
  mymod = Module(data=...)
  mymod = Module(name="foo")
  ity = IntType(32)

  I prefer this, especially in the cases where there are many
  different ways to construct an item. Also, many objects are not
  really created standalone, e.g. a function is added to a module:

  f = mymod.add_function(FTy, "foo")

  and the Function constructor is never used. Having a policy of
  "never use the constructor" to create objects keeps this
  consistent.

  Also this makes it consistent with the old defunct llvm-py bindings.

  (Partially this is also a consequence of the fact that my bindings
   inherit from c_void_p, which makes it a bit messier.)

* Directory layout

  Just a minor thing.

  My bindings have python/bindings/lib/llvm
                                  /tests
                                  /tools

  I do like having the tests outside the lib dir.
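
To make the c_void_p and factory method points above concrete, here is a
minimal sketch of the two wrapper styles side by side. It is not taken
from either set of bindings: the class names and the addresses are made
up, and CPython's own PyLong_FromVoidPtr (which returns a pointer's value
as an int) stands in for an llvm-c function taking an opaque pointer, so
the example runs without LLVM.

```python
import ctypes


def c_function(argtype):
    """Return a foreign function that accepts one argument of 'argtype'."""
    # Indexing the library (instead of attribute access) gives a fresh
    # function object each time, so the two argtypes do not clash.
    fn = ctypes.pythonapi["PyLong_FromVoidPtr"]
    fn.argtypes = [argtype]
    fn.restype = ctypes.py_object
    return fn


class InheritingModule(ctypes.c_void_p):
    """Style 1: the wrapper *is* the pointer, no from_param needed."""

    @classmethod
    def new(cls, address):
        # Factory method; 'address' is a made-up handle value.
        return cls(address)


class AttributeModule:
    """Style 2: the wrapper keeps the pointer in a 'ptr' attribute."""

    def __init__(self, ptr):
        self.ptr = ptr

    @classmethod
    def new(cls, address):
        return cls(ctypes.c_void_p(address))

    @classmethod
    def from_param(cls, obj):
        # ctypes calls this to convert each argument before the call.
        if not isinstance(obj, cls):
            raise TypeError("expected %s" % cls.__name__)
        return obj.ptr


a = InheritingModule.new(0x1000)
b = AttributeModule.new(0x2000)
print(c_function(InheritingModule)(a))  # 4096
print(c_function(AttributeModule)(b))   # 8192
```

In both styles user code only ever goes through the "new" factory; the
difference is just where the raw pointer lives and who converts it.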

> Parts of Core.h still need love (especially the Value system). I'm doing
> some dynamic type creation at run-time using the Value hierarchy.
> Somewhat scary stuff, but it does seem to work. I really need a
> LLVMGetValueID() API to fetch llvm::Value::getValueID() to enable more
> efficient value casting.

I'm doing the very same thing in my bindings, and yes, it is a bit
inefficient, but it seems to work fine and should keep working as long as
classes are not moved in the hierarchy.

I use the same hierarchy at the Python level, and the Python side
recursively drills down into the correct subclass by calling LLVMIsA* for
the possible (direct) subclasses.
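
Roughly like this (a simplified sketch, not the actual code). The is_a()
helper and the ANCESTORS table are made up; they only simulate what the
LLVMIsA* functions would answer, so the example runs without LLVM. A
"handle" here is just the name of the most derived C++ class.

```python
class Value:
    def __init__(self, handle):
        self.handle = handle


class Constant(Value):
    pass


class ConstantInt(Constant):
    pass


class Instruction(Value):
    pass


class CallInst(Instruction):
    pass


# Simulated C side: for each handle, the set of classes it "is a".
ANCESTORS = {
    "ConstantInt": {"Value", "Constant", "ConstantInt"},
    "CallInst": {"Value", "Instruction", "CallInst"},
    "Argument": {"Value", "Argument"},  # no Python class for this one
}


def is_a(class_name, handle):
    """Hypothetical stand-in for calling LLVMIsA<class_name>(handle)."""
    return class_name in ANCESTORS[handle]


def most_derived(cls, handle):
    """Drill down from cls, testing only its direct subclasses each step."""
    for sub in cls.__subclasses__():
        if is_a(sub.__name__, handle):
            return most_derived(sub, handle)
    return cls  # no subclass matched, so cls is the best fit


def wrap(handle):
    return most_derived(Value, handle)(handle)


print(type(wrap("ConstantInt")).__name__)  # ConstantInt
print(type(wrap("Argument")).__name__)     # Value
```

The cost is one LLVMIsA* call per candidate subclass at each level, which
is where the inefficiency comes from.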

> From some discussion on #llvm, I think people
> are receptive to this. The main concern would be that the C API would be
> tied to a specific version of the shared library because the value ID
> enumeration aren't guaranteed for all of time. But, that contract is
> already broken, so I don't think it's a big deal: just something that
> needs to be documented. Of course, Python is a dynamic language, so if
> there were a C API that exposed the llvm::Value class hierarchy, we
> could always have Python dynamically create types at run-time :)

I guess we could have a separate value ID enum and a mapping between
the llvm-c and C++ value IDs. IIRC the clang python bindings do that for
something. That way there won't be any breakage if the C++ side is
changed.
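
As a toy model of that mapping (in Python only to show the shape; the
real table would live inside the C library). Every name and number below
is invented for illustration and does not match any actual LLVM
enumeration.

```python
from enum import IntEnum


class StableValueKind(IntEnum):
    """Hypothetical public IDs that would never be renumbered."""
    ARGUMENT = 0
    BASIC_BLOCK = 1
    FUNCTION = 2


# Hypothetical internal C++ IDs for two library versions. In the "new"
# version FUNCTION has moved from 2 to 3 internally.
INTERNAL_TO_STABLE = {
    "old": {
        0: StableValueKind.ARGUMENT,
        1: StableValueKind.BASIC_BLOCK,
        2: StableValueKind.FUNCTION,
    },
    "new": {
        0: StableValueKind.ARGUMENT,
        1: StableValueKind.BASIC_BLOCK,
        3: StableValueKind.FUNCTION,
    },
}


def stable_kind(version, internal_id):
    """Translate a version-specific internal ID to the stable public ID."""
    return INTERNAL_TO_STABLE[version][internal_id]


# Clients see the same public ID even though the internal one changed.
print(stable_kind("old", 2) == stable_kind("new", 3))  # True
```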

 anders