[LLVMdev] Python bindings available.

Gordon Henriksen gordonhenriksen at mac.com
Sat May 10 09:27:30 PDT 2008


On May 10, 2008, at 05:44, Mahadevan R wrote:

> I'd like to announce the availability of Python bindings for LLVM.
>
> It is built over llvm-c, and currently exposes enough APIs to build an
> in-memory IR (and dump it!). It needs LLVM 2.3 latest and Python 2.5
> (2.4 should be sufficient, but I haven't tested). Tested only on
> Linux/i386.
>
> Would love to hear your comments.
>
> [Needless to say, it's all work in progress, but mostly it works as
> expected. More tests, documentation and APIs will follow.]
>
> It's all here: http://mdevan.nfshost.com/llvm-py.html


Hi Mahadevan,

Very nice! The OO syntax is pleasantly succinct. :)

> Constant.string(value, dont_null_terminate) -- value is a string
> Constant.struct(consts, packed) -- a struct, consts is a list of  
> other constants, packed is boolean

I did this in OCaml initially, but found the boolean constants pretty
confusing to read in code. I kept asking “What's that random true
doing there?” Therefore, the OCaml bindings expose these as
const_string/const_stringz and const_struct/const_packed_struct
respectively. I figure the user can always write her own in the (very)
rare cases where it is necessary to conditionalize such things:

     let const_string_maybez nullterm =
       if nullterm then const_stringz else const_string
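
The same split might look like this in Python. This is only a sketch: the
function names are borrowed from the OCaml bindings, and they work on plain
bytes here rather than on LLVM constants, so none of this is llvm-py's
actual API.

```python
def const_string(value: str) -> bytes:
    """Encode value without a trailing NUL byte."""
    return value.encode("utf-8")


def const_stringz(value: str) -> bytes:
    """Encode value with a trailing NUL byte, as C strings expect."""
    return value.encode("utf-8") + b"\x00"


def const_string_maybez(nullterm: bool):
    """Pick one of the two constructors for the rare conditional case."""
    return const_stringz if nullterm else const_string
```

At a call site, const_stringz("hi") says what it does, whereas a bare
True in second position does not.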

> Memory Buffer and Type Handles are not yet implemented.


:) Type handles in particular are very important. You can't form a
recursive type without using them, so you can't build any sort of
recursive data structure, such as a linked list or a tree.

> Builder wraps an llvm::IRBuilder object. It is created with the  
> static method new (builder = Builder.new()).

Uninitialized builders are very dangerous (they leak instructions if  
you use them), so you might want to add overloads for new in order to  
avoid boilerplate code.

> It can be positioned using the methods position(block, instr=None),
> position_before(instr) and position_at_end(block).

There's an "IR navigator" concept you can implement to avoid writing  
so many overloads here. It provides a complete "position" or  
"iterator" concept. It's not entirely explicit in the C bindings—it  
would be memory-inefficient if it were. But you can build it atop them  
easily. It's useful whenever the C bindings have Before/AtEnd  
functions, and you can implement it wherever you see First/Last/Next/ 
Prev functions. The C bindings support this for functions, global  
variables, arguments, basic blocks, and instructions.

In OCaml, we coded it up using a variant type, like (Before element |
At_end parent). The basic operations for forward iteration are
Parent.begin and Element.succ, which were implemented like this:

     Parent.begin =
       if this.first_element is null
         return At_end this
       else
         return Before this.first_element

     Element.succ =
       if this.next_element is null
         return At_end this.parent
       else
         return Before this.next_element
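
As a rough Python rendering of the two operations above: Parent and
Element below are toy stand-ins (think of a basic block and its
instructions), and Before/AtEnd are small dataclasses playing the role of
the variant type. All of these names are assumptions made for the sketch,
not part of llvm-py or llvm-c.

```python
from dataclasses import dataclass
from typing import Optional, Union


class Element:
    """Toy stand-in for an instruction: knows its parent and next sibling."""

    def __init__(self, name: str, parent: "Parent"):
        self.name = name
        self.parent = parent
        self.next_element: Optional["Element"] = None

    def succ(self) -> "Position":
        # The position just after this element.
        if self.next_element is None:
            return AtEnd(self.parent)
        return Before(self.next_element)


class Parent:
    """Toy stand-in for a basic block holding a linked list of elements."""

    def __init__(self) -> None:
        self.first_element: Optional[Element] = None
        self._last: Optional[Element] = None

    def append(self, name: str) -> Element:
        elem = Element(name, self)
        if self._last is None:
            self.first_element = elem
        else:
            self._last.next_element = elem
        self._last = elem
        return elem

    def begin(self) -> "Position":
        # The position in front of the first element, or the end if empty.
        if self.first_element is None:
            return AtEnd(self)
        return Before(self.first_element)


@dataclass
class Before:
    element: Element


@dataclass
class AtEnd:
    parent: Parent


Position = Union[Before, AtEnd]
```

Either kind of position can reach the parent: directly for AtEnd, or via
element.parent for Before.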

Then the user could build many IR navigation algorithms. The simplest  
one, "for each", is thus:

     for_elements(parent, callback) =
       pos = parent.begin
       loop
         match pos with
         | At_end _ -> break
         | Before element ->
             callback(element)
             pos = element.succ

     for_elements(parent, do_my_thing)

This representation was idiomatic in a functional language because  
it's compatible with recursion (you can translate for_elements into a  
tail recursive loop), but perhaps an enumerator class would be more  
idiomatic in Python:

     for_elements(parent, callback) =
       pos = parent.begin
       while pos.has_next()
         callback(pos.current)
         pos.advance()   // step to the next element
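
In Python itself, a generator is probably the shortest way to get such an
enumerator. The sketch below uses toy Block and Instr classes (assumed
names, not llvm-py's) with first/next links like those the C bindings
expose:

```python
class Instr:
    """Toy instruction with a link to the next one in its block."""

    def __init__(self, name):
        self.name = name
        self.next = None


class Block:
    """Toy basic block: a singly linked list of instructions."""

    def __init__(self, names):
        self.first = None
        prev = None
        for name in names:
            instr = Instr(name)
            if prev is None:
                self.first = instr
            else:
                prev.next = instr
            prev = instr

    def __iter__(self):
        instr = self.first
        while instr is not None:
            # Read the link before yielding, so the caller may
            # unlink the instruction it was just handed.
            following = instr.next
            yield instr
            instr = following
```

The caller can then write "for instr in block:" with no callback at all.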

The upshot, aside from being able to iterate the IR, was that it's  
easy to create builders anywhere with just one overload:

     // At the start or end of a BB:
     Builder.new(At_end bb)
     Builder.new(bb.begin)

     // Before or after a given instruction:
     Builder.new(Before instr)
     Builder.new(instr.succ)
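
A toy Python version of a builder that takes a single position argument
could look like the following. Block, Instr, Before, AtEnd, emit and this
Builder are all illustrative assumptions; a real binding would call into
llvm-c instead of editing a Python list.

```python
from dataclasses import dataclass


class Block:
    """Toy basic block: an ordered list of instructions."""

    def __init__(self):
        self.instrs = []


class Instr:
    """Toy instruction that knows which block it lives in."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent


@dataclass
class Before:
    element: Instr


@dataclass
class AtEnd:
    parent: Block


class Builder:
    """Toy builder that cannot be created without a position."""

    def __init__(self, pos):
        self.position(pos)

    def position(self, pos):
        self.pos = pos

    @property
    def insert_block(self):
        # The position always knows its block, so none is passed in.
        if isinstance(self.pos, AtEnd):
            return self.pos.parent
        return self.pos.element.parent

    def emit(self, name):
        block = self.insert_block
        instr = Instr(name, block)
        if isinstance(self.pos, AtEnd):
            block.instrs.append(instr)
        else:
            # Insert directly in front of the anchor instruction.
            anchor = block.instrs.index(self.pos.element)
            block.instrs.insert(anchor, instr)
        return instr
```

For example, Builder(AtEnd(bb)).emit("ret") appends to bb, and
Builder(Before(ret)).emit("add") then places "add" in front of it.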

This is actually more succinct than C++ because unlike  
BasicBlock::iterator, the position always knows its parent element  
(it's either parent or element.parent), so there's no need to pass it  
in separately as in builder.position(block, instr). Also, the property
below could return a precise position rather than just the block:

> The current block is returned via the r/o property insert_block.


Finally, just as the C++ STL has reverse_iterator, it did prove  
necessary to have a separate (At_begin parent | After element) type in  
order to walk the IR backwards.

Cheers,
Gordon




