[LLVMdev] Python bindings available.
Gordon Henriksen
gordonhenriksen at mac.com
Sat May 10 09:27:30 PDT 2008
On May 10, 2008, at 05:44, Mahadevan R wrote:
> I'd like to announce the availability of Python bindings for LLVM.
>
> It is built over llvm-c, and currently exposes enough APIs to build an
> in-memory IR (and dump it!). It needs LLVM 2.3 latest and Python 2.5
> (2.4 should be sufficient, but I haven't tested). Tested only on
> Linux/i386.
>
> Would love to hear your comments.
>
> [Needless to say, it's all work in progress, but mostly it works as
> expected. More tests, documentation and APIs will follow.]
>
> It's all here: http://mdevan.nfshost.com/llvm-py.html
Hi Mahadevan,
Very nice! The OO syntax is pleasantly succinct. :)
> Constant.string(value, dont_null_terminate) -- value is a string
> Constant.struct(consts, packed) -- a struct, consts is a list of
> other constants, packed is boolean
I did this in Ocaml initially, but found the boolean constants pretty
confusing to read in code. I kept asking “What's that random true
doing there?” Therefore, the bindings expose these as const_string/
const_stringz and const_struct/const_packed_struct respectively. I
figure the user can always write her own in the (very) rare cases that
it is necessary to conditionalize such things:
    let const_string_maybez nullterm =
      if nullterm then const_stringz else const_string
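
For the Python side, the same naming idea might look roughly like the sketch below. The three functions are hypothetical stand-ins that only build byte strings; they are not llvm-py functions and make no llvm-c calls.

```python
# Hypothetical stand-ins: none of these names are part of llvm-py.

def const_string(value):
    """Return the bytes of `value` without a trailing NUL."""
    return value.encode("ascii")


def const_stringz(value):
    """Return the bytes of `value` followed by a trailing NUL."""
    return value.encode("ascii") + b"\x00"


def const_string_maybez(nullterm):
    """Pick a constructor at run time, mirroring the OCaml helper."""
    return const_stringz if nullterm else const_string


plain = const_string("hi")
terminated = const_stringz("hi")
chosen = const_string_maybez(False)("hi")
print(plain, terminated, chosen)
```

At the call site, const_stringz("hi") says what it does, whereas a trailing True or False argument has to be looked up.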
> Memory Buffer and Type Handles are not yet implemented.
:) Type handles in particular are very important. You can't form a
recursive type without using them, so you can't build any sort of data
structure.
> Builder wraps an llvm::IRBuilder object. It is created with the
> static method new (builder = Builder.new()).
Uninitialized builders are very dangerous (they leak instructions if
you use them), so you might want to add overloads for new in order to
avoid boilerplate code.
> It can be positioned using the methods position(block, instr=None),
> position_before(instr) and position_at_end(block).
There's an "IR navigator" concept you can implement to avoid writing
so many overloads here. It provides a complete "position" or
"iterator" concept. It's not entirely explicit in the C bindings—it
would be memory-inefficient if it were. But you can build it atop them
easily. It's useful whenever the C bindings have Before/AtEnd
functions, and you can implement it wherever you see First/Last/Next/
Prev functions. The C bindings support this for functions, global
variables, arguments, basic blocks, and instructions.
In Ocaml, we coded it up using a variant type, like (Before element |
At_end parent). The basic operations for forward iteration are
Parent.begin and Element.succ, which were implemented like this:
    Parent.begin =
      if this.first_element is null
        return At_end this
      else
        return Before this.first_element

    Element.succ =
      if this.next_element is null
        return At_end this.parent
      else
        return Before this.next_element
Then the user could build many IR navigation algorithms. The simplest
one, "for each", is thus:
    for_elements(parent, callback) =
      pos = parent.begin
      loop
        match pos with
        | At_end _ -> break
        | Before element ->
            callback(element)
            pos = element.succ

    for_elements(parent, do_my_thing)
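
A minimal Python sketch of the same position concept, assuming a toy linked container in place of real IR objects. Parent, Element, Before, AtEnd and every method name here are illustrative only; a real binding would sit on the First/Next functions of the C bindings instead of these Python attributes.

```python
class Before:
    """Position immediately before `element`."""
    def __init__(self, element):
        self.element = element


class AtEnd:
    """Position after the last element of `parent`."""
    def __init__(self, parent):
        self.parent = parent


class Element:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.next_element = None

    def succ(self):
        # The position following this element.
        if self.next_element is None:
            return AtEnd(self.parent)
        return Before(self.next_element)


class Parent:
    def __init__(self, elements):
        self.first_element = elements[0] if elements else None
        # Link each element to its parent and to its successor.
        for elem, nxt in zip(elements, elements[1:] + [None]):
            elem.parent = self
            elem.next_element = nxt

    def begin(self):
        # The first position in this parent.
        if self.first_element is None:
            return AtEnd(self)
        return Before(self.first_element)


def for_elements(parent, callback):
    pos = parent.begin()
    while isinstance(pos, Before):
        callback(pos.element)
        pos = pos.element.succ()


names = []
block = Parent([Element("a"), Element("b"), Element("c")])
for_elements(block, lambda elem: names.append(elem.name))
print(names)
```

Every position carries either the element or the parent, so an empty parent still yields a usable AtEnd position.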
This representation was idiomatic in a functional language because
it's compatible with recursion (you can translate for_elements into a
tail recursive loop), but perhaps an enumerator class would be more
idiomatic in Python:
    for_elements(parent, callback) =
      pos = parent.begin
      while pos.has_next()
        callback(pos.next())
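
In Python the enumerator would most likely be spelled with the iterator protocol, so a plain for loop works. A rough sketch, assuming toy Node and Container classes (both hypothetical, standing in for an instruction and a basic block):

```python
class Node:
    """Toy element with a link to its successor."""
    def __init__(self, name, next_element=None):
        self.name = name
        self.next_element = next_element


class Container:
    """Toy parent that is iterable over its elements."""
    def __init__(self, first_element=None):
        self.first_element = first_element

    def __iter__(self):
        node = self.first_element
        while node is not None:
            # Read the successor before yielding, so the loop still
            # advances if the caller changes node.next_element.
            nxt = node.next_element
            yield node
            node = nxt


c = Node("c")
b = Node("b", c)
a = Node("a", b)
seen = [node.name for node in Container(a)]
print(seen)
```

A generator keeps the traversal state itself, so no has_next/next pair is needed in user code.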
The upshot, aside from being able to iterate the IR, was that it's
easy to create builders anywhere with just one overload:
    // At the start or end of a BB:
    Builder.new(At_end bb)
    Builder.new(bb.begin)

    // Before or after a given instruction:
    Builder.new(Before instr)
    Builder.new(instr.succ)
This is actually more succinct than C++ because unlike
BasicBlock::iterator, the position always knows its parent element
(it's either parent or element.parent), so there's no need to pass it
in separately as in builder.position(block, instr). Also, this property
could return a precise position instead of just the block:
> The current block is returned via the r/o property insert_block.
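
To illustrate, here is a sketch of a builder whose constructor takes a single position. It assumes a list-backed toy Block; Builder, Block, Instr, succ and insert are all hypothetical names, not the llvm-py API, and nothing here calls llvm-c.

```python
from collections import namedtuple

Before = namedtuple("Before", "element")
AtEnd = namedtuple("AtEnd", "parent")


class Instr:
    def __init__(self, name):
        self.name = name
        self.parent = None


class Block:
    def __init__(self):
        self.instrs = []

    def begin(self):
        return Before(self.instrs[0]) if self.instrs else AtEnd(self)


def succ(instr):
    """Position just after `instr` within its parent block."""
    instrs = instr.parent.instrs
    i = instrs.index(instr) + 1
    return Before(instrs[i]) if i < len(instrs) else AtEnd(instr.parent)


class Builder:
    def __init__(self, pos):
        # One constructor covers start, end, before and after.
        self.pos = pos

    @property
    def insert_block(self):
        # The parent is always recoverable from the position itself.
        if isinstance(self.pos, AtEnd):
            return self.pos.parent
        return self.pos.element.parent

    def insert(self, name):
        block = self.insert_block
        instr = Instr(name)
        instr.parent = block
        if isinstance(self.pos, AtEnd):
            block.instrs.append(instr)
        else:
            block.instrs.insert(block.instrs.index(self.pos.element), instr)
        return instr


bb = Block()
ret = Builder(AtEnd(bb)).insert("ret")      # bb: ret
Builder(bb.begin()).insert("alloca")        # bb: alloca, ret
add = Builder(Before(ret)).insert("add")    # bb: alloca, add, ret
Builder(succ(add)).insert("mul")            # bb: alloca, add, mul, ret
order = [instr.name for instr in bb.instrs]
print(order)
```

Because the position knows its parent, insert_block needs no extra state, and the builder could just as well expose self.pos directly.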
Finally, just as the C++ STL has reverse_iterator, it did prove
necessary to have a separate (At_begin parent | After element) type in
order to walk the IR backwards.
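
The backward walk mirrors the forward one. A sketch under the same toy assumptions: the names end, pred and rev_for_elements are chosen here for illustration, and the classes are stand-ins rather than binding objects.

```python
from collections import namedtuple

AtBegin = namedtuple("AtBegin", "parent")
After = namedtuple("After", "element")


class Element:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.prev_element = None

    def pred(self):
        # The position preceding this element.
        if self.prev_element is None:
            return AtBegin(self.parent)
        return After(self.prev_element)


class Parent:
    def __init__(self, elements):
        self.last_element = elements[-1] if elements else None
        prev = None
        for elem in elements:
            elem.parent = self
            elem.prev_element = prev
            prev = elem

    def end(self):
        # The last position in this parent.
        if self.last_element is None:
            return AtBegin(self)
        return After(self.last_element)


def rev_for_elements(parent, callback):
    pos = parent.end()
    while isinstance(pos, After):
        callback(pos.element)
        pos = pos.element.pred()


reversed_names = []
block = Parent([Element("a"), Element("b"), Element("c")])
rev_for_elements(block, lambda elem: reversed_names.append(elem.name))
print(reversed_names)
```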
Cheers,
Gordon