[LLVMdev] Python bindings available.

Mon May 12 09:00:48 PDT 2008

On Mon, May 12, 2008 at 7:55 PM, Gordon Henriksen
<gordonhenriksen at mac.com> wrote:
> On May 12, 2008, at 02:58, Mahadevan R wrote:
>
>  >> Consider the case where a function creates and populates a Module,
>  >> stuffs it in an ExistingModuleProvider for the JIT, then returns
>  >> the ModuleProvider, dropping direct reference to the Module.
>  >> (ModuleProvider takes ownership of the Module.) I presume that your
>  >> Python object is under the impression it owns the Module; when that
>  >> goes out of scope, its refcount goes to zero and it invokes its
>  >> dtor, disposing of the Module. D'oh— now the ModuleProvider has a
>  >> dangling pointer. :)
>  >
>  > Ah. Good one. Would the following fix it?
>  >
>  > 1) Have ModuleProvider maintain a reference to the Module it owns,
>  >     so that the ref count is at least 1 at any time. This is easily
>  > done.
>  >     The only thing left is when an MP goes away, the module's dtor
>  >     will be called first, deleting the module, then the MP's dtor will
>  >     be called, which will try to delete the same module again.
>  >
>  > 2a) So either we can prevent the actual destruction of modules that
>  >    are attached to MPs, or
>  >
>  > 2b) Do not do anything in the dtors of MPs (while letting the dtor of
>  >    modules do the work)
>
>  The ModuleProvider itself also needs to be destroyed or you have a
>  memory leak.
>
>
>  > Both options have the disadvantage of assuming the C/C++
>  > implementation (like MP::dtor deletes only the module and nothing
>  > else).
>  >
>  >> The routine LLVMModuleRef LLVMGetGlobalParent(LLVMValueRef Global);
>  >> poses a related problem; in this case, the returned reference is
>  >> non-owning, so you must not dtor it from Python.
>  >
>  > If I do this:
>  >
>  >  m1 = Module.new()
>  >  g1 = m1.add_global_variable(ty, "name")
>  >  m2 = g1.module
>  >
>  > will the LLVMModuleRef pointer returned in the last call be the
>  > same as that of m1? If so probably we can get "g1.module" to
>  > return the original object itself.
>
>  It will, so that's a possibile.
>
>
>  >> The fix, of course, is providing a dispose routine and requiring
>  >> the user to call it, since you can't know what they've done with
>  >> the pointer.
>  >
>  > It'd be much easier to use it without an explicit destruction call.
>  > I'd prefer to do it only if there's absolutely no other go.
>
>
>  It's certainly the easiest option. The thing to remember is that C++
>  won't play with your garbage collector at all.
>
>  The only other option I can see is to
>
>   1. Add an 'owns pointer?' flag to each Python wrapper object.
>   2. Update that flag dynamically as the value is used, in accordance
>  with the C++ ownership semantics.
>   3. Maintain a LLVMFooRef -> Python object table.
>
>  So you would reset the 'owns pointer?' flag on m if the user calls
>  ModuleProvider.new(m).
>
>  The table is necessary, else mp = ModuleProvider.new(m2) [using your
>  example fragment above] would trigger a double free by mp and m1.
>  Python users are probably not performance sensitive to routing lookups
>  through the dictionary, so this might be quite workable.
>
>  But even with all that work, you'll still have the possibility of
>  surprising results:
>
>    def f(m):
>      mp = ModuleProvider.new(m)
>      # do something useful?
>
>    m = Module.new("")
>    f(m)
>    print(m.name)  # Segfault, heap corruption, etc.

Riiight. Let's try again:

1) The MP dtor does a no-op (deletes self, but not the module it owns)

2) The module has a 'provider' attribute, initially None.

3) When a MP is created for a module, it sets module.provider to itself.

4) MP itself has a reference to the module it owns.

Essentially, this creates a mutual reference between the two objects
and both remain active untill *all* references to *both* the objects
go away from the user code.

It does appear that Python collects these properly.

Regards,
-Mahadevan.