[LLVMdev] llvm.gcroot suggestion

Talin viridia at gmail.com
Mon Mar 7 09:35:51 PST 2011


On Mon, Mar 7, 2011 at 4:08 AM, nicolas geoffray
<nicolas.geoffray at gmail.com> wrote:

> Hi Talin,
>
> On Sat, Mar 5, 2011 at 6:42 PM, Talin <viridia at gmail.com> wrote:
>>
>>
>> So I've been thinking about your proposal, that of using a special address
>> space to indicate garbage collection roots instead of intrinsics.
>
>
>  Great!
>
>
>>
>> To address this, we need a better way of telling LLVM that a given
>> variable is no longer a root.
>>
>
> Live variable analysis is already in LLVM and for me that's enough to know
> whether a given variable is no longer a root. Note that each safe point has
> its own set of root locations, and these locations all contain live
> variables. Dead variables may still be in a register or on the stack, but
> the GC will not visit them.
>
>
>> 2) As I mentioned, my language supports tagged unions and other "value"
>> types. Another example is a tuple type, such as (String, String). Such types
>> are never allocated on the heap by themselves, because they don't have the
>> object header structure that holds the type information needed by the
>> garbage collector. Instead, these values can live in SSA variables, or in
>> allocas, or they can be embedded inside larger types which do live on the
>> heap.
>>
>
> If you know, at compile-time, whether you are dealing with a struct or a
> heap, what prevents you from emitting code that won't need such tagged
> unions in the IR? Same for structs: if they contain pointers to heap
> objects, those will be in that special address space.
>

I'm not sure what you mean by this.

Take for example a union of a String (which is a pointer) and a float. The
union is either { i1, String* } or { i1, float }. The garbage collector
needs to see that i1 in order to know whether the second field of the struct
is a pointer. If it attempted to dereference the second field as a pointer
when it actually contains a float, the program would crash. The metadata
argument that I pass to llvm.gcroot informs the garbage collector about the
structure of the union.
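
A minimal sketch of the idea in C, purely for illustration. The names
StringOrFloat, VisitFn, trace_string_or_float and count_visit are hypothetical
and are not taken from any real compiler or from LLVM; the sketch only shows
why a trace routine has to read the discriminator before it treats the
second field as a heap reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical heap object; stands in for the String type above. */
typedef struct String { size_t length; } String;

/* C analogue of the union: { i1, String* } or { i1, float }. */
typedef struct {
    bool is_pointer;        /* plays the role of the i1 discriminator */
    union {
        String *str;        /* meaningful only when is_pointer is true  */
        float   num;        /* meaningful only when is_pointer is false */
    } payload;
} StringOrFloat;

/* Callback a collector might use to mark or relocate a heap reference. */
typedef void (*VisitFn)(String **slot, void *ctx);

/* Trace one root: consult the tag before touching the payload as a pointer. */
static void trace_string_or_float(StringOrFloat *root, VisitFn visit, void *ctx) {
    assert(root != NULL && visit != NULL);
    if (root->is_pointer && root->payload.str != NULL) {
        visit(&root->payload.str, ctx);
    }
    /* Float case: the payload bits are not an address, so they are skipped. */
}

/* Example visitor: counts how many references would be followed. */
static void count_visit(String **slot, void *ctx) {
    (void)slot;
    ++*(int *)ctx;
}
```

Without the tag check, the collector would interpret the bit pattern of a
float as an address, which is the crash scenario described above.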

>
> 3) I've been following the discussions on llvm-dev about the use of the
>> address-space property of pointers to signal different kinds of memory pools
>> for things like shared address spaces. If we try to use that same variable
>> to indicate garbage collection, now we have to multiplex both meanings onto
>> the same field. We can't just dedicate one special ID for the garbage
>> collected heap, because there could be multiple such heaps. As you add
>> additional orthogonal meanings to the address-space field, you end up with a
>> combinatorial explosion of possible values for it.
>>
>>
> I think some conventions already exist between an ID and some codegen.
> Having one more seems fine to me, even if you need to play with bits
> in case you need different IDs for a single pointer.
>
> I'm also fine with the intrinsic way of declaring a GC root. But I think it
> is cumbersome, and error-prone in the presence of optimizers that may try to
> move away that intrinsic (I remember similar issues with the current EH
> intrinsics).
>
> Nicolas
>
>
>> --
>> -- Talin
>>
>
>


-- 
-- Talin