[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.

nicolas geoffray nicolas.geoffray at gmail.com
Sat Sep 25 10:51:12 PDT 2010


I didn't have unions in mind; you do indeed need some kind of static
information in that case. Since the GC infrastructure in LLVM gets so little
love, I think it's good if you can improve it in any way, including by
defining new interfaces.

Cheers,
Nicolas

On Sat, Sep 25, 2010 at 6:38 PM, Talin <viridia at gmail.com> wrote:

> On Sat, Sep 25, 2010 at 1:04 AM, nicolas geoffray <
> nicolas.geoffray at gmail.com> wrote:
>
>> Hi Talin,
>>
>> On Sat, Sep 25, 2010 at 4:18 AM, Talin <viridia at gmail.com> wrote:
>>>
>>>
>>> Many languages support the notion of a "value type". Value types are
>>> always passed by value, unlike reference types which are always passed by
>>> pointer. An example is the "struct" type in C#. Another example is a "tuple"
>>> type. A value type which is a local variable lives on the stack as an
>>> alloca, not on the heap. When a function is called with a value type as
>>> argument, the callee gets its own copy of the argument, rather than sharing
>>> a pointer with the caller.
>>>
>>
>> Yes.
>>
>>
>>>
>>> Value types are represented in LLVM using structs, and may contain
>>> pointer fields which need to be traced.
>>>
>>>
>> Yes.
>>
>>
>>> The way that I handle non-pointer types is to generate an array of field
>>> offsets (containing the offset of each pointer field within the struct) as
>>> the metadata argument to llvm.gcroot. This meta argument is then processed
>>> in my GCStrategy, where I add the stack root offset to the offsets in the
>>> field offset array, which yields the stack offsets of the actual pointers in
>>> the call frame.
>>>
>>>
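A minimal sketch of the offset arithmetic described above, written in C++ rather than as a real GCStrategy. The names `FieldOffsets` and `pointerSlots` are hypothetical and do not come from LLVM or from Talin's compiler; the sketch only assumes that the metadata is an array of byte offsets and that the root's position in the frame is known as a signed offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-type metadata: byte offsets of the pointer fields
// inside one value-type struct (the "field offset array").
struct FieldOffsets {
    const std::size_t* offsets;
    std::size_t count;
};

// Adds the frame offset of the root (the alloca holding the struct) to each
// field offset, yielding the frame-relative offset of every traced pointer.
std::vector<std::ptrdiff_t> pointerSlots(std::ptrdiff_t rootFrameOffset,
                                         const FieldOffsets& fields) {
    assert(fields.count == 0 || fields.offsets != nullptr);
    std::vector<std::ptrdiff_t> slots;
    slots.reserve(fields.count);
    for (std::size_t i = 0; i < fields.count; ++i) {
        slots.push_back(rootFrameOffset +
                        static_cast<std::ptrdiff_t>(fields.offsets[i]));
    }
    return slots;
}
```

For example, a struct rooted at frame offset -48 with pointer fields at byte offsets 8 and 24 would have its pointers at frame offsets -40 and -24.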
>>
>> Did you consider the alternative of calling llvm.gcroot on the pointers in
>> this struct? This requires changing the verifier to support non-alloca
>> pointers in llvm.gcroot, but it makes the solution more general and cleaner:
>> pointers given to llvm.gcroot only point to objects in the heap.
>>
>> I think that, originally, the purpose of the second argument of
>> llvm.gcroot was to emit static type information.
>>
>
> Let me give you a more complicated example to see why this won't work:
>
> Imagine I have a discriminated union type, whose type declaration looks
> like this:
>
>    var x:int or String.
>
> The variable 'x' can be either an integer or a reference to a string
> object. In LLVM assembly, this data structure is represented by the
> following struct:
>
>    { i1, String * }
>
> The 'i1' field (the 'discriminator') is used to determine what kind of value
> is currently stored in the union. If it's 0, then it's an int, and the
> structure will be cast to { i8, int } before extracting the value. If it's
> 1, then it's a String pointer. The compiler does not allow access to the
> wrong type - if the value is 0, the language does not allow you to extract
> the value as a String.
>
> Now, suppose we declare this as a local variable, so the union struct is
> contained within an alloca. We want to declare the String pointer as a root,
> but only if the discriminator is not 0. We can't determine this at compile
> time; instead, the collector has to be smart enough to examine the union and
> determine whether it contains a pointer or not.
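The runtime check described above can be sketched in C++ as follows. This is an illustration only: `IntOrString` and `rootSlot` are hypothetical names, and the layout (a one-byte tag followed by a pointer-sized payload) is an assumption rather than the compiler's actual representation.

```cpp
#include <cassert>
#include <cstdint>

struct String;  // opaque heap object standing in for the language's String

// Hypothetical C++ mirror of the union: a tag plus one payload slot.
struct IntOrString {
    std::uint8_t tag;  // 0 = int, 1 = String pointer
    union {
        std::intptr_t asInt;
        String* asString;
    } payload;
};

// Returns the address of the slot the collector should trace, or nullptr
// when the payload currently holds an integer and must not be traced.
String** rootSlot(IntOrString* x) {
    assert(x != nullptr);
    return x->tag != 0 ? &x->payload.asString : nullptr;
}
```

The decision depends on the value of the tag at collection time, which is why it cannot be encoded as a fixed list of roots at compile time.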
>
> In my compiler, what I do is to generate a callback function that can trace
> the object. This callback function is contained within a data structure that
> is passed as the metadata argument to llvm.gcroot.
>
> So my code looks like this (bit casts omitted for simplicity):
>
>     %int_or_string = type { i8, String * }
>     %x = alloca %int_or_string
>     call void @llvm.gcroot(i8** %x, i8* @.tracetable.int_or_string)
>
> Where '.tracetable.int_or_string' is the static type information for the
> "int or string" type, containing both the field offsets and the callback
> function to test the value of the discriminator.
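One way such a trace table could look is sketched below in C++. The message does not show the real table's layout, so `TraceTable`, `traceRoot`, `traceIntOrString`, and `collectSlot` are all hypothetical: the sketch assumes a table holding an array of static field offsets plus an optional callback for fields whose tracing depends on runtime state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Called by the tracer for every slot that holds a heap pointer.
using Visitor = void (*)(void** slot, void* context);

// Hypothetical layout of the static type information passed to llvm.gcroot.
struct TraceTable {
    const std::size_t* offsets;  // offsets of unconditionally traced pointers
    std::size_t count;
    // Optional callback for fields that are pointers only some of the time.
    void (*traceDynamic)(void* base, Visitor visit, void* context);
};

// Traces one stack root: 'base' is the address of the whole alloca.
void traceRoot(void* base, const TraceTable& table, Visitor visit,
               void* context) {
    assert(base != nullptr);
    char* bytes = static_cast<char*>(base);
    for (std::size_t i = 0; i < table.count; ++i) {
        visit(reinterpret_cast<void**>(bytes + table.offsets[i]), context);
    }
    if (table.traceDynamic != nullptr) {
        table.traceDynamic(base, visit, context);
    }
}

// Assumed in-memory shape of the "int or String" union.
struct IntOrStringSlot {
    unsigned char tag;  // 0 = int, nonzero = String pointer
    void* payload;
};

// Callback that tests the discriminator before visiting the payload.
void traceIntOrString(void* base, Visitor visit, void* context) {
    IntOrStringSlot* value = static_cast<IntOrStringSlot*>(base);
    if (value->tag != 0) {
        visit(&value->payload, context);
    }
}

// No static offsets; everything is decided by the callback.
const TraceTable kIntOrStringTable = {nullptr, 0, &traceIntOrString};

// Example visitor: records each visited slot in a std::vector<void**>.
void collectSlot(void** slot, void* context) {
    static_cast<std::vector<void**>*>(context)->push_back(slot);
}
```

Because the callback receives the base address of the whole struct, it can read the discriminator, which would not be possible if only the pointer field had been declared as the root.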
>
> Note that if I only declared the pointer as a root, then this wouldn't work
> - the collector needs access to the entire data structure in order to trace
> the object correctly.
>
> Also, I think this is the right solution - llvm.gcroot is only responsible
> for the offset of the base of the alloca, not for any of its internal
> structure, which is the responsibility of the compiler and the GCStrategy.
>
>
>> Nicolas
>>
>>
>>
>>> It's all pretty simple really.
>>>
>>>
>>>>
>>>> Nicolas
>>>>
>>>>  On Fri, Sep 24, 2010 at 7:00 PM, Chris Lattner <clattner at apple.com> wrote:
>>>>
>>>>>  On Sep 22, 2010, at 8:52 AM, Talin wrote:
>>>>> > I'm moving this thread to llvm-dev in the hopes of reaching a wider
>>>>> audience.
>>>>> >
>>>>> > This patch relaxes the restriction on llvm.gcroot so that it can work
>>>>> with non-pointer allocas. The only changes are to Verifier.cpp - it appears
>>>>> from my testing that llvm.gcroot always worked fine with non-pointer
>>>>> allocas, except that the verifier wouldn't allow it. I've used this patch to
>>>>> build an efficient stack crawler (an alternative to shadow-stack that uses
>>>>> only static constant data structures.)
>>>>> >
>>>>> > Here's a deal: If you accept this patch, I'll write up an extensive
>>>>> tutorial on how to write a stack crawler like mine. (Actually, it's already
>>>>> written; however, without this patch the tutorial doesn't make any sense.)
>>>>>
>>>>> Hi Talin,
>>>>>
>>>>> I don't think anyone is really using the GC support, other than Nicolas
>>>>> in VMKit.  If he's ok with the change, I am too.  Please make sure the dox
>>>>> stay up to date though.
>>>>>
>>>>> -Chris
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -- Talin
>>>
>>
>>
>
>
> --
> -- Talin
>