[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.

nicolas geoffray nicolas.geoffray at gmail.com
Sat Sep 25 23:37:31 PDT 2010


Yes, it's definitely OK. In the future, I think the verifier will also be
changed to support non-allocas in llvm.gcroot.

Nicolas

On Sat, Sep 25, 2010 at 11:51 PM, Talin <viridia at gmail.com> wrote:

> On Sat, Sep 25, 2010 at 10:51 AM, nicolas geoffray <
> nicolas.geoffray at gmail.com> wrote:
>
>> I didn't have unions in mind - indeed you need some kind of static
>> information in such a case. The GC infrastructure in LLVM having so little
>> love, I think it is good if you can improve it in any way, as well as
>> defining new interfaces.
>
>
> So the patch is OK then? All it does is change the verifier -- llvm.gcroot
> already has the ability to do this, it's just that the verifier wouldn't
> allow it.
>
>>
>> Cheers,
>> Nicolas
>>
>>
>> On Sat, Sep 25, 2010 at 6:38 PM, Talin <viridia at gmail.com> wrote:
>>
>>> On Sat, Sep 25, 2010 at 1:04 AM, nicolas geoffray <
>>> nicolas.geoffray at gmail.com> wrote:
>>>
>>>> Hi Talin,
>>>>
>>>> On Sat, Sep 25, 2010 at 4:18 AM, Talin <viridia at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Many languages support the notion of a "value type". Value types are
>>>>> always passed by value, unlike reference types which are always passed by
>>>>> pointer. An example is the "struct" type in C#. Another example is a "tuple"
>>>>> type. A value type which is a local variable lives on the stack as an
>>>>> alloca, not on the heap. When a function is called with a value type as
>>>>> argument, the callee gets its own copy of the argument, rather than sharing
>>>>> a pointer with the caller.
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>
>>>>>
>>>>> Value types are represented in LLVM using structs, and may contain
>>>>> pointer fields which need to be traced.
>>>>>
>>>>>
>>>> Yes.
>>>>
>>>>
>>>>> The way that I handle non-pointer types is to generate an array of
>>>>> field offsets (containing the offset of each pointer field within the
>>>>> struct) as the metadata argument to llvm.gcroot. This meta argument is then
>>>>> processed in my GCStrategy, where I add the stack root offset to the offsets
>>>>> in the field offset array, which yields the stack offsets of the actual
>>>>> pointers in the call frame.
>>>>>
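The offset arithmetic described above can be sketched roughly as follows. This is only an illustration, not the actual GCStrategy code: the helper name pointerSlotOffsets and the use of std::vector are assumptions made for the example, and nothing here is an LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LLVM): given the frame offset of a
// gcroot'ed alloca and the per-type array of pointer-field offsets that was
// passed as the llvm.gcroot metadata, compute the frame offset of every
// pointer slot inside that alloca.
std::vector<std::ptrdiff_t> pointerSlotOffsets(
    std::ptrdiff_t rootOffset,
    const std::vector<std::size_t>& fieldOffsets) {
  std::vector<std::ptrdiff_t> slots;
  slots.reserve(fieldOffsets.size());
  for (std::size_t fieldOffset : fieldOffsets) {
    // Stack root offset + offset of the field within the struct.
    slots.push_back(rootOffset + static_cast<std::ptrdiff_t>(fieldOffset));
  }
  return slots;
}
```

For instance, a struct rooted at frame offset -48 with pointer fields at byte offsets 8 and 24 would yield the slot offsets -40 and -24.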
>>>>>
>>>>
>>>> Did you think of the alternative of calling llvm.gcroot on pointers in
>>>> this struct? This requires changing the verifier to support non-alloca
>>>> pointers in llvm.gcroot, but it makes the solution more general and cleaner:
>>>> pointers given to llvm.gcroot only point to objects in the heap.
>>>>
>>>> I think that, originally, the purpose of the second argument of
>>>> llvm.gcroot was to emit static type information.
>>>>
>>>
>>> Let me give you a more complicated example to see why this won't work:
>>>
>>> Imagine I have a discriminated union type, whose type declaration looks
>>> like this:
>>>
>>>    var x:int or String.
>>>
>>> The variable 'x' can be either an integer or a reference to a string
>>> object. In LLVM assembly, this data structure is represented by the
>>> following struct:
>>>
>>>    { i1, String * }
>>>
>>> The 'i1' field (the 'discriminator') is used to determine what kind of
>>> value is currently stored in the union. If it's 0, then it's an int, and the
>>> structure will be cast to { i8, int } before extracting the value. If it's
>>> 1, then it's a String pointer. The compiler does not allow access to the
>>> wrong type - if the value is 0, the language does not allow you to extract
>>> the value as a String.
>>>
>>> Now, suppose we declare this as a local variable, so the union struct is
>>> contained within an alloca. We want to declare the String pointer as a root,
>>> but only if the discriminator is not 0. We can't determine this at compile
>>> time; instead, the collector has to be smart enough to examine the union and
>>> determine whether it contains a pointer or not.
>>>
>>> In my compiler, what I do is to generate a callback function that can
>>> trace the object. This callback function is contained within a data
>>> structure that is passed as the metadata argument to llvm.gcroot.
>>>
>>> So my code looks like this (bit casts omitted for simplicity):
>>>
>>>     %int_or_string = type { i8, String * }
>>>     %x = alloca %int_or_string
>>>     call void @llvm.gcroot(i8** %x, i8* @.tracetable.int_or_string)
>>>
>>> Where '.tracetable.int_or_string' is the static type information for the
>>> "int or string" type, containing both the field offsets and the callback
>>> function to test the value of the discriminator.
>>>
>>> Note that if I only declared the pointer as a root, then this wouldn't
>>> work - the collector needs access to the entire data structure in order to
>>> trace the object correctly.
>>>
>>> Also, I think this is the right solution - llvm.gcroot is only
>>> responsible for the offset of the base of the alloca, not for any of its
>>> internal structure, which is the responsibility of the compiler and the
>>> GCStrategy.
>>>
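A trace callback of the kind described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the names TraceTable, SlotVisitor, traceIntOrString and collectSlot are hypothetical, the C++ struct is only assumed to mirror the { i8, String * } layout, and the table here carries just the callback, whereas the table described in the message also holds field offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the language's heap-allocated String object.
struct String {};

// Assumed to mirror the { i8, String * } union struct from the message.
struct IntOrString {
  std::uint8_t discriminator;  // 0 = holds an int, 1 = holds a String pointer
  String* str;                 // only a valid pointer when discriminator == 1
};

// Hypothetical visitor: the collector supplies this to receive each root slot.
using SlotVisitor = void (*)(void** slot, void* context);

// Hypothetical shape of the static data passed as the gcroot metadata.
struct TraceTable {
  void (*trace)(void* object, SlotVisitor visit, void* context);
};

// Reports the String slot as a root only when the union currently holds one.
void traceIntOrString(void* object, SlotVisitor visit, void* context) {
  auto* value = static_cast<IntOrString*>(object);
  if (value->discriminator != 0) {
    visit(reinterpret_cast<void**>(&value->str), context);
  }
}

// Static, constant table, analogous to @.tracetable.int_or_string.
const TraceTable kIntOrStringTable = {&traceIntOrString};

// Example visitor that simply records every slot it is handed.
void collectSlot(void** slot, void* context) {
  static_cast<std::vector<void**>*>(context)->push_back(slot);
}
```

The point of the design is that the collector receives the address of the whole alloca, so the callback can inspect the discriminator before deciding whether the second field is a pointer at all.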
>>>
>>>> Nicolas
>>>>
>>>>
>>>>
>>>>> It's all pretty simple really.
>>>>>
>>>>>
>>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>> On Fri, Sep 24, 2010 at 7:00 PM, Chris Lattner <clattner at apple.com> wrote:
>>>>>>
>>>>>>>  On Sep 22, 2010, at 8:52 AM, Talin wrote:
>>>>>>> > I'm moving this thread to llvm-dev in the hopes of reaching a wider
>>>>>>> audience.
>>>>>>> >
>>>>>>> > This patch relaxes the restriction on llvm.gcroot so that it can
>>>>>>> work with non-pointer allocas. The only changes are to Verifier.cpp - it
>>>>>>> appears from my testing that llvm.gcroot always worked fine with non-pointer
>>>>>>> allocas, except that the verifier wouldn't allow it. I've used this patch to
>>>>>>> build an efficient stack crawler (an alternative to shadow-stack that uses
>>>>>>> only static constant data structures.)
>>>>>>> >
>>>>>>> > Here's a deal: If you accept this patch, I'll write up an extensive
>>>>>>> tutorial on how to write a stack crawler like mine. (Actually, it's already
>>>>>>> written, however without this patch the tutorial doesn't make any sense.)
>>>>>>>
>>>>>>> Hi Talin,
>>>>>>>
>>>>>>> I don't think anyone is really using the GC support, other than
>>>>>>> Nicolas in VMKit.  If he's ok with the change, I am too.  Please make sure
>>>>>>> the dox stay up to date though.
>>>>>>>
>>>>>>> -Chris
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Talin
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -- Talin
>>>
>>
>>
>
>
> --
> -- Talin
>