[LLVMdev] Garbage collection

Gordon Henriksen gordonhenriksen at me.com
Fri Feb 27 18:52:21 PST 2009


On 2009-02-27, at 19:58, Jon Harrop wrote:

> On Friday 27 February 2009 18:42:13 Gordon Henriksen wrote:
>> I agree this could be better. I think it would be prudent of you,
>> being aware of this problem, to structure your compiler so as to  
>> limit
>> the number of pieces of code which would be effected when you switch
>> to a copying collector.
>
> I think that would make my VM a lot more complicated for no clear  
> practical
> gain.
>
>> I order to address it thoroughly, the LLVM GC infrastructure needs to
>> track register roots so that it doesn't need to conservatively reload
>> from the stack so frequently.
>
> F# suffers because null references are typeless on .NET but you want  
> to use
> typed null references to represent the empty list and the None  
> option type
> constructor and many other things. The F# team cannot fix their VM  
> so they
> were forced to settle on an inefficient implementation of lists in a  
> language
> where lists are ubiquitous and an implementation of the option type  
> that is
> efficient but does not pretty print correctly due to insufficient type
> information.
>
> I can and did fix my VM by representing reference types as an  
> aggregate
> containing a pointer to shared data (the type) and a pointer to  
> individual
> data (the value). Thus you can have a typed null pointer by having a  
> pointer
> to a valid type and a null pointer for the individual data.
>
> However, that means my roots are now aggregate values. So your  
> proposed GC
> infrastructure for LLVM would need to be able to mark aggregate  
> values as
> roots as well as individual values. Moreover, even if we went  
> through all
> this work of improving LLVM to support this GC infrastructure and  
> altering my
> VM to traverse the system stack instead of my current shadow stack,  
> I am not
> even convinced it would be a significant improvement over what I  
> already
> have.

I agree you likely won't see any significant overhead reduction vs.  
your current implementation unless you adopt a copying collector.

>> This would likely entail a change to the
>> IR (either a 'GC' address space as Chris suggests, a new intrinsic to
>> 'tag' a value as a GC pointer, or even a change to the Type
>> hierarchy)
>
> Can you elaborate on a changing of the type hierarchy?

I've considered introducing a separate GCPointer type.

>> --another reason to isolate GC-handling code in your compiler.
>
> I just cannot see how the GC-handling code can be isolated in such a  
> way that
> the result is better than just having separate HLVMs for separate  
> design
> decisions.


I simply suggest you group functions to 'emit read barrier', 'allocate  
GC root', and soforth together in your compiler such that they can be  
easily changed or updated in one place if it proves necessary.

— Gordon





More information about the llvm-dev mailing list