[LLVMdev] Garbage Collection Project
Jon Harrop
jon at ffconsultancy.com
Thu Jun 18 06:04:08 PDT 2009
On Thursday 18 June 2009 11:14:35 Granville Barnett wrote:
> > Firstly, rather than using a single 1 word pointer to represent a
> > reference I
> > chose to use 3 words including a pointer to the type and a pointer to the
> > value (as well as metadata). This allows typed nulls and that addresses
> > an important deficiency found in most other VMs including the CLR. Is
> > Scarcity able to handle such references or does its implementation of
> > stack frames require references to be a single word?
>
> Three words sounds pretty expensive to me, I can see the use of an extra
> word for typed nulls.
The word is not really "extra" because I just moved the header out of the
value and into the reference. For example, the three words for an array are a
pointer to the type, the array length and a pointer to the (C-compatible)
array data. In addition to providing typed nulls, this has the added
advantage of simplifying interop.
> If you look at something like the CLR you will see
> that you have very fast access to an objects type, after all when you have
> a reference to an object what you really have is a reference to an objects'
> object header. From there you can access an objects syncblock (if it has
> one, which is before the obj header) and the objects' method table which
> includes the relevant pointers to the objects' type among other things. It
> simply means you do a few hops, each of which is a constant operation
> anyway.
That is similar to the approach I used, although HLVM provides a pointer
directly to the type, saving you a single hop.
> Maybe I'm missing something key about the language you are implementing the
> GC for, also is it really necessary to use an extra word for null types?
I would not have considered it had I not seen the damage done to F# by the
CLR's typeless nulls. You want an unallocated constant for many different
purposes in F#, such as representing the empty option type None, the empty
list [] and maybe the unit value (). Unfortunately, using "null" means you
don't get any type information and that means that none of those values work
correctly with anything requiring run-time type information such as generic
printing, serialization and so on. The only alternative is to use an
allocated value but the allocator and (concurrent) GC are very slow so that
gives you the functionality you want but only at a grave cost in terms of
performance.
Consequently, they chose to represent the empty list with an allocated value
(so [] prints correctly) but the option type uses null. Hence printf "%A"
None prints "<null>". They've also used other tricks like having an internal
set representation that uses nulls but is wrapped in another representation
that handles them correctly but only at the cost of an extra level of
indirection.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
More information about the llvm-dev
mailing list