[LLVMdev] Accurate garbage collection

Jon Harrop jonathandeanharrop at googlemail.com
Fri Dec 16 12:06:50 PST 2011


David Given wrote:
> Are there any standalone accurate garbage collectors that I could use in my
> project, rather than having to write me own (or use libgc, which is what I'm
> doing now)? Garbage collectors are subtle and very tricky and I really don't want
> to have to do one myself, as I know I'll just get it wrong.

On the contrary, writing your own GC is both easy and enlightening. You'll need:

1. Your own shadow stack: an array of pointers to blocks in the heap (those that are directly reachable).
2. The allocated blocks: another array of pointers to heap blocks (all those that have been allocated).
3. A visit stack: yet another array of pointers to heap blocks (those waiting to be marked).
4. The set of marked heap blocks.

Maintain your own shadow stack by:

1. Pushing values upon
  a) allocation,
  b) reading references from the heap or
  c) obtaining references as the result of a function call.
2. Resetting your shadow stack pointer at all return points of all functions to the value it had upon entry to the function.

At regular intervals, check if the heap size has exceeded its quota and, if so, run a GC cycle. To run a GC cycle:

1. Starting with an empty set of marked heap blocks, mark the global roots (global variables and everything on the shadow stack) and then keep marking heap blocks from the visit stack until it is empty.
2. Do a sliding compaction on the allocated list, calling free() on anything unmarked and copying the rest over the top of it in the allocated list.

To mark a heap block:

1. Add it to the set of marked heap blocks.
2. Push any heap blocks that it references onto the visit stack.

That's it. You can do this in ~100 lines of code.

Cheers,
Jon.






More information about the llvm-dev mailing list