[LLVMdev] getting started with IR needing GC
Gordon Henriksen
gordonhenriksen at mac.com
Sun Apr 27 19:34:56 PDT 2008
On 2008-04-27, at 21:29, Lane Schwartz wrote:
> Hi guys,
Hi Lane!
This is a lot of questions. I'm not going to answer each individually,
but will instead give general guidance to help you avoid the pain
points…
> I somehow need to inform the garbage collection runtime (my
> copycollector.c) about my variables - specifically about gc roots.
> So, after I get new memory using llvm_gc_initialize, I think I
> should generate an @llvm.gcroot intrinsic.
This is correct.
Think of the llvm.gcroot intrinsic call as an annotation on a stack
variable (an alloca). Like noalias, it doesn't generate any code
itself. Rather, it instructs the code generator to keep track of the
location of the variable in the stack frame. You must store your
allocated pointer into this variable. Since you are using a copying
collector, you must also reload the value of this variable before each
use. This is to guard against the possibility that the collector has
been invoked, updating the contents of the alloca.
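To make the reload rule concrete, here is a small C simulation of what
goes wrong with a cached pointer. It is only a sketch: Obj, fake_collect
and demo_reload are hypothetical names invented for illustration, and
the "collector" just moves one object and rewrites the root slot the way
a copying collector would.

```c
/* Hypothetical object type for this sketch. */
typedef struct { int value; } Obj;

static Obj from_space[1];
static Obj to_space[1];

/* Stand-in for a copying collection: move the object the root points at
   and rewrite the root slot, as a collector does for each gcroot. */
static void fake_collect(Obj **root) {
  to_space[0] = **root;    /* copy the object to its new location */
  (*root)->value = -1;     /* poison the old copy to expose stale pointers */
  *root = &to_space[0];    /* update the root slot (the "alloca") */
}

/* Reads the object after a collection, either through a reloaded
   pointer (reload != 0) or through a pointer cached beforehand. */
int demo_reload(int reload) {
  Obj *root_slot = &from_space[0];   /* plays the role of the gcroot alloca */
  root_slot->value = 42;

  Obj *cached = root_slot;           /* copy taken before the collection */
  fake_collect(&root_slot);          /* the collector may run at any call */

  Obj *p = reload ? root_slot : cached;
  return p->value;
}
```

Here demo_reload(1) reads 42 through the updated slot, while
demo_reload(0) reads the poisoned old copy and returns -1.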
As for the compiler plugin interface, I suggest you ignore it
initially and use the provided shadow-stack option for the time being.
The shadow stack generates significantly suboptimal code, but will let
you avoid writing some platform-specific code. Instead, simply copy
the llvm_cg_walk_gcroots routine from the semispace example. Call it
from your collection routine in order to visit each gcroot on the
runtime stack.
With the shadow-stack option, the code generator automatically emits a
function prologue and epilogue that push and pop gcroot entries on a
side stack as the program runs. Your collector finds roots by walking
that stack, which avoids the need to traverse the native call stack. If
your language becomes sufficiently performance-sensitive, you can later
investigate implementing a custom Collector plugin.
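As a rough illustration of the idea, the push/pop/walk mechanism can be
sketched in plain C. This is a simplified model, not the layout LLVM
actually emits: ShadowFrame, MAX_ROOTS, push_frame, pop_frame and
walk_roots are all hypothetical names, and per-root metadata is left
out.

```c
#include <stddef.h>

#define MAX_ROOTS 4  /* hypothetical fixed bound, to keep the sketch simple */

/* One entry per active function that has gcroots. */
struct ShadowFrame {
  struct ShadowFrame *next;   /* caller's frame */
  int num_roots;
  void *roots[MAX_ROOTS];     /* the root slots themselves */
};

static struct ShadowFrame *shadow_top = NULL;

/* What the emitted prologue would do: link the frame in, null the roots. */
static void push_frame(struct ShadowFrame *f, int num_roots) {
  f->num_roots = num_roots;
  for (int i = 0; i < num_roots; ++i) f->roots[i] = NULL;
  f->next = shadow_top;
  shadow_top = f;
}

/* What the emitted epilogue would do: unlink the frame. */
static void pop_frame(void) { shadow_top = shadow_top->next; }

/* Visit every root slot on the shadow stack, innermost frame first. */
void walk_roots(void (*visit)(void **root)) {
  for (struct ShadowFrame *f = shadow_top; f != NULL; f = f->next)
    for (int i = 0; i < f->num_roots; ++i)
      visit(&f->roots[i]);
}

static int visited;
static void count_root(void **root) { (void)root; ++visited; }

static void inner(void) {
  struct ShadowFrame frame;
  push_frame(&frame, 1);
  walk_roots(count_root);     /* pretend a collection happens here */
  pop_frame();
}

/* Outer function with two roots calls inner with one root. */
int count_roots_during_nested_call(void) {
  struct ShadowFrame frame;
  push_frame(&frame, 2);
  visited = 0;
  inner();
  pop_frame();
  return visited;
}
```

A collection triggered inside inner sees three root slots, one from
inner and two from its caller, without inspecting the native stack.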
A few additional notes:
> When my frontend generates LLVM IR code for a program, it should also
> generate a call (early in the IR code) to llvm_gc_initialize. This
> function uses the system calloc to allocate two big blocks of memory,
> then stores pointers to that memory in static variables.
> [...]
> There is also a function called llvm_gc_allocate. Now, instead of
> using alloca or malloc, my frontend generates a call to
> llvm_gc_allocate.
These function names are entirely optional. Your runtime can use any
names and prototypes it likes to provide this functionality. With a
bump-pointer allocator, for example, the frontend might easily inline
part of the allocation routine.
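For reference, a bump-pointer allocation routine can be as small as the
following sketch. The names my_gc_allocate and HEAP_BYTES, the fixed
static heap, and the 8-byte alignment are assumptions made for
illustration. A real runtime would trigger a collection instead of
returning NULL when the space is exhausted.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_BYTES 1024  /* hypothetical heap size for this sketch */

/* uint64_t storage keeps the heap 8-byte aligned. */
static uint64_t heap_storage[HEAP_BYTES / 8];
static unsigned char *alloc_ptr = (unsigned char *)heap_storage;

/* Allocate by bumping a pointer; this fast path is what could be inlined. */
void *my_gc_allocate(size_t size) {
  unsigned char *heap_end = (unsigned char *)heap_storage + HEAP_BYTES;

  if (size > HEAP_BYTES) return NULL;        /* also avoids overflow below */
  size = (size + 7u) & ~(size_t)7u;          /* round up to 8 bytes */
  if (size > (size_t)(heap_end - alloc_ptr))
    return NULL;                             /* out of space: collect here */

  void *result = alloc_ptr;
  alloc_ptr += size;                         /* bump */
  return result;
}
```

The size check and the pointer bump are only a handful of instructions,
which is why a frontend could emit them inline and call into the
runtime only on the slow path.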
> Since this is a simple copying collector, the functions llvm_gc_read
> and llvm_gc_write won't really do much:
> void *llvm_gc_read(void *ObjPtr, void **FieldPtr) { return
> *FieldPtr; }
> void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)
> { *FieldPtr = V; }
You can just emit loads and stores directly if your read/write
barriers do nothing. Also, there's nothing special about the
llvm_gc_read or llvm_gc_write functions any more; they will not be
called unless you call them yourself.
— Gordon