[LLVMdev] Why LLVM should NOT have garbage collection intrinsics
Mark Shannon
marks at dcs.gla.ac.uk
Sun Mar 1 02:41:39 PST 2009
Gordon Henriksen wrote:
>
> The "runtime interface" is a historical artifact. LLVM does not impose
> a runtime library on its users. I wouldn't have a problem deleting all
> mention of it, since LLVM does not impose a contract on the runtime.
>
Excellent, I found it somewhat unhelpful!
>> The semantics of llvm.gcroot are vague:
>> "At compile-time, the code generator generates information to allow
>> the runtime to find the pointer at GC safe points."
>>
>> Vague, ill-specified interfaces are worse than none.
>
> There's nothing ill-defined about the semantics of gcroot except
> insofar as GC code generation is pluggable.
>
Sorry, but "At compile-time, the code generator generates information to
allow the runtime to find the pointer at GC safe points." does not
really say anything: it does not state what information is generated,
where it is stored, or how the runtime reads it.
No one could possibly implement this "specification".
Sorry about all my negative comments. I would like to implement a
generational collector for LLVM, but I cannot do so in a portable way.
So, here is a suggestion:
Call the GC 'intrinsics' something else ("extrinsics"?) and provide
low-level intrinsics so that the GC calls (gcroot, gcread and gcwrite)
can be converted to GC-free LLVM code in a GC-lowering pass:
IR+GC -> | GC Lowering pass | -> IR
Rather than the current:
IR+GC -> | Backend lowering pass(es) | -> SelectionDAG
Read and write barriers can already be written in llvm-IR.
It is the marking of roots that is the problem.
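To make the barrier half of that concrete, here is a minimal sketch of a card-marking write barrier of the kind a generational collector might use. It is written in C rather than llvm-IR for readability, and it needs nothing beyond ordinary loads, stores and compares. The heap layout, the sizes, and every name in it (heap, card_table, gc_write, ...) are assumptions made up for illustration, not part of LLVM or of any existing collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one contiguous heap of pointer-sized slots.
   The first NURSERY_WORDS slots are the young generation, the rest is old. */
enum { HEAP_WORDS = 4096, NURSERY_WORDS = 512, CARD_SHIFT = 6 };

static void *heap[HEAP_WORDS];
/* One byte per card; a card covers 2^CARD_SHIFT heap slots. */
static unsigned char card_table[HEAP_WORDS >> CARD_SHIFT];

/* True if p points into heap slots [lo_word, hi_word). */
static int in_range(const void *p, size_t lo_word, size_t hi_word) {
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)(heap + lo_word) && a < (uintptr_t)(heap + hi_word);
}

static int in_nursery(const void *p) { return in_range(p, 0, NURSERY_WORDS); }
static int in_old(const void *p) { return in_range(p, NURSERY_WORDS, HEAP_WORDS); }

/* Write barrier: perform the store, then dirty the card that covers the
   slot if the store created an old-to-young pointer. */
static void gc_write(void **slot, void *value) {
    *slot = value;
    if (value != NULL && in_nursery(value) && in_old(slot)) {
        size_t index = (size_t)(slot - heap);  /* slot is known to be in heap */
        card_table[index >> CARD_SHIFT] = 1;
    }
}
```

A minor collection would then scan only the old-generation slots under dirty cards, in addition to the roots. Nothing here needs help from the code generator, which is the point: it is only the roots that do.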
Given that any new intrinsics/instructions are an additional burden on
all back-ends, I'm not going to propose particular ones, but it seems
that they are needed.
By the way, I think that adding a GC pointer type is an unnecessary
burden on the back-ends; front-ends really should be able to handle
this.
The current trio of gcroot, gcread and gcwrite is OK, BUT GC
implementations should be able to translate them to llvm-IR so that the
optimisers and back-ends can do their jobs without worrying about GC
details.
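As an illustration of what "GC-free" code for gcroot could look like, one portable option is an explicit shadow stack: each function links a small frame listing the addresses of its root variables, and the collector walks that list. This is a sketch in C under assumed names (gc_frame, gc_push_frame, gc_visit_roots, ...), none of which are LLVM APIs. It trades speed for portability, since every root must live in memory and every call pays for the push and pop, so it is not a substitute for the lower-level support asked for above.

```c
#include <assert.h>
#include <stddef.h>

/* One shadow-stack frame: the root slots of a single function activation. */
struct gc_frame {
    struct gc_frame *prev;  /* frame of the caller, or NULL at the bottom */
    size_t nroots;          /* number of entries in roots */
    void ***roots;          /* addresses of the locals that hold GC pointers */
};

static struct gc_frame *gc_top = NULL;

/* Emitted at function entry in place of the gcroot declarations. */
static void gc_push_frame(struct gc_frame *f) {
    f->prev = gc_top;
    gc_top = f;
}

/* Emitted on every return path. */
static void gc_pop_frame(void) {
    assert(gc_top != NULL);
    gc_top = gc_top->prev;
}

typedef void (*gc_root_visitor)(void **slot, void *ctx);

/* Collector side: visit every registered root slot, innermost frame first.
   A copying collector may overwrite *slot with the object's new address. */
static void gc_visit_roots(gc_root_visitor visit, void *ctx) {
    for (struct gc_frame *f = gc_top; f != NULL; f = f->prev)
        for (size_t i = 0; i < f->nroots; i++)
            visit(f->roots[i], ctx);
}

/* Example visitor: count the roots that currently hold a non-null pointer. */
static void count_live_root(void **slot, void *ctx) {
    if (*slot != NULL)
        ++*(size_t *)ctx;
}
```

Because the visitor receives the address of each slot rather than its value, a moving collector can update roots in place.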
As an aside, I think that debug info can be treated in a similar way:
IR+debug -> | Debug lowering pass | -> IR
After all, both debug and GC require similar things, that is, information
about the location of stack variables (and possibly, register variables)
and the machine location of points in code (for line numbering or
gc-safe points).
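To show the kind of information meant here, the following is a rough sketch of a table keyed by machine code address, where each entry lists the frame offsets of the live pointer variables at that point. The struct layout and the names safe_point and find_safe_point are hypothetical; a debug-info consumer would want much the same lookup, with line numbers in place of root offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one gc-safe point. */
struct safe_point {
    uintptr_t pc;                  /* machine address of the safe point */
    size_t nroots;                 /* number of live roots at this point */
    const int32_t *frame_offsets;  /* frame-relative offset of each root */
};

/* Binary search over a table sorted by ascending pc.
   Returns the matching entry, or NULL if pc is not a safe point. */
static const struct safe_point *
find_safe_point(const struct safe_point *table, size_t n, uintptr_t pc) {
    assert(table != NULL || n == 0);
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].pc < pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && table[lo].pc == pc) ? &table[lo] : NULL;
}
```

During a collection the runtime would walk the machine stack, look up each return address in such a table, and treat the listed frame offsets as roots.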
If intrinsics/instructions to do the above can be implemented then I
will port my generational, copying collector to LLVM *and* maintain it
for as long as possible.
Mark.