[LLVMdev] Proposal for improving llvm.gcroot (summarized)

Fri Apr 1 14:14:18 PDT 2011

On Fri, Apr 1, 2011 at 12:52 PM, Reid Kleckner <reid.kleckner at gmail.com> wrote:

> On Fri, Apr 1, 2011 at 2:17 PM, Jay Foad <jay.foad at gmail.com> wrote:
> > This is very similar to the problem of representing lexical scopes in
> > debug info. The llvm.dbg.region.* intrinsics were the wrong way of
> > doing it, because of the problems I mentioned above. Now we use
> > metadata attached to each instruction to say what scope it is in,
> > which is much better, because it is robust against optimisation
> > passes.
>
> Of course, using metadata isn't acceptable for gc because it can be
> dropped, and adding something like it for gc wouldn't be acceptable to
> people writing optimizations.
>
> That's a good point.

As far as the use of intrinsics goes, nothing says that the internal
representation of the gcroot information can't be converted into some more
durable form that is used by the LLVM code generator internally. I'm
speaking somewhat from ignorance here, but I always assumed that the
existing llvm.gcroot() calls got converted into some other form, rather than
the backend passes attempting to preserve the actual intrinsic calls (but I
could be wrong.)

I understand wanting the data for garbage collection to be stored
per-instruction, but there are some problems with this approach. First, as
Reid points out, metadata nodes may be dropped. We can't add additional
fields to the instructions themselves, because only a tiny minority of LLVM
users make use of the garbage collection features, and the rest don't want
to have to pay the memory cost of adding an additional field to every
instruction. And storing the information in an auxiliary data structure that
is keyed by instruction is risky, because there are so many backend passes
which create and destroy instructions, and it would be very difficult to get
all of them to keep the garbage collection information up to date as they do
their transformations.
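
To make the side-table hazard concrete, here is a minimal C++ sketch. It
does not use any LLVM API: ToyInst, GCSideTable, makeInst, rewriteInst and
countTrackedRoots are all hypothetical names invented for illustration, and
instructions are identified by a simple integer id. It only shows how a pass
that replaces an instruction, without knowing about the table, silently
loses the GC information.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an instruction; NOT an LLVM class.
struct ToyInst {
    unsigned id;       // unique identity, assigned at creation
    std::string text;  // printable form, for illustration only
};

using Block = std::vector<std::unique_ptr<ToyInst>>;

// Hypothetical side table: instruction id -> "this value is a GC root".
using GCSideTable = std::unordered_map<unsigned, bool>;

// Creates an instruction with a fresh id.
std::unique_ptr<ToyInst> makeInst(const std::string& text) {
    static unsigned nextId = 0;
    auto inst = std::make_unique<ToyInst>();
    inst->id = nextId++;
    inst->text = text;
    return inst;
}

// A pass that replaces instruction i with a new one. It knows nothing
// about the side table, so the entry for the old id is left behind and
// the new instruction has no entry at all.
void rewriteInst(Block& block, std::size_t i) {
    block[i] = makeInst(block[i]->text + " ; rewritten");
}

// Number of instructions in the block that the table still marks as roots.
std::size_t countTrackedRoots(const Block& block, const GCSideTable& table) {
    std::size_t n = 0;
    for (const auto& inst : block) {
        auto it = table.find(inst->id);
        if (it != table.end() && it->second) ++n;
    }
    return n;
}
```

If one instruction in a block is recorded as a root and rewriteInst is then
run on it, countTrackedRoots drops from 1 to 0 while the table still holds
one (now stale) entry. Every pass would have to update the table by hand to
avoid this.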

Also, I'm not sure that instructions are the right place to be hanging this
information - garbage collection is something that is normally associated
with a value or a type, and not with an operation. It's a little confusing
because in LLVM IR, values and instructions are conflated.

A better scheme might be to associate garbage collection traits with a type.
However, with the current structural type system, there's no way to prevent
LLVM from merging structurally identical types, which means that your
garbage-collectable type might get merged with a non-collectable type. Chris
has proposed a way to associate names with types, but it will be a while
before that's implemented.
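
The merging problem can be sketched in a few lines of C++. Again this is a
toy model and not LLVM code: ToyType, ToyContext, getStruct and the
gcManaged flag are hypothetical names, and the context simply uniques types
by their field list, which is roughly what a structural type system implies.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy struct type described only by its field layout; NOT llvm::StructType.
struct ToyType {
    std::vector<std::string> fields;
    bool gcManaged = false;  // hypothetical per-type GC trait
};

// Toy context that uniques types by structure: asking twice for the same
// field list returns the same ToyType object.
class ToyContext {
public:
    ToyType* getStruct(const std::vector<std::string>& fields) {
        auto it = types_.find(fields);
        if (it == types_.end()) {
            auto type = std::make_unique<ToyType>();
            type->fields = fields;
            it = types_.emplace(fields, std::move(type)).first;
        }
        return it->second.get();
    }

private:
    std::map<std::vector<std::string>, std::unique_ptr<ToyType>> types_;
};
```

With this model, a frontend that requests {"i32", "i8*"} once for a heap
object and once for an unrelated plain record gets the same pointer back
both times, so setting gcManaged on one also marks the other. A type with
its own name or identity would not have this problem.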

-- 
-- Talin