On Fri, Apr 1, 2011 at 12:52 PM, Reid Kleckner <span dir="ltr"><<a href="mailto:reid.kleckner@gmail.com">reid.kleckner@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div class="im">On Fri, Apr 1, 2011 at 2:17 PM, Jay Foad <<a href="mailto:jay.foad@gmail.com">jay.foad@gmail.com</a>> wrote:<br>

> This is very similar to the problem of representing lexical scopes in<br>

> debug info. The llvm.dbg.region.* intrinsics were the wrong way of<br>

> doing it, because of the problems I mentioned above. Now we use<br>

> metadata attached to each instruction to say what scope it is in,<br>

> which is much better, because it is robust against optimisation<br>

> passes.<br>

<br>

</div>Of course, using metadata isn't acceptable for gc because it can be<br>

dropped, and adding something like it for gc wouldn't be acceptable to<br>

people writing optimizations.<br>

<font color="#888888"><br></font></blockquote><div>That's a good point.</div><div><br></div><div>As far as the use of intrinsics goes, nothing says that the internal representation of the gcroot information can't be converted into some more durable form that is used by the LLVM code generator internally. I'm speaking somewhat from ignorance here, but I always assumed that the existing llvm.gcroot() calls got converted into some other form, rather than the backend passes attempting to preserve the actual intrinsic calls (but I could be wrong).</div>


<div><br></div><div>I understand wanting the data for garbage collection to be stored per instruction; however, there are some problems with this approach. First, as Reid points out, metadata nodes may be dropped. Second, we can't add additional fields to the instructions themselves, because only a tiny minority of LLVM users make use of the garbage collection features, and the rest don't want to pay the memory cost of an additional field on every instruction. Third, storing the information in an auxiliary data structure keyed by instruction is risky, because so many backend passes create and destroy instructions, and it would be very difficult to get all of them to keep the garbage collection information up to date as they do their transformations.</div>
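<div><br></div><div>To make the side-table risk concrete, here is a minimal Python sketch. It is a toy model and not LLVM's actual API: the Instruction class, the fold_trivial_gep pass, and the gc_info table are all hypothetical names invented for illustration. It shows a pass that replaces an instruction with a freshly created one, leaving the side table with a stale entry and the new instruction with no entry at all.</div>

```python
# Toy model only: none of these names exist in LLVM.
class Instruction:
    def __init__(self, opcode, operands=()):
        self.opcode = opcode
        self.operands = tuple(operands)


def fold_trivial_gep(block):
    """Hypothetical pass: rewrite 'gep base, 0' as a new 'bitcast base'."""
    result = []
    for inst in block:
        if inst.opcode == "gep" and inst.operands[1] == 0:
            # The pass creates a brand-new instruction and knows nothing
            # about any side table that referred to the old one.
            result.append(Instruction("bitcast", (inst.operands[0],)))
        else:
            result.append(inst)
    return result


load = Instruction("load", ("p",))
gep = Instruction("gep", (load, 0))
block = [load, gep]

# Auxiliary table keyed by instruction identity (objects hash by identity).
gc_info = {load: "gc pointer", gep: "gc pointer"}

new_block = fold_trivial_gep(block)

# Instructions in the rewritten block that the table no longer describes.
untracked = [inst.opcode for inst in new_block if inst not in gc_info]
# Table entries whose instruction no longer exists in the block.
stale = [inst.opcode for inst in gc_info if inst not in new_block]
```

<div>Here untracked ends up holding the new bitcast and stale holds the deleted gep. Avoiding that would require every such pass to update gc_info itself, which is the maintenance burden described above.</div>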


<div><br></div><div>Also, I'm not sure that instructions are the right place to be hanging this information - garbage collection is something that is normally associated with a value or a type, and not with an operation. It's a little confusing because in LLVM IR, values and instructions are conflated.</div>


<div><br></div><div>A better scheme might be to associate garbage collection traits with a type. However, with the current structural type system, there's no way to prevent LLVM from merging structurally identical types, which means that your garbage-collectable type might get merged with a non-collectable type. Chris has proposed a way to associate names with types, but it will be a while before that's implemented.</div>
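<div><br></div><div>The merging problem can be sketched in a few lines of Python. This is again a toy model under stated assumptions, not LLVM's implementation: TypeContext, get_struct, and gc_traits are hypothetical names. The sketch assumes only that struct types are uniqued by their field layout, so two requests for the same layout yield the same type object.</div>

```python
# Toy model only: these names are invented for illustration.
class TypeContext:
    """Structural uniquing: one type object per distinct field layout."""

    def __init__(self):
        self._structs = {}

    def get_struct(self, *fields):
        # Identical field lists always return the same stored object.
        return self._structs.setdefault(fields, ("struct", fields))


ctx = TypeContext()
gc_object = ctx.get_struct("i32", "i8*")     # intended to be collectable
plain_record = ctx.get_struct("i32", "i8*")  # intended to be ordinary data

# Attach a garbage collection trait to what we believe is one type.
gc_traits = {gc_object: "collectable"}

merged = gc_object is plain_record
plain_looks_collectable = plain_record in gc_traits
```

<div>Because both requests describe the same layout, merged is true and the trait intended for gc_object also applies to plain_record. A trait keyed by type cannot tell the two apart unless types carry some identity beyond their structure.</div>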


<div><br></div><div>-- </div></div>-- Talin<br>