Hi Talin,<br><br><div class="gmail_quote">On Sat, Mar 5, 2011 at 6:42 PM, Talin <span dir="ltr">&lt;<a href="mailto:viridia@gmail.com">viridia@gmail.com</a>&gt;</span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div><div class="h5"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

</blockquote></div><div><br></div></div></div>So I've been thinking about your proposal, that of using a special address space to indicate garbage collection roots instead of intrinsics.</blockquote><div><br></div><div>

Great!</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><br></div><div>To address this, we need a better way of telling LLVM that a given variable is no longer a root. </div>

</blockquote><div><br></div><div>Live variable analysis is already in LLVM, and for me that is enough to know whether a given variable is no longer a root. Note that each safe point has its own set of root locations, and these locations all contain live variables. Dead variables may still be in a register or on the stack, but the GC will not visit them.</div>
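<div><br></div><div>To illustrate what I mean by each safe point having its own root set, here is a minimal sketch. It assumes a toy straight-line program where each instruction is a tuple (defined variable, used variables, is-safe-point flag); the function name <code>roots_at_safepoints</code> and this representation are made up for the example and are not LLVM data structures or APIs. A value counts as a root at a safe point only if it is a GC pointer that is still needed after that point.</div>
<pre>
```python
# Illustrative only: a toy straight-line "IR", not LLVM's own representation.
# Each instruction is (defined_var_or_None, used_vars, is_safepoint).

def roots_at_safepoints(instructions, gc_vars):
    """Backward liveness scan; returns {instruction index: set of root variables}."""
    live = set()   # variables live *after* the instruction being visited
    roots = {}
    for index in range(len(instructions) - 1, -1, -1):
        defined, used, is_safepoint = instructions[index]
        if is_safepoint:
            # Roots are the GC pointers that stay live across the safe point.
            # The value defined by the safe point itself does not exist yet.
            roots[index] = (live - {defined}) & gc_vars
        if defined is not None:
            live.discard(defined)
        live.update(used)
    return roots


program = [
    ("a", [], False),      # 0: a = allocate
    ("b", [], False),      # 1: b = allocate
    (None, ["a"], True),   # 2: call f(a), a safe point; a is dead afterwards
    ("n", ["b"], False),   # 3: n = read an integer field of b
    (None, [], True),      # 4: another safe point; only the integer n is live
    (None, ["n"], False),  # 5: use n
]
gc_pointers = {"a", "b"}   # n is a plain integer, never a root

root_sets = roots_at_safepoints(program, gc_pointers)
```
</pre>
<div>In this sketch <code>a</code> may well still sit in a register at the first safe point, but since nothing uses it afterwards it is simply left out of that safe point's root set.</div>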

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div>2) As I mentioned, my language supports tagged unions and other "value" types. Another example is a tuple type, such as (String, String). Such types are never allocated on the heap by themselves, because they don't have the object header structure that holds the type information needed by the garbage collector. Instead, these values can live in SSA variables, or in allocas, or they can be embedded inside larger types which do live on the heap.</div>

</blockquote><div><br></div><div>If you know at compile time whether you are dealing with a struct or a heap object, what prevents you from emitting code that does not need such tagged unions in the IR? The same goes for structs: if they contain pointers to heap objects, those pointers will be in that special address space.</div>

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div>3) I've been following the discussions on llvm-dev about the use of the address-space property of pointers to signal different kinds of memory pools for things like shared address spaces. If we try to use that same variable to indicate garbage collection, now we have to multiplex both meanings onto the same field. We can't just dedicate one special ID for the garbage collected heap, because there could be multiple such heaps. As you add additional orthogonal meanings to the address-space field, you end up with a combinatorial explosion of possible values for it.</div>


<div><br></div></blockquote><div><br></div><div>I think some conventions already exist that tie an address-space ID to a particular codegen behavior. Having one more seems fine to me, even if you have to play with bits when a single pointer needs several IDs.<br>
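<div><br></div><div>As a rough sketch of what playing with bits could look like, assume a hypothetical layout in which the low 8 bits of the address-space number hold the memory-pool ID and the next 8 bits hold a GC-heap ID, with 0 meaning "not GC-managed". The field widths and the function names are my own assumptions for illustration, not an existing LLVM convention.</div>
<pre>
```python
# Hypothetical layout, not an LLVM convention:
#   bits 0..7   memory-pool ID (e.g. a shared address space)
#   bits 8..15  GC-heap ID, where 0 means "not managed by any GC heap"
POOL_BITS = 8
HEAP_BITS = 8
POOL_MASK = (1 << POOL_BITS) - 1
HEAP_MASK = (1 << HEAP_BITS) - 1


def pack_address_space(heap_id, pool_id):
    """Combine a GC-heap ID and a memory-pool ID into one address-space number."""
    if heap_id not in range(HEAP_MASK + 1):
        raise ValueError("heap_id does not fit in its field")
    if pool_id not in range(POOL_MASK + 1):
        raise ValueError("pool_id does not fit in its field")
    return (heap_id << POOL_BITS) | pool_id


def unpack_address_space(addr_space):
    """Return (heap_id, pool_id) for a packed address-space number."""
    return (addr_space >> POOL_BITS) & HEAP_MASK, addr_space & POOL_MASK


def is_gc_pointer(addr_space):
    """A pointer is a potential GC root when its heap field is non-zero."""
    heap_id, _ = unpack_address_space(addr_space)
    return heap_id != 0


packed = pack_address_space(heap_id=2, pool_id=3)
```
</pre>
<div>With fixed-width fields the number of values grows with the field sizes rather than with every combination being assigned by hand, and a pass that only cares about GC can test the heap field and ignore the rest.</div>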

</div><div><br></div><div>I'm also fine with the intrinsic way of declaring a GC root. But I think it is cumbersome, and error-prone in the presence of optimizers that may try to move that intrinsic away (I remember similar issues with the current EH intrinsics).</div>

<div><br></div><div>Nicolas</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div></div><div>-- </div><div>-- Talin<br>

</div>

</blockquote></div><br>