<div class="gmail_quote"><div>Hi Talin,</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><div></div></div><div>

Well, I sort of do - I estimate which variant of the union is the largest without knowing its exact size. A long time ago I had hoped to make my frontend generate IR that was completely target-independent, and eventually I had to give up that plan. Unions turned out to be one of the few cases where it's impossible to be target-neutral. However, my frontend is "mostly" target independent in that the only piece of information it currently knows about the target is whether pointers are 32 or 64 bits.</div>


<div><br></div></div></blockquote><div><br></div><div>I still think complete platform independence is something that <i>all</i> systems, including LLVM should aspire to, but (probably) never attain.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="gmail_quote"><div></div><div>I should also mention that tagged unions are surprisingly useful in a statically typed language and I would hope that more languages in the future adopt them. A typical example is how Iterators work in my language:</div>


<div><br></div></div></blockquote><div>I completely agree - I included them, albeit in a slightly different form, in my own pet language.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="gmail_quote"><div></div><div><div>   } </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">


<div class="gmail_quote">


<div><br></div><div>But if we change it so that the GC only sees pointers, then I'm dead in the water.</div></div></blockquote></div><div><br>In the end, the GC should only be seeing pointers anyway - some of whose "pointer-ness" depends on other values (as in the tagged union).  I think your method could still work with the GC only seeing pointers (albeit with a little modification) - the only requirement I see that your method imposes on the design of a address-space based GC strategy is to maintain information about the structure (union) containing the pointer, next to the pointer.  For this, metadata should work fine.  While it is not particularly elegant, I don't see why you would be "dead in the water" - because it could be made to work.<br>


</div></div></blockquote><div><br></div></div><div>Here's a question - if the only way to identify a root is via pointer address space, then where does the metadata go? Adding a metadata field to the pointer type would also greatly complicate LLVM. My worry is that folks will say "well, since every root is a pointer now, we no longer need the metadata argument to describe it's type."</div>


</div></blockquote><div><br></div><div>I'm not quite sure what you are asking.  I don't know enough about LLVM to tell you how it should be done off the top of my head - but it should just be a matter of attaching the metadata to the instruction that produces the value.  In order for this to work, LLVM would have to include the IR instruction that produced the value in the generated stack-root data.</div>


<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div>I realize that it has drawbacks. I'm mainly just brainstorming.</div></div>


</blockquote><div><br></div><div>Likewise.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><br></div><div>As far as multiple heaps go: There are two classes of heaps we're talking about. The first class are heaps that have different object lifetime policies. Those kinds of heaps should IMHO be managed entirely by the collector without the need for compiler support. </div>


</div></blockquote><div><br></div><div>Perhaps that was a bad example - but I'm sure there are valid use cases of where you might want to independently track different types of roots.  This is something the address-space solution would handle naturally.</div>


<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote">


<div><br></div><div>The other class of heap is where you have different classes of memory - RAM and ROM, or NUMA-style shared memory spaces - or where the address space represents some semantic difference that the compiler or optimization passes need to be aware of. These heaps may be garbage collected in addition to whatever other special properties they have. So the property of being garbage collected is (at least for me) a single bit, and orthogonal to whether the object is in a special heap or not.</div>


<div><br></div><div>Let me take a step back for a second and think about this thread as a whole. The current LLVM approach to garbage collection requires a division of responsibility between the LLVM libraries and the compilers that call those libraries. As I see it, it's the frontend's job to understand the semantics and structure of data types, and it's LLVM's job to know things like underlying representations and lifetimes of SSA values.</div>


<div><br></div><div>The biggest problem that I have with the current system is that garbage collection roots can only live in allocas, not SSA values, so that I am constantly having to load and store values to memory. The second biggest problem is related to the first - since the scope of an alloca root is the entire function (because calls to llvm.gcroot have to be in the first block) there's no way for me to tell LLVM that a root is confined to a given lexical block, so I have to generate extra code to zero out the root even if it's dead. In fact, most roots get assigned three times - zeroed out at the beginning of the function, then set to a value sometime later, and then set back to zero when I'm done with the value.</div>

</div></blockquote><div><br></div><div>This is something that LLVM should handle completely, independent of which solution is chosen - just propagate the liveness info into the generated stack maps.  Not being very familiar with the internals of LLVM (I've used it a lot, but not hacked on it), I'm not sure how this would work.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote">


<div><br></div><div>What I like about the current system is that the responsibility for interpreting roots is entirely in my hands, and LLVM can treat the entire alloca as an opaque blob if it wants to. In my compiler I treat stack roots exactly like I treat fields within a heap object - the compiler-generated trace tables are exactly the same, except that in the case of stack roots the offsets are negative and the object base address is the frame pointer. The same function that traces the fields of an object also traces the variables on the stack, including handling complex types such as unions and variable length arrays. (Although the latter never occurs on the stack since alloca only takes constant arguments. But I could imagine a SmallVector type situation where you have some number of pointers, only the first N of which have been initialized.)</div>


<div><br></div><div>So when we talk about which solution is more elegant, I think we need to look at the elegance of the whole, not just of the LLVM part.</div></div></blockquote><div><br></div><div>I agree - but when a little effort can get you <i>most</i> of what you (in the general sense) want, IMHO it is almost always preferable to putting tons of work into making it <i>perfect</i>.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div><br></div><div>That being said, I admit that there is one downside to sticking with the current approach going forward, which I will explain. In the current system, a large stack root has the same cost as a small one, since both are access via pointer. However, in a future version of LLVM in which SSA roots are automatically spilled to memory during a safe point and reloaded after, a large root will be more expensive than a small one. This means that there will be pressure to make roots as small as possible, so that if a large struct type has roots confined to just a few fields, it would be more efficient to declare just those fields as roots rather than the entire struct. </div>

</div></blockquote><div><br></div><div>I don't think it is a matter of efficiency - that should be completely up to the spiller making good choices, and the stack maps (perhaps optionally) being smart enough to recognize things in registers.</div>

<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div>So I agree with you this much: It would be nice if there was a way for marking of roots to be finer-grained than an entire SSA value, *including* being able to associate metadata with that finer-grained portion.</div>


<div> </div></div></blockquote><div><br></div><div>The main benefit that the address-space method has is that it should be a big improvement over the current solution and it should (as far as I can tell) be relatively simple to implement.</div>

<div><br></div><div>-Joshua</div></div><br>