On Thu, Jul 7, 2011 at 1:35 PM, Anderson, Todd A <span dir="ltr"><<a href="mailto:todd.a.anderson@intel.com">todd.a.anderson@intel.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p><div><div class="im"><p class="MsoNormal" style="margin-left:.5in">On Thu, Jul 7, 2011 at 10:55 AM, Anderson, Todd A <<a href="mailto:todd.a.anderson@intel.com" target="_blank">todd.a.anderson@intel.com</a>> wrote:<u></u><u></u></p>


<div><div><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D">For the past few years, my group in Intel Labs has been working on a project similar to LLVM and C--, and perhaps our experience in handling roots and stack walking could be useful in deciding how LLVM should evolve in the GC area.  Our project is called Pillar (you can see our paper "Pillar: A Parallel Implementation Language" in Languages and Compilers for Parallel Computing 2008 for a general overview).  Pillar is a generic language and runtime designed to support managed languages and parallel languages although most of our work so far has been on the managed side.  We've implemented a JVM and a functional language on top of Pillar so it has evidenced some generality.</span><u></u><u></u></p>


<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D">In the Pillar language, we have a generalized ref (similar to llvm.gcroot) data type that encompases different kinds of pointers and types into the managed heap.  This could be regular pointers, a couple varieties of interior pointers, or user-defined pointer types.  The underlying compiler then generates per-callsite stack maps.  Those stack maps can indicate when a live ref (local or parameter) is at a known offset from the start of the frame and can also indicate when a live ref is present in a register.</span><u></u><u></u></p>


</div></div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div></div><div><div class="im"><p class="MsoNormal" style="margin-left:.5in">I assume that by interior pointer you mean a pointer to the inside of another object?<u></u><u></u></p>


</div><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">[TAA] <u></u><u></u></span></i></b></p><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">Yes.</span></i></b><span style="font-size:11.0pt;color:#1F497D"><u></u><u></u></span></p>


</div><div class="im"><blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in"><div><div><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D">If we want to enumerate the refs of a call stack, we call a runtime function, prtYoungestActivation, and that gives us back a stack iterator for the sequence of activations.  The iterator contains the value for esp and pointers to the memory locations where ebp, eip, edi, esi, and ebx were last saved to the stack.  (For architectures other than 32-bit x86, the register-specific contents of the activation would of course be defined differently.)  In the eip case, this is saved for every activation as part of the call sequence and so provides you with the ability to determine which call site you are at.  The other registers may have been saved by the current frame or a subsequent frame.  The stack iterator also contains a virtual frame number that basically allows an unwinder to see logical activations instead of physical activations that may combine multiple logical activations via inlining.</span><u></u><u></u></p>


<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D"> </span><u></u><u></u></p></div></div></blockquote><div><p class="MsoNormal" style="margin-left:.5in">That all makes sense.<u></u><u></u></p>


</div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">I would say that your framework is at a somewhat higher level than I would expect LLVM to support - by that, I mean that ideally it should be possible to use the primitives that LLVM supplies to build exactly what you have described. My own use case has slightly different requirements, and as a result I could not adopt your framework as it is - but I see no reason why I should not be able to use the same LLVM primitives to build what I need.<u></u><u></u></p>


</div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div></div><div><div class="im"><p class="MsoNormal" style="margin-left:.5in">The main difference has to do with the handling of disjoint types in my language. For example, if I declare a variable using the following syntax:<u></u><u></u></p>


</div><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">[TAA] <u></u><u></u></span></i></b></p><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">I’m not so sure we couldn’t support your case.  If a disjoint type might contain a root then you could define the possible pointer in the struct as a generalized ref having a user-defined type and extraneous data.  That user-defined generalized ref type for you would indicate a disjoint union and the extraneous data could indicate the address of the discriminator in the union.  Whether the data was enregistered or wouldn’t matter in that case.  The GC root callback is passed the user-defined type and extraneous data you declare the field with.<u></u><u></u></span></i></b></p>


<p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D"><u></u></span></i></b></p></div></div></div></div></blockquote><div><br></div><div>Interesting. Let me think about that. It would indeed be nice if that worked. How would you handle a case where the union member was (String *)[3] instead of just String *?:</div>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div lang="EN-US" link="blue" vlink="purple"><div><div><div><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D"> <u></u></span></i></b></p>


<p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">I agree that much of this is at a higher level that LLVM should not have to worry about but having such a generalized ref type in the core seems important so you can handle ref params and refs in registers.  You can define it as data structures in the core and implement a library on top.</span></i></b><span style="font-size:11.0pt;color:#1F497D"><u></u><u></u></span></p>


</div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> </p></div></div></div></div></blockquote><div>I'm in agreement here. </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div lang="EN-US" link="blue" vlink="purple"><div><div><div><p class="MsoNormal" style="margin-left:.5in"><u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in"> <u></u><u></u></p><div><div class="im"><div><p class="MsoNormal" style="margin-left:.5in">


<span style="font-size:11.0pt;color:#1F497D">You can then call another runtime function, prtEnumerateRootsOfActivation, to ask for the roots associated with the given stack iterator's current activation to be enumerated.  To this function, you pass a PrtRseInfo struct that contains a callback that gets invoked once per live ref found and a opaque environment variable passed (unexamined) onto the callback function.  To go to the next activation on the call stack, you then call the runtime's prtNextActivation.  There are routines to determine when you've got to the bottom of the call stack as well.  In addition, there are routines for performing other (non-GC) operations the stack iterator's activation, such as looking up data associated with the SPAN construct (taken from C--) or exception handlers.</span><u></u><u></u></p>


<p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p><p class="MsoNormal" style="margin-left:.5in">In my own case, I don't have an iterator for roots within a single activation, rather I have a static data structure that describes how to find all of the roots within the call frame. One of my design goals is to minimize the number of function calls required to trace a pointer, so the different parts of my code accept whole stack maps as arguments instead of single pointers.<u></u><u></u></p>


</div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div></div><div><div class="im"><p class="MsoNormal" style="margin-left:.5in">However, the LLVM approach to generating stack maps is flexible enough to handle either case - your GCStrategy plugin gets a list of all stack roots for each safe point, and you can transform that into whatever data structures you need.<u></u><u></u></p>


</div><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">[TAA] <u></u><u></u></span></i></b></p><p class="MsoNormal"><b><i><span style="font-size:11.0pt;color:#1F497D">They seem interchangeable…if you have a data structure you can just as easily write an iterator.  I’m curious if you see this as performance critical.  In my experience, GC times tend to be completely dominated by walking the live objects and not root set enumeration.</span></i></b><span style="font-size:11.0pt;color:#1F497D"><u></u><u></u></span></p>


</div></div></div></div></div></div></blockquote></div><br>Well, I've always been a premature optimizer :) However, I use the same data structure type to describe both stack roots and heap objects - essentially I treat the stack frame as though it were another object (except that the offsets are usually negative).<div>


<div><div><br></div><div>Oh, one other minor misunderstanding I wanted to clear up: the "llvm.gcroot" intrinsic, as it exists today, is used to mark *values*, not types, as being roots. One of the things I am proposing is to make it so that the marking of roots is type-based rather than value-based. This would, among other benefits, allow function parameters to be roots.</div>


<div><br>-- <br>-- Talin<br>

</div></div></div>