<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div><br></div><div><br>On Oct 26, 2013, at 12:37 AM, Michael Lewis <<a href="mailto:don.apoch@gmail.com">don.apoch@gmail.com</a>> wrote:<br><br></div><blockquote type="cite"><div><div dir="ltr">I'm also highly interested in relocating-GC support from LLVM. Up until now my GC implementation has been non-relocating which is obviously kind of a bummer given that it inhibits certain classes of memory allocation/deallocation tricks.</div></div></blockquote><div><br></div><div>You can implement a copying GC (what the kids these days call relocating) without accurate roots. </div><br><blockquote type="cite"><div><div dir="ltr"><div>

<br></div><div><br></div><div>I wrote up a bunch of my findings on the implementation of my GC here: <a href="https://code.google.com/p/epoch-language/wiki/GarbageCollectionScheme">https://code.google.com/p/epoch-language/wiki/GarbageCollectionScheme</a></div>

<div><br></div></div></div></blockquote><div><br></div><div>Why aren't you just using the well-known Bartlett style of GC, which allows for relocation even in the presence of conservative roots (or accurate roots that don't allow relocation)?</div><br><blockquote type="cite"><div><div dir="ltr"><div><br></div><div>Frankly I haven't had time to get much further than idle pondering of how I'd go about implementing relocation, but it seems to me like the existing read/write barrier intrinsics might be sufficient if abused properly by the front-end and lowered carefully. My operating hypothesis has been to stash barriers at key points - identified by the front-end itself - and possibly elide them during lowering if it's safe to do so. If my understanding is correct, it should be possible to lower the barriers into exactly the kind of boxing/unboxing procedure described in this thread.</div>

<div><br></div><div>Based on my experimentation so far, which is admittedly minimal, I think the overhead of updating relocated pointers is actually not as bad as it sounds. It isn't strictly necessary to store both a boxed *and* unboxed pointer in every frame. In fact, in the current incarnation of gcroot, marking an alloca as a gcroot effectively forces a stack spill for that alloca; this means that the relocating collector just updates the single existing pointer on the stack when it does its magic, and you're done. With proper barriers in place, and/or careful location of safepoints by the front-end, it's not that hard to make sure that a relocated pointer gets updated.</div></div></div></blockquote><div><br></div><div>Bartlett collectors will give you relocation with zero overhead. You won't even need any help from LLVM to do it. </div><br><blockquote type="cite"><div><div dir="ltr">

<div><br></div><div>The real trick, as near as I can tell, would be updating registers that have live roots during a collection. But again based on my past investigations this should just be a matter of ensuring that post-collection you have a barrier that is lowered into a register refresh based on the on-stack relocated pointer value.</div>

<div><br></div><div><br></div><div>One thing I've been meaning to try and do is use this barrier-abuse scheme to work around the existing lack of support for tracing roots in machine registers, by effectively setting up an artificial stack spill just prior to a collection, and otherwise operating on registers as much as possible instead of gcroot'ed allocas. Again, I've only considered this as a thought exercise until now, so I apologize if there's some obvious flaw I'm not aware of in my reasoning. I haven't actually done any of this stuff yet so it's more speculative than anything - just trying to creatively engineer around the existing limitations :-)</div>

<div><br></div><div><br></div><div><br></div><div> - Mike</div><div><br></div><div><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Oct 25, 2013 at 5:35 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div class="im">

    <div>On 10/25/13 1:10 PM, Ben Karel wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra"><br>

          <br>

          <div class="gmail_quote">On Thu, Oct 24, 2013 at 6:42 PM,

            Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@azulsystems.com" target="_blank">sanjoy@azulsystems.com</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi

              Rafael, Andrew,<br>

              <br>

              Thank you for the prompt reply.<br>

              <br>

              One approach we've been considering involves representing

              the<br>

              constraint "pointers to heap objects are invalidated at

              every<br>

              safepoint" somehow in the IR itself.  So, if %a and %b are

              values the<br>

              GC is interested in, the safepoint at the back-edge of a

              loop might<br>

              look like:<br>

              <br>

                ; <label>: body<br>

                  %a = phi i32 [ %a.relocated, %body ] [

              %a.initial_value, %pred ]<br>

                  %b = phi i32 [ %b.relocated, %body ] [

              %b.initial_value, %pred ]<br>

                  ;; Use %a and %b<br>

              <br>

                  ;; The safepoint starts here<br>

                  %a.relocated = @llvm.gc_relocate(%a)<br>

                  %b.relocated = @llvm.gc_relocate(%b)<br>

                  br %body<br>

              <br>

              This allows us to not bother with relocating derived

              pointers pointing<br>

              inside %a and %b, since it is semantically incorrect for

              llvm to reuse<br>

              them in the next iteration of the loop. </blockquote>

            <div><br>

            </div>

            <div>This is the right general idea, but you can already

              express this constraint in LLVM as it exists today, by

              using llvm.gcroot().  As you noted, this also solves the

              interior-pointer problem by making it the front end's job

              to convey to LLVM when it would/would not be safe to cache

              interior pointers across loop iterations. The invariant

              that a front-end must maintain is that any pointer which

              is live across a given potential-GC-point must be reloaded

              from its root after a (relocating) GC might have occurred.

              This falls naturally out of the viewpoint that %a is not

              an object pointer, it's the name of an object pointer's

              value at a given point in time. So, of course, whenever

              that object pointer's value might change, there must be a

              new name.</div>

          </div>

        </div>

      </div>

    </blockquote></div>

    To rephrase, you're saying that the current gcroot mechanism is

    analogous to a boxed object pointer.  We could model a safepoint by

    storing the unboxed value into the box, applying an opaque operation

    to the box, and reloading the raw object pointer from the box.  (The

    opaque operator is key to prevent the store/load pair from being

    removed.  It also implements the actual safepoint operation.)  Is

    this a correct restatement?<br>

    <br>

    Here's an example:<br>

    gcroot a = ...; b = ...;<br>

    object* ax = *a, bx = *b;<br>

    repeat 500000 iterations { <br>

      spin for 50 instructions<br>

      *a = ax;<br>

      *b = bx;<br>

      safepoint(a, b)<br>

      ax = *a;<br>

      bx = *b;<br>

    } <br>

    <br>

    If so, I would agree with you that this is a correct encoding of

    object relocation.  I think what got me confused was using the

    in-tree examples to reverse engineer the semantics.  Both the Erlang

    and Ocaml examples insert safepoints after all the LLVM IR passes

    are run and derived pointers were inserted.  This was the part that

    got me thinking that the gcroot semantics were unsuitable for a

    precise relocating collector.  If you completely ignored these

    implementations and inserted the full box/safepoint call/unbox

    implementation at each safepoint before invoking any optimizations,

    I think this would be correct.  <br>

    <br>

    One problem with this encoding is that there does not currently

    exist an obvious mechanism to describe a safepoint in the IR

    itself.  (i.e. there is no llvm.gcsafepoint)  You could model this

    with a "well known" function which a custom GCStrategy recorded a

    stackmap on every call.  It would be good to extend the existing

    intrinsics with such a mechanism.  <br>

    <br>

    At least if I'm reading things right, the current *implementation*

    is not a correct implementation of the intended semantics.  In

    particular, the safepoint operation which provides the opaque

    manipulation of the boxed gcroots is not preserved into the

    SelectionDAG.  As a result, optimizations on the SelectionDAG could

    eliminate the now adjacent box/unbox pairs.  This would break a

    relocating collector.  <br>

    <br>

    Leaving this aside, there are also a number of performance problems

    with the current implementation.  I'll summarize them here for now,

    but will expand into approaches for resolving each if this looks

    like the most likely implementation strategy.  <br>

    1) Explicitly adding box/safepoint/unbox prevents creation of

    derived pointers.  This prevents optimizations such as

    loop-blocking.  Ideally, we'd rather let the derived pointers be

    created and capture them for updating at the safepoint.  <br>

    2) The redundant loads and stores required for box/unbox.  (These

    might be partially eliminated by other optimization passes.)<br>

    3) The inability to allocate GC roots to registers.  <br>

    4) Frame sizes are implicitly doubled since both an unboxed and

    boxed representation of every object pointer is required.  <br>

    <br>

    Frankly, I think that (3) is likely solvable, but that (1) and (4)

    could be fatal for a high performance implementation.  I don't have

    hard numbers on that; I'm simply stating my intuition.  <br><div class="im">

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>The fact that the mutable memory associated with a

              gcroot() is allocated on the stack (rather than, say, a

              machine register) is an implementation detail; fixing it

              doesn't require altering the (conceptual) interface for

              LLVM's existing GC support, AFAICT.</div>

          </div>

        </div>

      </div>

    </blockquote></div>

    While in principal, I agree with you here, in practice the

    difference is actually quite significant.  I am not convinced that

    the implementation would be straight forward.  Let's set this aside

    for the moment and focus on the more fundamental questions.  <br><div class="im">

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We

              lower gc_relocate to a<br>

              pseudo opcode which lowered into nothing after register

              allocation.<br>

              <br>

              The problem is, of course, the performance penalty.  Does

              it make<br>

              sense to get the register allocator "see" the gc_relocate

              instruction<br>

              as a copy so that they get the same register / slot?  Will

              that<br>

              violate the intended semantics of gc_relocate (for

              example, after<br>

              assigning the same register / slot to %a and %a.relocated,

              are there<br>

              passes that will try to cache derived pointers across loop<br>

              iterations)?<br>

              <br>

              Thanks,<br>

              -- Sanjoy<br>

              <div>

                <div><br>

                  <br>

                  _______________________________________________<br>

                  LLVM Developers mailing list<br>

                  <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>

                          <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

                  <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

                </div>

              </div>

            </blockquote>

          </div>

          <br>

        </div>

      </div>

      <br>

      <fieldset></fieldset>

      <br>

      <pre>_______________________________________________

LLVM Developers mailing list

<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>

</pre>

    </blockquote>

    <br>

  </div></div>

<br>_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

<br></blockquote></div><br></div></div>

</div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>LLVM Developers mailing list</span><br><span><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a></span><br><span><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a></span><br></div></blockquote></body></html>