<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Adding Gael as someone who has

      previously discussed vmkit topics on the list.  Since I'm assuming

      this is where the GC support came from, I wanted to draw this

      conversation to the attention of someone more familiar with the

      LLVM implementation than myself.<br>

      <br>

      On 10/22/13 4:18 PM, Andrew Trick wrote:<br>

    </div>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">


      On Oct 22, 2013, at 3:08 PM, Filip Pizlo &lt;<a

        moz-do-not-send="true" href="mailto:fpizlo@apple.com">fpizlo@apple.com</a>&gt;

      wrote:<br>

      <div><br>

        <blockquote type="cite">

          <div style="font-family: Helvetica; font-size: 12px;

            font-style: normal; font-variant: normal; font-weight:

            normal; letter-spacing: normal; line-height: normal;

            orphans: auto; text-align: start; text-indent: 0px;

            text-transform: none; white-space: normal; widows: auto;

            word-spacing: 0px; -webkit-text-stroke-width: 0px;">

            <div>On Oct 22, 2013, at 1:48 PM, Philip R &lt;<a

                moz-do-not-send="true"

                href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt;

              wrote:</div>

            <br class="Apple-interchange-newline">

            <blockquote type="cite">

              <div style="font-size: 12px; font-style: normal;

                font-variant: normal; font-weight: normal;

                letter-spacing: normal; line-height: normal; orphans:

                auto; text-align: start; text-indent: 0px;

                text-transform: none; white-space: normal; widows: auto;

                word-spacing: 0px; -webkit-text-stroke-width: 0px;">On

                10/22/13 10:34 AM, Filip Pizlo wrote:<br>

                <blockquote type="cite">On Oct 22, 2013, at 9:53 AM,

                  Philip R &lt;<a moz-do-not-send="true"

                    href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt;

                  wrote:<br>

                  <br>

                  <blockquote type="cite">On 10/17/13 10:39 PM, Andrew

                    Trick wrote:<br>

                    <blockquote type="cite">This is a proposal for

                      adding Stackmaps and Patchpoints to LLVM. The<br>

                      first client of these features is the JavaScript

                      compiler within the<br>

                      open source WebKit project.<br>

                      <br>

                    </blockquote>

                    I have a couple of comments on your proposal.  None

                    of these are major enough to prevent submission.<br>

                    <br>

                    - As others have said, I'd prefer an experimental

                    namespace rather than a webkit namespace.  (minor)<br>

                    - Unless I am misreading your proposal, your

                    proposed StackMap intrinsic duplicates existing

                    functionality already in llvm.  In particular, much

                    of the StackMap construction seems similar to the

                    Safepoint mechanism used by the in-tree GC support.

                    (See CodeGen/GCStrategy.cpp and

                    CodeGen/GCMetadata.cpp).  Have you examined these

                    mechanisms to see if you can share implementations?<br>

                    - To my knowledge, there is nothing that prevents an

                    LLVM optimization pass from manufacturing new

                    pointers which point inside an existing data

                    structure.  (e.g. an interior pointer to an array

                    when blocking a loop)  Does your StackMap mechanism

                    need to be able to inspect/modify these manufactured

                    temporaries?  If so, I don't see how you could

                    generate an intrinsic which would include this

                    manufactured pointer in the live variable list.  Is

                    there something I'm missing here?<br>

                  </blockquote>

                  These stackmaps have nothing to do with GC.  Interior

                  pointers are a problem unique to precise copying

                  collectors.<br>

                </blockquote>

                I would argue that while the use of the stack maps might

                be different, the mechanism is fairly similar.</div>

            </blockquote>

            <div><br>

            </div>

            <div>It's not at all similar.  These stackmaps are only

              useful for deoptimization, since the only way to make use

              of the live state information is to patch the stackmap

              with a jump to a deoptimization off-ramp.  You won't use

              these for a GC.</div>

            <br>

            <blockquote type="cite">

              <div style="font-size: 12px; font-style: normal;

                font-variant: normal; font-weight: normal;

                letter-spacing: normal; line-height: normal; orphans:

                auto; text-align: start; text-indent: 0px;

                text-transform: none; white-space: normal; widows: auto;

                word-spacing: 0px; -webkit-text-stroke-width: 0px;">In

                general, if the expected semantics are the same, a

                shared implementation would be desirable.  This is more

                a suggestion for future refactoring than anything else.<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>I think that these stackmaps and GC stackmaps are

              fairly different beasts.  While it's possible to unify the

              two, this isn't the intent here.  In particular, you can

              use these stackmaps for deoptimization without having to

              unwind the stack.</div>

          </div>

        </blockquote>

      </div>

      <br>

      <div>I think Philip R is asking a good question. To paraphrase: If

        we introduce a generically named feature, shouldn’t it be

        generically useful? Stack maps are used in other ways, and there

        are other kinds of patching. I agree and I think these are

        intended to be generically useful features, but not necessarily

        sufficient for every use.</div>

    </blockquote>

    Thank you for the restatement.  You summarized my view well.  <br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div><br>

      </div>

      <div>The proposed stack maps are very different from LLVM’s gcroot

        because gcroot does not provide stack maps! llvm.gcroot

        effectively designates a stack location for each root for the

        duration of the current function, and forces the root to be

        spilled to the stack at all call sites (the client needs to

        disable StackColoring). This is really the opposite of a stack

        map and I’m not aware of any functionality that can be shared.

        It also requires a C++ plugin to process the roots.

        llvm.stackmap generates data in a section that MCJIT clients can

        parse.</div>

    </blockquote>

    Er, I think we're talking past each other again.  Let me lay out my

    current understanding of the terminology and existing infrastructure

    in LLVM.  Please correct me where I go wrong.<br>

    <br>

    stack map - A mapping from "values" to storage locations.  Storage

    locations primarily take the form of registers or stack offsets, but

    could in principle refer to other well-known locations (e.g. offsets

    into thread local state).  A stack map is specific to a particular

    PC and describes the state at that instruction only.  <br>
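    <br>
    To make that definition concrete, here is a minimal Python sketch.
    It is not taken from LLVM or from the proposal: LocationKind,
    Location, StackMapRecord, build_index and locate are all
    hypothetical names chosen for illustration.  The only property it
    tries to show is that a record applies to exactly one PC.<br>
<pre>
```python
from dataclasses import dataclass
from enum import Enum


class LocationKind(Enum):
    REGISTER = "register"          # value lives in a machine register
    STACK_OFFSET = "stack_offset"  # value lives at frame base + offset
    THREAD_LOCAL = "thread_local"  # value lives at an offset into thread-local state


@dataclass(frozen=True)
class Location:
    kind: LocationKind
    index: int  # register number or byte offset, depending on kind


@dataclass
class StackMapRecord:
    pc: int          # the one instruction address this record describes
    locations: dict  # value name to Location


def build_index(records):
    """Index records by PC; a record is only valid at exactly that PC."""
    return {record.pc: record for record in records}


def locate(index, pc, value):
    """Return where `value` lives at `pc`, or None if no record covers it."""
    record = index.get(pc)
    if record is None:
        return None
    return record.locations.get(value)
```
</pre>
    The same value can sit in a register at one PC and in a stack slot
    at another; a lookup at a PC with no record returns nothing rather
    than reusing a neighbouring record.<br>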

    <br>

    In a precise garbage collector, stack maps are used to ensure that

    the stack can be understood by the collector.  When a stop-the-world

    safepoint is reached, the collector needs to be able to identify any

    pointers to heap objects which may exist on the stack.  This

    explicitly includes both the frame which actually contains the

    safepoint and any caller frames back to the root of the thread.  To

    accomplish this, a stack map is generated at any call site and a

    stack map is generated for the safepoint itself.  <br>
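    <br>
    A rough Python sketch of that stack walk, under stated assumptions:
    Frame and collect_roots are hypothetical names, roots are modeled as
    frame offsets only (no registers), and a PC with no stack map is
    treated as an error rather than scanned conservatively.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Frame:
    pc: int      # safepoint PC for the top frame, return address for callers
    slots: dict  # frame-relative offset to the raw word stored there


def collect_roots(frames, stack_maps):
    """Walk frames from the safepoint outward and gather heap pointers.

    `stack_maps` maps a PC to the list of frame offsets holding GC roots.
    A frame whose PC has no stack map cannot be scanned precisely, so the
    walk raises instead of guessing.
    """
    roots = []
    for depth, frame in enumerate(frames):
        if frame.pc not in stack_maps:
            raise LookupError(f"no stack map for pc {frame.pc:#x} at depth {depth}")
        for offset in stack_maps[frame.pc]:
            roots.append((depth, offset, frame.slots[offset]))
    return roots
```
</pre>
    This is why a stack map is needed at every call site and not only
    at the safepoint: each caller frame is looked up by its return
    address.<br>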

    <br>

    In LLVM currently, the GCStrategy records "safepoints" which are

    really points at which stack maps need to be remembered.  (i.e.

    calls and actual stop-the-world safepoints)  The GCMetadata

    mechanism gives a generic way to emit the binary encoding of a stack

    map in a collector specific way.  The current stack maps supported

    by this mechanism only allow abstract locations on the stack, which

    forces all registers to be spilled around "safepoints" (i.e. calls

    and stop-the-world safepoints).  Also, the set of roots (which are

    recorded in the stack map) must be provided separately using the

    gcroot intrinsic.  <br>

    <br>

    In code:<br>

    - GCPoint in llvm/include/llvm/CodeGen/GCMetadata.h describes a

    request for a location with a stack map.  The SafePoints structure

    in GCFunctionInfo contains a list of these locations.<br>

    - The OCaml GC is probably the best example of usage.  See

    llvm/lib/CodeGen/AsmPrinter/OcamlGCPrinter.cpp<br>

    <br>

    Note: The summary of existing LLVM details above is based on reading

    the code.  I haven't actually implemented anything which used this

    mechanism yet.  As such, take it with a grain of salt.  <br>

    <br>

    In your change, you are adding a mechanism which is intended to

    enable runtime calls and inline cache patching.  (Right?)  Your

    stack maps seem to match the definition of a stack map I gave above

    and (I believe) the implementation currently in LLVM.  The only

    difference might be that your stack maps are partial (i.e. might not

    contain all "values" which are live at a particular PC) and your

    implementation includes Register locations which the current

    implementation in LLVM does not.  One other possible difference: are

    you intending to include "values" which aren't of pointer type?  <br>

    <br>

    Before moving on, am I interpreting your proposal and changes

    correctly?<br>

    <br>

    Assuming I'm still correct so far, how might we combine these

    implementations?  It looks like your implementation is much more

    mature than what exists in tree at the moment.  One possibility

    would be to express the needed GC stack maps in terms of your new

    infrastructure.  (i.e. convert a GCStrategy request for a safepoint

    into a StackMap (as you've implemented it) with the list of explicit

    GC roots as its arguments).  What would you think of this?  <br>

    <br>

    p.s. This discussion has gotten sufficiently abstract that it should

    in no way block your plan to submit these changes.  I appreciate

    your willingness to discuss.<br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div><br>

      </div>

      <div>If someone wanted to use stack maps for GC, I don’t know why

        they wouldn’t leverage llvm.stackmap. Maybe Filip can see a

        problem with this that I can't. The runtime can add GC roots to

        the stack map just like any other live value, and it should know how

        to interpret the records. The intrinsic doesn’t bake in any

        particular interpretation of the mapped values. </div>

    </blockquote>

    I think this is a restatement of my last paragraph above, which would

    mean we're actually in agreement.  <br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div>That said, my proposal deliberately does not cover GC. I

        think that stack maps are the easy part of the problem. The hard

        problem is tracking interior pointers, or for that matter

        exterior/out-of-bounds or swizzled pointers. LLVM’s machine IR

        simply doesn’t have the necessary facilities for doing this. But

        if you don’t need a moving collector, then you don’t need to

        track derived pointers as long as the roots are kept live. In

        that case, llvm.stackmap might be a nice optimization over

        llvm.gcroot.</div>

    </blockquote>

    Oddly enough, I'll be raising the issue of how to go about

    supporting a relocating collector on the list shortly.  We've been looking

    into this independently, but are at the point where we'd like to get

    feedback from others.  :)<br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div><br>

      </div>

      <div>Now with regard to patching. I think llvm.patchpoint is

        generally useful for any type of patching I can imagine. It does

        look like a call site in IR, and it’s nice to be able to

        leverage calling conventions to inform the location of

        arguments. </div>

    </blockquote>

    Agreed.  My concern is mostly about naming and documentation of

    intended usages.  Speaking as someone who's likely to be using this

    in the very near future, I'd like to make sure I understand how you

    intend it to be used.  The last thing I want to do is misconstrue

    your intent and become reliant on a quirk of the implementation you

    later want to change.<br>

    <br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div>But the patchpoint does not have to be a call after patching,

        and you can specify zero arguments to avoid using a calling

        convention. </div>

    </blockquote>

    Er, not quite true.  Your calling convention also influences what

    registers stay live across the call.  But in general, I see your

    point.<br>

    <br>

    (Again, this is touching an area of LLVM I'm not particularly

    familiar with.)<br>

    <blockquote

      cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"

      type="cite">

      <div>In fact, we only currently emit a call out of convenience. We

        could splat nops in place and assume the runtime will

        immediately find and patch all occurrences before the code

        executes. In the future we may want to handle NULL call target,

        bypass call emission, and allow the reserved bytes to be less

        than that required to emit a call.</div>

    </blockquote>

    If you were to do that, how would the implementation be different

    from the new stackmap intrinsic?  Does that difference imply a

    clarification in intended usage or naming?<br>

    <br>

    p.s. The naming discussion has gotten rather abstract and is

    starting to feel like a "what color is the bikeshed" discussion. 

    Feel free to just tell me to go away at some point.  :)<br>

    <br>

    Philip<br>

    <br>

  </body>

</html>