[LLVMdev] [RFC] Stackmap and Patchpoint Intrinsic Proposal

Philip R listmail at philipreames.com
Tue Oct 22 18:24:38 PDT 2013


Adding Gael as someone who has previously discussed vmkit topics on the 
list.  I'm assuming vmkit is where the GC support came from, so I 
wanted to draw this conversation to the attention of someone more 
familiar with the LLVM implementation than I am.

On 10/22/13 4:18 PM, Andrew Trick wrote:
> On Oct 22, 2013, at 3:08 PM, Filip Pizlo <fpizlo at apple.com 
> <mailto:fpizlo at apple.com>> wrote:
>
>> On Oct 22, 2013, at 1:48 PM, Philip R <listmail at philipreames.com 
>> <mailto:listmail at philipreames.com>> wrote:
>>
>>> On 10/22/13 10:34 AM, Filip Pizlo wrote:
>>>> On Oct 22, 2013, at 9:53 AM, Philip R <listmail at philipreames.com 
>>>> <mailto:listmail at philipreames.com>> wrote:
>>>>
>>>>> On 10/17/13 10:39 PM, Andrew Trick wrote:
>>>>>> This is a proposal for adding Stackmaps and Patchpoints to LLVM. The
>>>>>> first client of these features is the JavaScript compiler within the
>>>>>> open source WebKit project.
>>>>>>
>>>>> I have a couple of comments on your proposal.  None of these are 
>>>>> major enough to prevent submission.
>>>>>
>>>>> - As others have said, I'd prefer an experimental namespace rather 
>>>>> than a webkit namespace.  (minor)
>>>>> - Unless I am misreading your proposal, your proposed StackMap 
>>>>> intrinsic duplicates existing functionality already in llvm.  In 
>>>>> particular, much of the StackMap construction seems similar to the 
>>>>> Safepoint mechanism used by the in-tree GC support. (See 
>>>>> CodeGen/GCStrategy.cpp and CodeGen/GCMetadata.cpp).  Have you 
>>>>> examined these mechanisms to see if you can share implementations?
>>>>> - To my knowledge, there is nothing that prevents an LLVM 
>>>>> optimization pass from manufacturing new pointers which point 
>>>>> inside an existing data structure.  (e.g. an interior pointer to 
>>>>> an array when blocking a loop)  Does your StackMap mechanism need 
>>>>> to be able to inspect/modify these manufactured temporaries?  If 
>>>>> so, I don't see how you could generate an intrinsic which would 
>>>>> include this manufactured pointer in the live variable list.  Is 
>>>>> there something I'm missing here?
>>>> These stackmaps have nothing to do with GC.  Interior pointers are 
>>>> a problem unique to precise copying collectors.
>>> I would argue that while the use of the stack maps might be 
>>> different, the mechanism is fairly similar.
>>
>> It's not at all similar.  These stackmaps are only useful for 
>> deoptimization, since the only way to make use of the live state 
>> information is to patch the stackmap with a jump to a deoptimization 
>> off-ramp.  You won't use these for a GC.
>>
>>> In general, if the expected semantics are the same, a shared 
>>> implementation would be desirable.  This is more a suggestion for 
>>> future refactoring than anything else.
>>
>> I think that these stackmaps and GC stackmaps are fairly different 
>> beasts.  While it's possible to unify the two, this isn't the intent 
>> here.  In particular, you can use these stackmaps for deoptimization 
>> without having to unwind the stack.
>
> I think Philip R is asking a good question. To paraphrase: If we 
> introduce a generically named feature, shouldn’t it be generically 
> useful? Stack maps are used in other ways, and there are other kinds 
> of patching. I agree and I think these are intended to be generically 
> useful features, but not necessarily sufficient for every use.
Thank you for the restatement.  You summarized my view well.
>
> The proposed stack maps are very different from LLVM’s gcroot because 
> gcroot does not provide stack maps! llvm.gcroot effectively designates 
> a stack location for each root for the duration of the current 
> function, and forces the root to be spilled to the stack at all call 
> sites (the client needs to disable StackColoring). This is really the 
> opposite of a stack map and I’m not aware of any functionality that 
> can be shared. It also requires a C++ plugin to process the roots. 
> llvm.stackmap generates data in a section that MCJIT clients can parse.
Er, I think we're talking past each other again.  Let me lay out my 
current understanding of the terminology and existing infrastructure in 
LLVM.  Please correct me where I go wrong.

stack map - A mapping from "values" to storage locations.  Storage 
locations primarily take the form of registers or stack offsets, but 
could in principle refer to other well known locations (e.g. offsets 
into thread local state).  A stack map is specific to a particular PC 
and describes the state at that instruction only.
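
To make the definition above concrete, here is a minimal C++ sketch of 
such a mapping.  Every name in it (Location, StackMapRecord, 
StackMapTable) is hypothetical and chosen for illustration; none of 
this corresponds to the in-tree GC code or to the encoding in the 
proposal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not LLVM's or the proposal's format.
enum class LocationKind { Register, StackOffset, ThreadLocalOffset };

struct Location {
  LocationKind kind;
  int32_t value;  // register number, or byte offset from the frame/thread base
};

// One stack map: where each tracked "value" lives at exactly one PC.
struct StackMapRecord {
  std::vector<Location> locations;  // entry i describes value i
};

// All stack maps for a piece of code, keyed by the PC they describe.
class StackMapTable {
 public:
  void add(uint64_t pc, StackMapRecord record) {
    // A stack map is specific to a PC, so each PC gets at most one record.
    assert(records_.find(pc) == records_.end());
    records_.emplace(pc, std::move(record));
  }

  // Returns the record for exactly this PC, or nullptr if none was recorded.
  const StackMapRecord* lookup(uint64_t pc) const {
    auto it = records_.find(pc);
    return it == records_.end() ? nullptr : &it->second;
  }

 private:
  std::map<uint64_t, StackMapRecord> records_;
};
```

The lookup is exact on purpose: a record says nothing about the 
instruction before or after the PC it was created for.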

In a precise garbage collector, stack maps are used to ensure that the 
stack can be understood by the collector.  When a stop-the-world 
safepoint is reached, the collector needs to be able to identify any 
pointers to heap objects which may exist on the stack.  This explicitly 
includes both the frame which actually contains the safepoint and any 
caller frames back to the root of the thread.  To accomplish this, a 
stack map is generated at every call site, and another is generated 
for the safepoint itself.
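
As a rough sketch of how a collector might consume these maps, the C++ 
below walks a simulated stack from the frame containing the safepoint 
out to the root of the thread.  It is a toy model under stated 
assumptions: frames are vectors rather than raw memory, only stack 
slots are tracked, and the names Frame, StackMaps and collectRoots are 
invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model for illustration; real frames are raw memory, not vectors.
struct Frame {
  uint64_t pc;                   // safepoint PC (innermost) or call-site PC (callers)
  std::vector<uintptr_t> slots;  // simulated stack slots of this frame
};

// For each PC with a stack map: which slot indices hold heap pointers there.
using StackMaps = std::map<uint64_t, std::vector<size_t>>;

// stack.front() is the root of the thread, stack.back() is the frame that
// contains the safepoint.  Returns the address of every slot that a stack map
// marks as a heap pointer, innermost frame first.  Frames whose PC has no
// stack map contribute nothing.
std::vector<uintptr_t*> collectRoots(std::vector<Frame>& stack,
                                     const StackMaps& maps) {
  std::vector<uintptr_t*> roots;
  for (auto frame = stack.rbegin(); frame != stack.rend(); ++frame) {
    auto entry = maps.find(frame->pc);
    if (entry == maps.end()) continue;
    for (size_t slot : entry->second) {
      assert(slot < frame->slots.size());  // the map must describe a real slot
      roots.push_back(&frame->slots[slot]);
    }
  }
  return roots;
}
```

Returning slot addresses rather than values is what would let a 
relocating collector rewrite a root in place after moving the object 
it points to.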

In LLVM currently, the GCStrategy records "safepoints" which are really 
points at which stack maps need to be remembered.  (i.e. calls and 
actual stop-the-world safepoints)  The GCMetadata mechanism gives a 
generic way to emit the binary encoding of a stack map in a collector 
specific way.  The current stack maps supported by this mechanism only 
allow abstract locations on the stack, which forces all registers to be 
spilled around "safepoints" (i.e. calls and stop-the-world safepoints).  
Also, the set of roots (which are recorded in the stack map) must be 
provided separately using the gcroot intrinsic.

In code:
- GCPoint in llvm/include/llvm/CodeGen/GCMetadata.h describes a request 
for a location with a stack map.  The SafePoints structure in 
GCFunctionInfo contains a list of these locations.
- The OCaml GC is probably the best example of usage.  See 
llvm/lib/CodeGen/AsmPrinter/OcamlGCPrinter.cpp

Note: The summary of existing LLVM details above is based on reading the 
code.  I haven't actually implemented anything which used this mechanism 
yet.  As such, take it with a grain of salt.

In your change, you are adding a mechanism which is intended to enable 
runtime calls and inline cache patching.  (Right?)  Your stack maps seem 
to match the definition of a stack map I gave above and (I believe) the 
implementation currently in LLVM.  The only difference might be that 
your stack maps are partial (i.e. might not contain all "values" which 
are live at a particular PC) and your implementation includes Register 
locations, which the current implementation in LLVM does not.  One other 
possible difference: are you intending to include "values" which aren't 
of pointer type?

Before moving on, am I interpreting your proposal and changes correctly?

Assuming I'm still correct so far, how might we combine these 
implementations?  It looks like your implementation is much more mature 
than what exists in tree at the moment.  One possibility would be to 
express the needed GC stack maps in terms of your new infrastructure.  
(i.e. convert a GCStrategy request for a safepoint into a StackMap (as 
you've implemented it) with the list of explicit GC roots as its 
arguments).  What would you think of this?

p.s. This discussion has gotten sufficiently abstract that it should in 
no way block your plan to submit these changes.  I appreciate your 
willingness to discuss.
>
> If someone wanted to use stack maps for GC, I don’t know why they 
> wouldn’t leverage llvm.stackmap. Maybe Filip can see a problem with 
> this that I can't. The runtime can add GC roots to the stack map just 
> like other live value, and it should know how to interpret the 
> records. The intrinsic doesn’t bake in any particular interpretation 
> of the mapped values.
I think this is a restatement of my last paragraph above, which would 
mean we're actually in agreement.
> That said, my proposal deliberately does not cover GC. I think that 
> stack maps are the easy part of the problem. The hard problem is 
> tracking interior pointers, or for that matter exterior/out-of-bounds 
> or swizzled pointers. LLVM’s machine IR simply doesn’t have the 
> necessary facilities for doing this. But if you don’t need a moving 
> collector, then you don’t need to track derived pointers as long as 
> the roots are kept live. In that case, llvm.stackmap might be a nice 
> optimization over llvm.gcroot.
Oddly enough, I'll be raising the issue of how to go about supporting a 
relocating collector on the list shortly.  We've been looking into this 
independently, but are at the point where we'd like to get feedback from 
others.  :)
>
> Now with regard to patching. I think llvm.patchpoint is generally 
> useful for any type of patching I can imagine. It does look like a 
> call site in IR, and it’s nice to be able to leverage calling 
> conventions to inform the location of arguments.
Agreed.  My concern is mostly about naming and documentation of intended 
usages.  Speaking as someone who's likely to be using this in the very 
near future, I'd like to make sure I understand how you intend it to be 
used.  The last thing I want to do is misconstrue your intent and become 
reliant on a quirk of the implementation you later want to change.

> But the patchpoint does not have to be a call after patching, and you 
> can specify zero arguments to avoid using a calling convention.
Er, not quite true.  Your calling convention also influences what 
registers stay live across the call.  But in general, I see your point.

(Again, this is touching an area of LLVM I'm not particularly familiar 
with.)
> In fact, we only currently emit a call out of convenience. We could 
> splat nops in place and assume the runtime will immediately find and 
> patch all occurrences before the code executes. In the future we may 
> want to handle NULL call target, bypass call emission, and allow the 
> reserved bytes to be less than that required to emit a call.
If you were to do that, how would the implementation be different from 
the new stackmap intrinsic?  Does that difference imply a clarification 
in intended usage or naming?

p.s. The naming discussion has gotten rather abstract and is starting to 
feel like a "what color is the bikeshed" discussion. Feel free to just 
tell me to go away at some point.  :)

Philip
