[LLVMdev] [RFC] Stackmap and Patchpoint Intrinsic Proposal

Tue Oct 22 15:08:22 PDT 2013

On Oct 22, 2013, at 1:48 PM, Philip R <listmail at philipreames.com> wrote:

> On 10/22/13 10:34 AM, Filip Pizlo wrote:
>> On Oct 22, 2013, at 9:53 AM, Philip R <listmail at philipreames.com> wrote:
>> 
>>> On 10/17/13 10:39 PM, Andrew Trick wrote:
>>>> This is a proposal for adding Stackmaps and Patchpoints to LLVM. The
>>>> first client of these features is the JavaScript compiler within the
>>>> open source WebKit project.
>>>> 
>>> I have a couple of comments on your proposal.  None of these are major enough to prevent submission.
>>> 
>>> - As others have said, I'd prefer an experimental namespace rather than a webkit namespace.  (minor)
>>> - Unless I am misreading your proposal, your proposed StackMap intrinsic duplicates existing functionality already in LLVM.  In particular, much of the StackMap construction seems similar to the Safepoint mechanism used by the in-tree GC support. (See CodeGen/GCStrategy.cpp and CodeGen/GCMetadata.cpp).  Have you examined these mechanisms to see if you can share implementations?
>>> - To my knowledge, there is nothing that prevents an LLVM optimization pass from manufacturing new pointers which point inside an existing data structure.  (e.g. an interior pointer to an array when blocking a loop)  Does your StackMap mechanism need to be able to inspect/modify these manufactured temporaries?  If so, I don't see how you could generate an intrinsic which would include this manufactured pointer in the live variable list.  Is there something I'm missing here?
>> These stackmaps have nothing to do with GC.  Interior pointers are a problem unique to precise copying collectors.
> I would argue that while the use of the stack maps might be different, the mechanism is fairly similar.

It's not at all similar.  These stackmaps are only useful for deoptimization, since the only way to make use of the live state information is to patch the stackmap with a jump to a deoptimization off-ramp.  You won't use these for a GC.
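
To make the deoptimization use concrete, here is a rough sketch of what an off-ramp might do with the live state a stackmap describes. It is illustrative only and is not the stackmap format from the proposal: `Location`, `read_location`, `reconstruct_frame`, the register name `r12`, and the stack offset are all hypothetical, and Python is used purely for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical description of where one live value sits in optimized code."""
    kind: str      # "reg", "stack", or "const"
    value: object  # register name, stack slot offset, or the constant itself


def read_location(loc, regs, stack):
    """Fetch a live value from the (simulated) machine state."""
    if loc.kind == "reg":
        return regs[loc.value]
    if loc.kind == "stack":
        return stack[loc.value]
    if loc.kind == "const":
        return loc.value
    raise ValueError(f"unknown location kind: {loc.kind}")


def reconstruct_frame(record, regs, stack):
    """Rebuild baseline (bytecode-level) variables from optimized-code state."""
    return {var: read_location(loc, regs, stack) for var, loc in record.items()}


# One record for one deoptimization point: bytecode variable -> location.
record = {
    "i": Location("reg", "r12"),
    "sum": Location("stack", 16),
    "limit": Location("const", 100),
}
frame = reconstruct_frame(record, regs={"r12": 3}, stack={16: 10})
```

Note that the record lists only the values the baseline engine needs to resume execution. Nothing here enumerates GC roots, and nothing requires unwinding the stack.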

> In general, if the expected semantics are the same, a shared implementation would be desirable.  This is more a suggestion for future refactoring than anything else.

I think that these stackmaps and GC stackmaps are fairly different beasts.  While it's possible to unify the two, this isn't the intent here.  In particular, you can use these stackmaps for deoptimization without having to unwind the stack.

> 
> I agree that interior pointers are primarily a problem for relocating collectors. (Though I disagree with the characterization of it being *uniquely* a problem for such collectors.)  Since I was unaware of what you're using your stackmap mechanism for, I wanted to ask.  Sounds like this is not an intended use case for you.
>> 
>> In particular, the stackmaps in this proposal are likely to be used for capturing only a select subset of state and that subset may fail to include all possible GC roots.  These stackmaps are meant to be used for reconstructing state-in-bytecode (where bytecode = whatever your baseline execution engine is, could be an AST) for performing a deoptimization, if LLVM was used for compiling code that had some type/value/behavior speculations.
> Thanks for the clarification.  This is definitely a useful mechanism.  Thank you for contributing it back.
>> 
>>> - Your patchpoint mechanism appears to be one very specialized use of a patchable location.  Would you mind renaming it to something like patchablecall to reflect this specialization?
>> The top use case will be heap access dispatch inline cache, which is not a call.
>> You can also use it to implement call inline caches, but that's not the only thing you can use it for.
> Er, possibly I'm misunderstanding you.  To me, an inline call cache is a mechanism to optimize a dynamic call by adding a typecheck+directcall fastpath.

Inline caches don't have to be calls.  For example, in JavaScript, the expression "o.f" is fully dynamic but usually does not result in a call.  The inline cache - and hence patchpoint - for such an expression will not have a call in the common case.

Similar things arise in other dynamic languages.  You can have inline caches for arithmetic.  Or for array accesses.  Or for any other dynamic operation in your language.

>  (i.e. avoiding the dynamic dispatch logic in the common case)  I'm assuming this is what you mean by the term "call inline cache", but I have never heard of a "heap access dispatch inline cache".  I've done a Google search and didn't find a definition.  Could you point me to a reference or provide a brief explanation?

Every JavaScript engine uses inline caches for heap accesses.  In the context of JS engines, the term "inline cache" usually implies dispatching on the shape of the object in order to find the offset at which a field is located, rather than dispatching on the class of an object to determine what method to call.
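
For readers who have not seen one, a minimal sketch of a shape-dispatching inline cache for an access like "o.f" follows. This is a toy model under stated assumptions, not how any particular engine or the proposed patchpoint intrinsic implements it: `Shape`, `Obj`, and `GetFieldSite` are hypothetical names, and a real engine patches machine code in place rather than updating Python attributes.

```python
class Shape:
    """Hypothetical hidden class: maps field names to slot offsets."""

    def __init__(self, offsets):
        self.offsets = dict(offsets)


class Obj:
    """An object is a shape pointer plus a flat array of slots."""

    def __init__(self, shape, slots):
        self.shape = shape
        self.slots = list(slots)


class GetFieldSite:
    """One inline cache per access site, e.g. one occurrence of `o.f`."""

    def __init__(self, name):
        self.name = name
        self.cached_shape = None
        self.cached_offset = None
        self.misses = 0

    def get(self, obj):
        # Fast path: a shape check followed by a direct slot load, no call.
        if obj.shape is self.cached_shape:
            return obj.slots[self.cached_offset]
        return self._slow_path(obj)

    def _slow_path(self, obj):
        # Full dynamic lookup, then "patch" the site for the shape just seen.
        self.misses += 1
        offset = obj.shape.offsets[self.name]
        self.cached_shape = obj.shape
        self.cached_offset = offset
        return obj.slots[offset]


point = Shape({"x": 0, "f": 1})
site = GetFieldSite("f")
first = site.get(Obj(point, [1, 42]))   # miss: looks up "f", caches offset 1
second = site.get(Obj(point, [2, 7]))   # hit: same shape, direct slot load
```

The cached state is what a patchpoint would let the runtime rewrite after the code has been compiled: the shape to compare against and the offset to load from.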

-Filip

> 
> Philip
