<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 10/22/13 11:12 PM, Andrew Trick
wrote:<br>
</div>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>I'm moving this to a different thread. I think the newly
proposed</div>
<div>intrinsic definitions and their current implementation are
valuable</div>
<div>regardless of how it gets tied into GC...</div>
</blockquote>
Agreed. As Gaël said, I'm looking forward to being able to play
with this in tree. :)<br>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div><br>
</div>
<div>
<div>On Oct 22, 2013, at 6:24 PM, Philip R &lt;<a
moz-do-not-send="true"
href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Adding Gael as someone who has
previously discussed vmkit topics on the list. Since I'm
assuming this is where the GC support came from, I wanted
to draw this conversation to the attention of someone more
familiar with the LLVM implementation than myself.<br>
<br>
On 10/22/13 4:18 PM, Andrew Trick wrote:<br>
</div>
<blockquote
cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"
type="cite">
On Oct 22, 2013, at 3:08 PM, Filip Pizlo &lt;<a
moz-do-not-send="true" href="mailto:fpizlo@apple.com">fpizlo@apple.com</a>&gt;
wrote:<br>
<div><br>
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal;
font-weight: normal; letter-spacing: normal;
line-height: normal; orphans: auto; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; widows: auto; word-spacing:
0px; -webkit-text-stroke-width: 0px;">
<div>On Oct 22, 2013, at 1:48 PM, Philip R &lt;<a
moz-do-not-send="true"
href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent:
0px; text-transform: none; white-space: normal;
widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px;">On 10/22/13
10:34 AM, Filip Pizlo wrote:<br>
<blockquote type="cite">On Oct 22, 2013, at 9:53
AM, Philip R &lt;<a moz-do-not-send="true"
href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt;
wrote:<br>
<br>
<blockquote type="cite">On 10/17/13 10:39 PM,
Andrew Trick wrote:<br>
<blockquote type="cite">This is a proposal
for adding Stackmaps and Patchpoints to
LLVM. The<br>
first client of these features is the
JavaScript compiler within the<br>
open source WebKit project.<br>
<br>
</blockquote>
I have a couple of comments on your
proposal. None of these are major enough to
prevent submission.<br>
<br>
- As others have said, I'd prefer an
experimental namespace rather than a webkit
namespace. (minor)<br>
- Unless I am misreading your proposal, your
proposed StackMap intrinsic duplicates
existing functionality already in llvm. In
particular, much of the StackMap
construction seems similar to the Safepoint
mechanism used by the in-tree GC support.
(See CodeGen/GCStrategy.cpp and
CodeGen/GCMetadata.cpp). Have you examined
these mechanisms to see if you can share
implementations?<br>
- To my knowledge, there is nothing that
prevents an LLVM optimization pass from
manufacturing new pointers which point
inside an existing data structure. (e.g. an
interior pointer to an array when blocking a
loop) Does your StackMap mechanism need to
be able to inspect/modify these manufactured
temporaries? If so, I don't see how you
could generate an intrinsic which would
include this manufactured pointer in the
live variable list. Is there something I'm
missing here?<br>
</blockquote>
These stackmaps have nothing to do with GC.
Interior pointers are a problem unique to
precise copying collectors.<br>
</blockquote>
I would argue that while the use of the stack
maps might be different, the mechanism is fairly
similar.</div>
</blockquote>
<div><br>
</div>
<div>It's not at all similar. These stackmaps are
only useful for deoptimization, since the only way
to make use of the live state information is to
patch the stackmap with a jump to a deoptimization
off-ramp. You won't use these for a GC.</div>
<br>
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent:
0px; text-transform: none; white-space: normal;
widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px;">In general, if
the expected semantics are the same, a shared
implementation would be desirable. This is more
a suggestion for future refactoring than
anything else.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I think that these stackmaps and GC stackmaps
are fairly different beasts. While it's possible
to unify the two, this isn't the intent here. In
particular, you can use these stackmaps for
deoptimization without having to unwind the stack.</div>
</div>
</blockquote>
</div>
<br>
<div>I think Philip R is asking a good question. To
paraphrase: If we introduce a generically named feature,
shouldn’t it be generically useful? Stack maps are used
in other ways, and there are other kinds of patching. I
agree and I think these are intended to be generically
useful features, but not necessarily sufficient for
every use.</div>
</blockquote>
Thank you for the restatement. You summarized my view
well. <br>
<blockquote
cite="mid:9D0E9F3E-E55E-477E-BBF0-E6E3C668DE7A@apple.com"
type="cite">
<div><br>
</div>
<div>The proposed stack maps are very different from
LLVM’s gcroot because gcroot does not provide stack
maps! llvm.gcroot effectively designates a stack
location for each root for the duration of the current
function, and forces the root to be spilled to the stack
at all call sites (the client needs to disable
StackColoring). This is really the opposite of a stack
map and I’m not aware of any functionality that can be
shared. It also requires a C++ plugin to process the
roots. llvm.stackmap generates data in a section that
MCJIT clients can parse.</div>
</blockquote>
Er, I think we're talking past each other again. Let me lay
out my current understanding of the terminology and existing
infrastructure in LLVM. Please correct me where I go wrong.<br>
<br>
stack map - A mapping from "values" to storage locations.
Storage locations primarily take the form of registers or
stack offsets, but could in principle refer to other
well-known locations (e.g. offsets into thread local state). A
stack map is specific to a particular PC and describes the
state at that instruction only. <br>
<br>
In a precise garbage collector, stack maps are used to
ensure that the stack can be understood by the collector.
When a stop-the-world safepoint is reached, the collector
needs to be able to identify any pointers to heap objects
which may exist on the stack. This explicitly includes both
the frame which actually contains the safepoint and any
caller frames back to the root of the thread. To accomplish
this, a stack map is generated at any call site and a stack
map is generated for the safepoint itself. <br>
<br>
In LLVM currently, the GCStrategy records "safepoints" which
are really points at which stack maps need to be
remembered. (i.e. calls and actual stop-the-world
safepoints) The GCMetadata mechanism gives a generic way to
emit the binary encoding of a stack map in a collector
specific way. The current stack maps supported by this
mechanism only allow abstract locations on the stack, which
forces all registers to be spilled around "safepoints" (i.e.
calls and stop-the-world safepoints). Also, the set of
roots (which are recorded in the stack map) must be provided
separately using the gcroot intrinsic. <br>
<br>
In code:<br>
- GCPoint in llvm/include/llvm/CodeGen/GCMetadata.h
describes a request for a location with a stack map. The
SafePoints structure in GCFunctionInfo contains a list of
these locations.<br>
- The OCaml GC is probably the best example of usage. See
llvm/lib/CodeGen/AsmPrinter/OcamlGCPrinter.cpp<br>
<br>
Note: The summary of existing LLVM details above is based on
reading the code. I haven't actually implemented anything
which used this mechanism yet. As such, take it with a
grain of salt. <br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div>That's an excellent description of stack maps,
GCStrategy, and</div>
<div>safepoints. Now let me explain how I see it.</div>
<div><br>
</div>
<div>GCStrategy provides layers of abstraction that allow
plugins to</div>
<div>specialize GC metadata. Conceptually, a plugin can
generate what looks</div>
<div>like stack map data to the collector. But there isn't any
direct</div>
<div>support in LLVM IR for the kind of stack maps that we
need.</div>
<div><br>
</div>
<div>When I talk about adding stack map support, I'm really
talking about</div>
<div>support for mapping values to registers, where the set of
values and</div>
<div>their locations are specific to the "safepoint".</div>
<div><br>
</div>
<div>We're adding an underlying implementation of
per-safepoint live</div>
<div>values. There isn't a lot of abstraction built up around
it. Just a</div>
<div>couple of intrinsics that directly expose the
functionality.</div>
<div><br>
</div>
<div>We're also approaching the interface very differently.
We're enabling</div>
<div>an MCJIT client. The interface to the client is the stack
map format.</div>
</div>
</div>
</blockquote>
For the record, I actually prefer your approach to the interface.
:)<br>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div>
<div><br>
</div>
<div><br>
</div>
</div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> In your change, you are
adding a mechanism which is intended to enable runtime calls
and inline cache patching. (Right?) Your stack maps seem
to match the definition of a stack map I gave above and (I
believe) the implementation currently in LLVM. The only
difference might be that your stack maps are partial (i.e.
might not contain all "values" which are live at a
particular PC) and your implementation includes Register
locations which the current implementation in LLVM does
not. One other possible difference: are you intending to
include "values" which aren't of pointer type? <br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div>Yes, the values will be of various types (although only
32/64 bit</div>
<div>types are currently allowed because of DWARF register
number</div>
<div>weirdness). More importantly, our stack maps record
locations of a</div>
<div>specific set of values, which may be in registers, at a
specific</div>
<div>location. </div>
</div>
</div>
</blockquote>
The fact that you're interested in more than information about which
locations contain pointers into the heap is the key point here.
Your stack map is actually slightly more general than the form used
by a garbage collector. For example, your mechanism allows you to
describe where the iteration variable ("int i") in a loop lives.
This is not something a stack map (in the sense I've been using it
to refer to GC usage) would enable.<br>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div>
<div>In fact, that, along with reserving space for code
patching,</div>
<div>is *all* we're doing. GCRoot doesn't do this at all. So
there is</div>
<div>effectively no overlap in implementation.</div>
</div>
</div>
</blockquote>
I've actually come around to agreeing with you. I think you're slightly
misunderstanding the role of "safepoints" and "gcroot" in the
current implementation, but the fact that your mechanism is
significantly more general than standard GC stackmaps (which LLVM
implements currently) is a strong reason for having them as a
separate implementation. <br>
<br>
(For performance reasons, a GC framework would probably want to use
more concise stack maps which only encode pointer roots.)<br>
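<br>
As a rough illustration of that difference, here is a minimal Python
sketch of a general stack map next to the pointer-only form a collector
would want. All of the names (Location, LiveValue, StackMap,
gc_roots_only) are hypothetical; none of this corresponds to the
proposed intrinsics or to any format LLVM emits.<br>
<pre>
```python
from dataclasses import dataclass

# Hypothetical model for illustration only; not LLVM's stack map format.


@dataclass(frozen=True)
class Location:
    kind: str   # "register" or "stack"
    where: int  # register number, or byte offset from the frame base


@dataclass(frozen=True)
class LiveValue:
    name: str
    is_heap_pointer: bool
    location: Location


@dataclass(frozen=True)
class StackMap:
    pc_offset: int  # the single instruction this map describes
    values: tuple   # tuple of LiveValue entries


def gc_roots_only(stack_map):
    """Keep only the entries a precise collector would need to visit."""
    roots = tuple(v for v in stack_map.values if v.is_heap_pointer)
    return StackMap(stack_map.pc_offset, roots)


# A general map: one heap pointer on the stack, one loop counter in a register.
general = StackMap(
    pc_offset=0x40,
    values=(
        LiveValue("array", True, Location("stack", 16)),
        LiveValue("i", False, Location("register", 3)),
    ),
)
concise = gc_roots_only(general)
print([v.name for v in concise.values])
```
</pre>
The general map can describe the loop counter "i" living in a register,
while the concise form keeps only the heap pointer "array".<br>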
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div><br>
</div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> <br>
Before moving on, am I interpreting your proposal and
changes correctly?<br>
</div>
</blockquote>
<div><br>
</div>
<div>Yes, except I don’t see a direct connection between the
functionality we’re</div>
<div>adding and “the implementation currently in LLVM”.</div>
<br>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> Assuming I'm still
correct so far, how might we combine these implementations?
It looks like your implementation is much more mature than
what exists in tree at the moment. One possibility would be
to express the needed GC stack maps in terms of your new
infrastructure. (i.e. convert a GCStrategy request for a
safepoint into a StackMap (as you've implemented it) with
the list of explicit GC roots as its arguments). What
would you think of this? <br>
</div>
</blockquote>
</div>
<br>
<div>
<div>
<div>I can imagine someone wanting to leverage some of the new</div>
<div>implementation without using it end-to-end as-is.
Although I'm not</div>
<div>entirely sure what the motivation would be. For example:</div>
<div><br>
</div>
<div>- A CodeGenPrepare pass could insert llvm.safepoint or
llvm.patchpoint</div>
<div> calls at custom safepoints after determining GC root
liveness at</div>
<div> those points.</div>
</div>
</div>
</blockquote>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div>
<div><br>
</div>
<div>- Something like a GCStrategy could intercept our
implementation of</div>
<div> stack map generation and emit a custom format. Keep in
mind though</div>
<div> that the format that LLVM emits does not need to be the
format read</div>
<div> by the collector. The JIT/runtime can parse LLVM's
stack map data</div>
<div> and encode it using its own data structures. That way,
the</div>
<div> JIT/runtime can change without customizing LLVM.</div>
</div>
</div>
</blockquote>
I think this is a very good point. Alternatively, you could frame
your encoding as being the default representation provided by LLVM
and provide a plugin mechanism to modify it. (Not proposing this
should actually be done at the moment. This would be on demand
only.)<br>
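<br>
To sketch the first option (the runtime re-encoding the default data
into its own structures), something like the following Python would do.
The record shape, the reencode function, and SLOT_SIZE are all
assumptions made up for the example; this is not the layout LLVM emits,
only the general idea of translating parsed records into a
runtime-private table.<br>
<pre>
```python
# Hypothetical input: (pc_offset, [(kind, where), ...]) records, as a runtime
# might hold them after parsing the compiler's default stack map data.
# This is not LLVM's actual section layout.

SLOT_SIZE = 8  # assumed pointer size in bytes


def reencode(code_base, records):
    """Build the runtime's own table: absolute PC to (stack slots, registers)."""
    table = {}
    for pc_offset, locations in records:
        slots = []
        registers = []
        for kind, where in locations:
            if kind == "stack":
                slots.append(where // SLOT_SIZE)  # byte offset to slot index
            elif kind == "register":
                registers.append(where)
            else:
                raise ValueError("unknown location kind: " + kind)
        key = code_base + pc_offset
        table[key] = (tuple(sorted(slots)), tuple(sorted(registers)))
    return table


parsed = [
    (0x10, [("stack", 16), ("stack", 8)]),
    (0x2C, [("register", 3), ("stack", 24)]),
]
table = reencode(0x1000, parsed)
print(table[0x1010])
print(table[0x102C])
```
</pre>
With this split, the collector only ever sees the table, so the runtime
can change its own encoding without touching the compiler.<br>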
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div>
<div><br>
</div>
</div>
<div>
<div>As far as hooking the new stack map support into the
GCMetadata</div>
<div>abstraction, I'm not sure how that would work.
GCMachineCodeAnalysis</div>
<div>is currently a standalone MI pass. We can't generate our
stack maps</div>
<div>here. Technically, a preEmitPass can come along later and
reassign</div>
<div>registers invalidating the stack map. That's why we
generate the maps</div>
<div>during MC lowering.</div>
</div>
</div>
</blockquote>
I agree. I think this is actually a problem with the existing
implementation as well. It gets around it by (I believe) forcing
all roots to the stack when a stack map is needed. <br>
<blockquote
cite="mid:E6B326E6-0532-4F26-AA0D-A9113476C79C@apple.com"
type="cite">
<div>
<div>
<div><br>
</div>
<div>So, currently, the new intrinsics are serving a different
purpose than</div>
<div>GCMetadata. I think someone working on GC support needs
to be</div>
<div>convinced that they really need the new stack map
features. Then we</div>
<div>can build something on top of the underlying
functionality that works</div>
<div>for them.</div>
</div>
<div><br>
</div>
</div>
<div>-Andy</div>
</blockquote>
Just to note, that person working on the GC support is very likely
to be me (or one of my coworkers) in the near future. That's why
I've been so interested in your changes. :)<br>
<br>
As background, we're investigating using LLVM as a JIT compiler for a VM
which uses a precise relocating collector. The existing collector
support appears problematic with regard to a relocating collector
and we're investigating approaches to enhance it. My coworker will
be opening another thread on that topic in the next few days.<br>
<br>
Philip<br>
<br>
</body>
</html>