<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div><br></div><div><br>On Oct 13, 2014, at 6:49 PM, Philip Reames <<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>> wrote:<br><br></div><blockquote type="cite"><div>
<br>
<div class="moz-cite-prefix">On 10/13/2014 06:17 PM, Filip Pizlo
wrote:<br>
</div>
<blockquote cite="mid:1BFF2ADA-E406-43CF-A09E-DF280C91F79A@apple.com" type="cite">
<div><br>
</div>
<div><br>
On Oct 13, 2014, at 4:07 PM, Philip Reames <<a moz-do-not-send="true" href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>>
wrote:<br>
<br>
</div>
<blockquote type="cite">
<div>
<br>
<div class="moz-cite-prefix">On 10/13/2014 03:23 PM, Kevin
Modzelewski wrote:<br>
</div>
<blockquote cite="mid:CAO=oM6skx2WWW8=659D=DJwwnQHH0csur02D_WwyKx5HUqWeug@mail.gmail.com" type="cite">
<div dir="ltr">With the patchpoint infrastructure, shouldn't
it now be relatively straightforward to do an
accurate-but-non-relocatable scan of the stack, by
attaching all the GC roots as stackmap arguments to
patchpoints? This is something we're currently working on
for Pyston (ie we don't have it working yet), but I think
we might get it "for free" once we finish the work on
frame introspection.</div>
</blockquote>
Take a look at the statepoint intrinsics up for review. These
are essentially that, with two extensions:<br>
- A semantic distinction between gc roots and deopt state
(since you may want both)<br>
- Support for explicit relocation of the gc root values (this
could be made optional, but is currently not)<br>
<br>
Though, you really don't want to emit these in your frontend.
You can, it'll work, but the performance will suffer. Doing
so will prevent many useful optimizations from running. </div>
</blockquote>
<div><br>
</div>
<div>You really should be specific here. The optimizations you're
thinking of may be uninteresting to many clients. <br>
</div>
</blockquote>
Assuming you have a VM which needs safepoints to occur at some fixed
interval, you need to put a safepoint poll in *all* loops. (Well,
unless you can prove either a) the loop is bounded or b) there's
another safepoint in the loop.) Doing so, you introduce a call into
the loop (using either approach). This breaks loop recognition,
complicates alias analysis and thus LICM, and is otherwise bad for
the optimizer.<br></div></blockquote><div><br></div><div>A multithreaded high-throughput VM will have deopt safe points in any loop that isn't proven to terminate in a timely fashion. Those safe points will take all live state, and this will be a superset of your GC pointers. </div><br><blockquote type="cite"><div>
<br>
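For readers who have not seen a loop poll before, the following is a minimal Python sketch of the idea discussed above. It is not LLVM's statepoint or patchpoint API, and every name in it (SafepointState, safepoint_poll, mutator_loop, stop_the_world) is hypothetical. It only shows the shape of the problem: a loop with no provable bound carries a call on every iteration, and the VM can park the thread at that call.<br>
<pre>
```python
import threading


class SafepointState:
    def __init__(self):
        self.requested = threading.Event()  # VM wants the mutator stopped
        self.reached = threading.Event()    # mutator has parked at a poll
        self.resume = threading.Event()     # VM lets the mutator continue
        self.stop = threading.Event()       # ends the otherwise unbounded loop


def safepoint_poll(state):
    # Fast path: one flag check. Slow path: park until the VM resumes us.
    if state.requested.is_set():
        state.reached.set()
        state.resume.wait()


def mutator_loop(state, counter):
    # No provable bound on this loop, so it polls on every iteration.
    while not state.stop.is_set():
        counter[0] += 1
        safepoint_poll(state)


def stop_the_world(state, action, timeout=5.0):
    # Request a safepoint, run `action` while the mutator is parked,
    # then let the mutator continue.
    state.resume.clear()
    state.reached.clear()
    state.requested.set()
    parked = state.reached.wait(timeout)
    result = action() if parked else None
    state.requested.clear()
    state.resume.set()
    return parked, result
```
</pre>
A driver would run mutator_loop on its own thread, call stop_the_world with whatever work needs a quiescent mutator (a GC or a deopt, say), and finally set state.stop. The call to safepoint_poll inside the loop body is the piece that gets in the optimizer's way when it is emitted early.<br>
<br>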
<blockquote cite="mid:1BFF2ADA-E406-43CF-A09E-DF280C91F79A@apple.com" type="cite">
<div>Also you won't lose any performance if your GC pointers are
also needed for deopt (which happens to be the common case).</div>
</blockquote>
<br>
<blockquote cite="mid:1BFF2ADA-E406-43CF-A09E-DF280C91F79A@apple.com" type="cite">
<div>I really do think that this whole discussion is tragicomic.
Most clients of LLVM would be best served with mostly copying
GC. <br>
</div>
</blockquote>
I believe LLVM should not take a position in this debate and should
try to support all collectors. <br></div></blockquote><div><br></div><div>It's good to encourage people to use state-of-the-art, easy-to-implement techniques rather than unnecessarily complicated ones. </div><div><br></div><div>-Filip</div><br><blockquote type="cite"><div>
<blockquote cite="mid:1BFF2ADA-E406-43CF-A09E-DF280C91F79A@apple.com" type="cite">
<div><br>
</div>
<div>-Filip</div>
<div><br>
</div>
<br>
<blockquote type="cite">
<div> Instead, you probably want to consider something like the
late safepoint placement approach we've been pushing.
Hopefully, once the statepoint stuff lands, we can get that
upstreamed fairly soon. <br>
<br>
Philip<br>
<br>
<blockquote cite="mid:CAO=oM6skx2WWW8=659D=DJwwnQHH0csur02D_WwyKx5HUqWeug@mail.gmail.com" type="cite">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sat, Oct 11, 2014 at 11:37 PM,
Filip Pizlo <span dir="ltr"><<a moz-do-not-send="true" href="mailto:fpizlo@apple.com" target="_blank">fpizlo@apple.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
<br>
> On Oct 10, 2014, at 6:24 PM, Hayden Livingston
<<a moz-do-not-send="true" href="mailto:halivingston@gmail.com">halivingston@gmail.com</a>>
wrote:<br>
><br>
> Hello,<br>
><br>
> I was wondering if there is an example list
somewhere of whole program optimizations done by
LLVM based compilers?<br>
><br>
> I'm only familiar with method-level
optimizations, and I'm being told wpo can deliver
many great speedups.<br>
><br>
> My language is currently statically typed JIT
based and uses the JVM, and I want to move it over
to LLVM so that I can have options where it can be
ahead of time compiled as well.<br>
<br>
</span>As Philip kindly pointed out, WebKit uses llvm
as part of a JavaScript JIT optimization pipeline. It
works well for WebKit, but this was a large amount of
work. It may not be the path of least resistance
depending on what your requirements are.<br>
<span class=""><br>
><br>
> I'm hearing bad things about LLVM's JIT
capabilities -- specifically that writing your own
GC is going to be a pain.<br>
<br>
</span>This is a fun topic and you'll probably get
some good advice. :-)<br>
<br>
Here's my take. GC in llvm is only a pain if you make
the tragic mistake of writing an accurate-on-the-stack
GC. Accurate collectors are only known to be
beneficial in niche environments, usually if you have
an aversion to probabilistic algorithms. You might
also be stuck requiring accuracy if your system relies
on being able to force *every* object to *immediately*
move to a new location, but this is an uncommon
requirement - usually it happens due to certain
speculative optimization strategies in dynamic
languages.<br>
<br>
My approach is to use a Bartlett-style mostly-copying
collector. If you use a Bartlett-style collector then
you don't need any special support in llvm. It just
works, it allows llvm to register-allocate pointers at
will, and it lends itself naturally to high-throughput
collector algorithms. Bartlett-style collectors come
in many shapes and sizes - copying or not, mark-region
or not, generational or not, and even a fancy
concurrent copying example exists.<br>
<br>
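To make the idea concrete, here is a toy Python sketch of one mostly-copying collection. It is not WebKit's collector and not Bartlett's exact algorithm. The names (Heap, Obj, collect, PAGE_SIZE) are invented for illustration, small integers stand in for machine addresses, and only reachable objects on pinned pages are retained, which is a simplification.<br>
<pre>
```python
PAGE_SIZE = 4  # toy value: addresses per page


class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []    # exact references to other heap objects
        self.addr = None  # toy address, assigned by Heap.alloc


class Heap:
    def __init__(self):
        self.objects = {}  # address -> Obj
        self.next_addr = 0

    def alloc(self, name):
        obj = Obj(name)
        obj.addr = self.next_addr
        self.objects[obj.addr] = obj
        self.next_addr += 1
        return obj


def collect(heap, stack_words):
    # Ambiguous roots: a stack word that equals a live address is assumed
    # to be a pointer, and the page it points into is pinned.
    roots = [heap.objects[w] for w in stack_words if w in heap.objects]
    pinned_pages = {obj.addr // PAGE_SIZE for obj in roots}

    # Trace from the roots. References stored inside heap objects are exact.
    reachable, seen, worklist = [], set(), list(roots)
    while worklist:
        obj = worklist.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        reachable.append(obj)
        worklist.extend(obj.refs)

    # Objects on pinned pages stay put; everything else reachable is copied
    # to fresh pages. Python references stand in for updating heap pointers.
    to_space = (heap.next_addr // PAGE_SIZE + 1) * PAGE_SIZE
    survivors = {}
    for obj in sorted(reachable, key=lambda o: o.addr):
        if obj.addr // PAGE_SIZE not in pinned_pages:
            obj.addr = to_space
            to_space += 1
        survivors[obj.addr] = obj
    heap.objects = survivors
    heap.next_addr = to_space
    return pinned_pages


heap = Heap()
objs = [heap.alloc(n) for n in "abcdef"]  # addresses 0..5 on pages 0 and 1
objs[0].refs.append(objs[4])              # a -> e
collect(heap, stack_words=[0, 12345])     # 0 looks like a pointer to a
print(objs[0].addr, objs[4].addr)         # prints "0 8": a is pinned, e moved
```
</pre>
The collector only ever reads the stack as raw words, so it needs no stack maps and places no constraints on how the compiler allocates or spills pointers.<br>
<br>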
WebKit uses a Bartlett-style parallel generational
sticky-mark copying collector with opportunistic
mark-region optimizations. We haven't written up
anything about it yet but it is all open source.<br>
<br>
Hosking's paper about the concurrent variant is here:
<a moz-do-not-send="true" href="http://dl.acm.org/citation.cfm?doid=1133956.1133963" target="_blank">http://dl.acm.org/citation.cfm?doid=1133956.1133963</a><br>
<br>
I highly recommend reading Bartlett's original paper
about conservative copying; it provides an excellent
semi-space algorithm that would be a respectable
starting point for any VM. You won't regret
implementing it - it'll simplify your interface to any
JIT, not just llvm. It'll also make FFI easy because
it allows the C stack to refer directly to GC objects
without any shenanigans.<br>
<br>
Bartlett is probabilistic in the sense that it may,
with low probability, increase object drag. This
happens rarely. On 64-bit systems it's especially
rare. It's been pretty well demonstrated that Bartlett
collectors are as fast as accurate ones, insofar as
anything in GC land can be demonstrated (as in it's
still a topic of lively debate, though I had some
papers back in the day that showed some comparisons).
WebKit often wins GC benchmarks for example, and we
particularly like that our GC never imposes
limitations on llvm optimizations. It's really great
to be able to view the compiler and the collector as
orthogonal components!<br>
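<br>
A back-of-the-envelope way to see why 64-bit helps: under the crude assumption that a non-pointer stack word is uniformly random (real stack contents are not, so treat this only as intuition), the chance that it lands inside the heap is the heap size divided by the size of the address space. The function name and the 1 GiB heap below are made up for the example.<br>
<pre>
```python
def false_pointer_probability(heap_bytes, address_bits):
    # Chance that one uniformly random word falls inside a heap of this size.
    return heap_bytes / 2 ** address_bits


GIB = 2 ** 30
p32 = false_pointer_probability(GIB, 32)
p64 = false_pointer_probability(GIB, 64)
print(p32)  # 0.25
print(p64)  # 2 ** -34, about 6e-11
```
</pre>
Under this model, the same heap goes from a one-in-four chance per stray word on 32-bit to roughly one in seventeen billion on 64-bit.<br>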
<span class="im HOEnZb"><br>
><br>
> Anyways, sort of diverged there, but still
looking for WPO examples!<br>
><br>
> Hayden.<br>
</span>
<div class="HOEnZb">
<div class="h5">>
_______________________________________________<br>
> LLVM Developers mailing list<br>
> <a moz-do-not-send="true" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>
<a moz-do-not-send="true" href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
> <a moz-do-not-send="true" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div>
</blockquote>
</blockquote>
<br>
</div></blockquote></body></html>