<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div><br></div><div><br>On Oct 13, 2014, at 4:07 PM, Philip Reames <<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>> wrote:<br><br></div><blockquote type="cite"><div>
<br>
<div class="moz-cite-prefix">On 10/13/2014 03:23 PM, Kevin
Modzelewski wrote:<br>
</div>
<blockquote cite="mid:CAO=oM6skx2WWW8=659D=DJwwnQHH0csur02D_WwyKx5HUqWeug@mail.gmail.com" type="cite">
<div dir="ltr">With the patchpoint infrastructure, shouldn't it
now be relatively straightforward to do an
accurate-but-non-relocatable scan of the stack, by attaching all
the GC roots as stackmap arguments to patchpoints? This is
something we're currently working on for Pyston (ie we don't
have it working yet), but I think we might get it "for free"
once we finish the work on frame introspection.</div>
</blockquote>
Take a look at the statepoint intrinsics up for review. These are
essentially exactly that, with two extensions:<br>
- A semantic distinction between gc roots and deopt state (since you
may want both)<br>
- Support for explicit relocation of the gc root values (this could
be made optional, but is currently not)<br>
<br>
Though, you really don't want to emit these in your frontend. You
can, it'll work, but the performance will suffer. Doing so will
prevent many useful optimizations from running. </div></blockquote><div><br></div><div>You really should be specific here. The optimizations you're thinking of may be uninteresting to many clients. </div><div><br></div><div>Also you won't lose any performance if your GC pointers are also needed for deopt (which happens to be the common case). </div><div><br></div><div>I really do think that this whole discussion is tragicomic. Most clients of LLVM would be best served with mostly copying GC. </div><div><br></div><div>-Filip</div><div><br></div><br><blockquote type="cite"><div> Instead, you
probably want to consider something like the late safepoint
placement approach we've been pushing. Hopefully, once the
statepoint stuff lands, we can get that upstreamed fairly soon. <br>
<br>
Philip<br>
<br>
<blockquote cite="mid:CAO=oM6skx2WWW8=659D=DJwwnQHH0csur02D_WwyKx5HUqWeug@mail.gmail.com" type="cite">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sat, Oct 11, 2014 at 11:37 PM, Filip
Pizlo <span dir="ltr"><<a moz-do-not-send="true" href="mailto:fpizlo@apple.com" target="_blank">fpizlo@apple.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
<br>
> On Oct 10, 2014, at 6:24 PM, Hayden Livingston <<a moz-do-not-send="true" href="mailto:halivingston@gmail.com">halivingston@gmail.com</a>>
wrote:<br>
><br>
> Hello,<br>
><br>
> I was wondering if there is an example list somewhere
of whole program optimizations done by LLVM based
compilers?<br>
><br>
> I'm only familiar with method-level optimizations,
and I'm told WPO can deliver great speedups.<br>
><br>
> My language is currently statically typed, JIT-based, and
runs on the JVM. I want to move it over to LLVM so that I
also have the option of compiling it ahead of
time.<br>
<br>
</span>As Philip kindly pointed out, WebKit uses llvm as
part of a JavaScript JIT optimization pipeline. It works
well for WebKit, but this was a large amount of work. It may
not be the path of least resistance depending on what your
requirements are.<br>
<span class=""><br>
><br>
> I'm hearing bad things about LLVM's JIT capabilities
-- specifically that writing your own GC is going to be a
pain.<br>
<br>
</span>This is a fun topic and you'll probably get some good
advice. :-)<br>
<br>
Here's my take. GC in llvm is only a pain if you make the
tragic mistake of writing an accurate-on-the-stack GC.
Accurate collectors are only known to be beneficial in niche
environments, usually if you have an aversion to
probabilistic algorithms. You might also be stuck requiring
accuracy if your system relies on being able to force
*every* object to *immediately* move to a new location, but
this is an uncommon requirement - usually it happens due to
certain speculative optimization strategies in dynamic
languages.<br>
<br>
My approach is to use a Bartlett-style mostly-copying
collector. If you use a Bartlett-style collector then you
don't need any special support in llvm. It just works, it
allows llvm to register-allocate pointers at will, and it
lends itself naturally to high-throughput collector
algorithms. Bartlett-style collectors come in many shapes
and sizes - copying or not, mark-region or not, generational
or not, and even a fancy concurrent copying example exists.<br>
<br>
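A minimal sketch of the conservative root scan behind a
Bartlett-style collector, for illustration only: every stack word
is treated as an ambiguous root, and any page it could point into
is pinned so that objects on it are never moved. The Heap class,
the pin_pages_from_stack helper, and the addresses are
hypothetical; a real collector would go on to trace from the
pinned pages and copy the remaining live objects.<br>
<br>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heap:
    """Hypothetical heap: one contiguous address range split into equal pages."""
    base: int        # address of the first page
    page_size: int   # bytes per page
    page_count: int  # number of pages


def pin_pages_from_stack(heap, stack_words):
    """Return the set of page indices referenced by any stack word.

    Every word is an ambiguous root: it might be a pointer or just an integer
    that happens to look like one, so its page is pinned rather than moved.
    """
    heap_range = range(heap.base, heap.base + heap.page_size * heap.page_count)
    pinned = set()
    for word in stack_words:
        if word in heap_range:  # could be a pointer into the heap
            pinned.add((word - heap.base) // heap.page_size)
    return pinned


heap = Heap(base=0x10000, page_size=4096, page_count=4)
pinned = pin_pages_from_stack(heap, [0x10008, 0x12FFF, 42, 0x14000])
```
</pre>
With these made-up values, pages 0 and 2 end up pinned, while 42
and 0x14000 fall outside the heap and are ignored. No cooperation
from the compiler is needed, since the scan never asks which words
are really pointers.<br>
<br>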
WebKit uses a Bartlett-style parallel generational
sticky-mark copying collector with opportunistic mark-region
optimizations. We haven't written up anything about it yet
but it is all open source.<br>
<br>
Hosking's paper about the concurrent variant is here: <a moz-do-not-send="true" href="http://dl.acm.org/citation.cfm?doid=1133956.1133963" target="_blank">http://dl.acm.org/citation.cfm?doid=1133956.1133963</a><br>
<br>
I highly recommend reading Bartlett's original paper about
conservative copying; it provides an excellent semispace
algorithm that would be a respectable starting point for any
VM. You won't regret implementing it - it'll simplify your
interface to any JIT, not just llvm. It'll also make FFI
easy because it allows the C stack to refer directly to GC
objects without any shenanigans.<br>
<br>
Bartlett is probabilistic in the sense that it may, with low
probability, increase object drag. This happens rarely. On
64-bit systems it's especially rare. It's been pretty well
demonstrated that Bartlett collectors are as fast as
accurate ones, insofar as anything in GC land can be
demonstrated (as in it's still a topic of lively debate,
though I had some papers back in the day that showed some
comparisons). WebKit often wins GC benchmarks for example,
and we particularly like that our GC never imposes
limitations on llvm optimizations. It's really great to be
able to view the compiler and the collector as orthogonal
components!<br>
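<br>
A rough back-of-the-envelope sketch of why wider addresses make
accidental retention rarer. It assumes, purely for illustration,
that a non-pointer stack word is uniformly random over the address
width; real stack contents are not, so this conveys intuition
rather than a measurement. The helper name and the 1 GiB heap size
are hypothetical.<br>
<br>
<pre>
```python
from fractions import Fraction


def false_pointer_chance(heap_bytes, address_bits):
    """Chance that one uniformly random word lands inside the heap.

    Assumption for illustration only: real stack words are not uniformly
    random, so this is a rough intuition, not a measurement.
    """
    return Fraction(heap_bytes, 2 ** address_bits)


one_gib = 2 ** 30
chance_32 = false_pointer_chance(one_gib, 32)
chance_64 = false_pointer_chance(one_gib, 64)
```
</pre>
Under that assumption, a 1 GiB heap gives a chance of 1/4 per word
with 32-bit words and 1/2**34 per word with 64-bit words.<br>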
<span class="im HOEnZb"><br>
><br>
> Anyways, sort of diverged there, but still looking
for WPO examples!<br>
><br>
> Hayden.<br>
</span>
<div class="HOEnZb">
<div class="h5">>
_______________________________________________<br>
> LLVM Developers mailing list<br>
> <a moz-do-not-send="true" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>
<a moz-do-not-send="true" href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
> <a moz-do-not-send="true" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div></blockquote></body></html>