[LLVMdev] Future plans for GC in LLVM

Tue Dec 9 03:12:41 PST 2014

Hello Philip,

I am an active user of LLVM's GC infrastructure. Here are some notes on how
I currently use gcroot():

1) I use several IRs. A high-level (CPS/SSA hybrid) IR is the target of
inlining and other such high-level optimization; most allocation is
implicit in the high level IR. Closure conversion/lambda-lifting produces a
variant of that IR, extended with explicit allocation constructs. A
separate pass introduces the allocations themselves.
2) Next, a dataflow-based root insertion pass inserts the minimal(ish) set
of roots and root reloads to preserve the invariant that every live GCable
pointer is in a root at each potential GC point. The root insertion pass
also computes liveness and uses it to minimize the set of roots via slot
coloring. The result of root insertion is (very close to) LLVM IR.
3) Potential GC points are determined inter-procedurally, and call sites
which are statically known to (transitively) not GC do not break GCable
pointer live ranges.
4) I use a lightly-modified variant of the OCaml plugin's stackmap format.
5) I don't currently use load or store barriers, but do eventually plan to.
6) I do support GCing through global variables.
7) I wrote a custom LLVM pass to verify that values loaded from GC roots
aren't used across GC points.

When I first started using gcroot(), I used the naïve approach of reloading
roots after every call, and encountered significant overhead. The
optimizations sketched above significantly reduced that overhead.
Unfortunately, I don't know of any meaningful language-independent
benchmark suites for measuring that sort of overhead, and of course the
overhead on any given program will strongly depend on details of how that
program is written...

Also, FWIW when I first started, I got up and running with the shadow
stack, just as Gordon described, before transitioning to the "real"
infrastructure. As long as it doesn't carry a significant burden on the
implementation side, I think it's worth having, because it significantly
improves the learning curve for new users.

OK, direct answers to some of your questions:

* I think custom stack map formats have small but non-zero value. There
have been a few papers (most in the early 90's, I think) which showed that
stack maps can make up a non-trivial fraction of a binary's size, and thus
are increasingly desirable to optimize as programs grow larger. My verdict:
if support for custom formats are ever actively impeding forward progress,
toss 'em; otherwise, there should (eventually) be a more detailed look at
the costs and benefits.

* As above, I think the primary benefit of one-stackmap-per-safepoint is
saving space at the cost of time (and root traffic). AFAIK the primary
virtue of the one-stack-slot-per-gcroot() implementation is implementation
simplicity, nothing more.

* The ultimate strength & weakness of gcroot() is that basically everything
is left to the frontend. There's very little unavoidable overhead imposed
on a frontend that is willing to go the distance to generate non-naïve code
-- any knowledge that the frontend has can be used to generate better code.
Unfortunately, this also places a rather heavy burden on the frontend to
generate valid & efficient code.

Since I haven't had the chance to look beyond mailing list & blog posts on
statepoints, I can't comment much on their tradeoffs vs gcroots. The
high-level impression I get is that statepoints will allow a
less-sophisticated frontend to get better results than naïve usage of
gcroot(). It also looks like statepoints will do a better job of steering
frontends away from the pitfalls of using GC-invalidated pointer values.  I
have no idea whether there will be any codegen benefit for more
sophisticated frontends, but I'll be happy to port my implementation to
statepoints once they've gotten a chance to settle down somewhat, and
provide more detailed feedback to help determine what to eventually do with
gcroot(). I can currently only do ad-hoc benchmarking, but hopefully by the
time gcroot() is on the chopping block, I'll have a more extensive
automated performance regression test suite ;-)

One quick question: I think I understand why address spaces are needed to
distinguish GCable pointers for late safepoint placement, but are they also
needed if the frontend is inserting all safepoints itself?

Finally, thank you for taking on the burden of improving LLVM's GC
functionality!

On Thu, Dec 4, 2014 at 8:50 PM, Philip Reames <listmail at philipreames.com>
wrote:

> Now that the statepoint changes have landed, I wanted to start a
> discussion about what's next for GC support in LLVM.  I'm going to sketch
> out a strawman proposal, but I'm not set on any of this.  I mostly just
> want to draw interested parties out of the woodwork.  :)
>
> Overall Direction:
> In the short term, my intent is to preserve the functionality of the
> existing code, but migrate towards a position where the gcroot specific
> pieces are optional and well separated.  I also plan to start updating the
> documentation to reflect a separation between the general support for
> garbage collection (function attributes, identifying references, load and
> store barrier lowering, generating stack maps) and the implementation
> choices (gcroot & it's lowering vs statepoints & addr spaces for
> identifying references).
>
> Longer term, I plan to *EVENTUALLY DELETE* the existing gcroot lowering
> code and in tree GCStrategies unless an interesting party speaks up.  I
> have no problem with retaining some of the existing pieces for legacy
> support or helping users to migrate, but as of right now, I don't know of
> any such active users.  The only exception to this might be the shadow
> stack GC.  Eventually in this context is at least six months from now, but
> likely less than 18 months.  Hopefully, that's vague enough.  :)
>
> HELP - If anyone knows which Ocaml implementation and which Erlang
> implementation triggered the in tree GC strategies, please let me know!
>
>
> Near Term Changes:
> - Migrate ownership of GCStrategy objects from GCModuleInfo to
> LLVMContext.  In theory, this looses the ability for two different Modules
> to have the same collector with different state, but I know of no use case
> for this.
> - Modify the primary Function::getGC/setGC interface to return a reference
> the GCStrategy object, not a string.  I will provide a
> Function::setGCString and getGCString.
> - Extend the GCStrategy class to include a notion of which compilation
> strategy is being used.  The two choices right now will be Legacy and
> Statepoint.  (Longer term, this will likely become a more fine grained
> choice.)
> - Separate GCStategy and related pieces from the
> GCFunctionInfo/GCModuleInfo/GCMetadataPrinter lowering code.  At first,
> this will simply mean clarifying documentation and rearranging code a bit.
> - Document/clarify the callbacks used to customize the lowering. Decide
> which of these make sense to preserve and document.
>
> (Lest anyone get the wrong idea, the above changes are intended to be
> minor cleanup.  I'm not looking to do anything controversial yet.)
>
> Questions:
> - Is proving the ability to generate a custom binary stack map format a
> valuable feature?  Adapting the new statepoint infrastructure to work with
> the existing GCMetadataPrinter classes wouldn't be particularly hard.
> - Are there any GCs out there that need gcroot's single stack slot per
> value implementation?   By default, statepoints may generate a different
> stackmap for every safepoint in a function.
> - Is using gcroot and allocas to mark pointers as garbage collected
> references valuable?  (As opposed to using address spaces on the SSA values
> themselves?)  Long term, should we retain the gcroot marker intrinsics at
> all?
>
>
> Philip
>
> Appendix: The Current Implementations Key Classes:
>
> GCStrategy - Provides a configurable description of the collector. The
> strategy can also override parts of the default GC root lowering strategy.
> The concept of such a collector description is very valuable, but the
> current implementation could use some cleanup.  In particular, the custom
> lowering hooks are a bit of a mess.
>
> GCMetadataPrinter - Provides a means to dump a custom binary format
> describing each functions safepoints.  All safepoints in a function must
> share a single root Value to stack slot mapping.
>
> GCModuleInfo/GCFunctionInfo - These contain the metadata which is saved to
> enable GCMetadataPrinter.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141209/e62c9552/attachment.html>