[LLVMdev] Future plans for GC in LLVM
Philip Reames
listmail at philipreames.com
Thu Dec 4 17:50:14 PST 2014
Now that the statepoint changes have landed, I wanted to start a
discussion about what's next for GC support in LLVM. I'm going to
sketch out a strawman proposal, but I'm not set on any of this. I
mostly just want to draw interested parties out of the woodwork. :)
Overall Direction:
In the short term, my intent is to preserve the functionality of the
existing code, but migrate towards a position where the gcroot specific
pieces are optional and well separated. I also plan to start updating
the documentation to reflect a separation between the general support
for garbage collection (function attributes, identifying references,
load and store barrier lowering, generating stack maps) and the
implementation choices (gcroot & its lowering vs statepoints & address
spaces for identifying references).
Longer term, I plan to *EVENTUALLY DELETE* the existing gcroot lowering
code and in-tree GCStrategies unless an interested party speaks up. I
have no problem with retaining some of the existing pieces for legacy
support or helping users to migrate, but as of right now, I don't know
of any such active users. The only exception to this might be the
shadow stack GC. "Eventually" in this context is at least six months from
now, but likely less than 18 months. Hopefully, that's vague enough. :)
HELP - If anyone knows which OCaml implementation and which Erlang
implementation triggered the in-tree GC strategies, please let me know!
Near Term Changes:
- Migrate ownership of GCStrategy objects from GCModuleInfo to
LLVMContext. In theory, this loses the ability for two different
Modules to have the same collector with different state, but I know of
no use case for this.
- Modify the primary Function::getGC/setGC interface to return a
reference to the GCStrategy object, not a string. I will provide a
Function::setGCString and getGCString.
- Extend the GCStrategy class to include a notion of which compilation
strategy is being used. The two choices right now will be Legacy and
Statepoint. (Longer term, this will likely become a more fine grained
choice.)
- Separate GCStrategy and related pieces from the
GCFunctionInfo/GCModuleInfo/GCMetadataPrinter lowering code. At first,
this will simply mean clarifying documentation and rearranging code a bit.
- Document/clarify the callbacks used to customize the lowering. Decide
which of these make sense to preserve and document.
(Lest anyone get the wrong idea, the above changes are intended to be
minor cleanup. I'm not looking to do anything controversial yet.)
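To make the first three items above a bit more concrete, here is a rough
sketch of the shape I have in mind. Everything in it is a hypothetical
stand-in: ContextSketch, FunctionSketch, GCStrategySketch,
registerGCStrategy and findGCStrategy are made-up names for illustration,
not the real LLVMContext/Function/GCStrategy classes, and the actual
interface may well end up looking different.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; NOT the real LLVM classes.

// The two compilation strategies named in the proposal.
enum class CompilationStrategy { Legacy, Statepoint };

// A collector description that records which strategy it uses.
class GCStrategySketch {
public:
  GCStrategySketch(std::string Name, CompilationStrategy Kind)
      : Name(std::move(Name)), Kind(Kind) {}
  const std::string &getName() const { return Name; }
  CompilationStrategy getCompilationStrategy() const { return Kind; }
  bool useStatepoints() const {
    return Kind == CompilationStrategy::Statepoint;
  }

private:
  std::string Name;
  CompilationStrategy Kind;
};

// The context, not a per-module object, owns the strategy objects.
class ContextSketch {
public:
  // Creates the strategy on first use; later calls return the same object.
  GCStrategySketch &registerGCStrategy(const std::string &Name,
                                       CompilationStrategy Kind) {
    std::unique_ptr<GCStrategySketch> &Slot = Strategies[Name];
    if (!Slot)
      Slot.reset(new GCStrategySketch(Name, Kind));
    return *Slot;
  }
  // Returns nullptr when no strategy with that name is registered.
  GCStrategySketch *findGCStrategy(const std::string &Name) const {
    auto It = Strategies.find(Name);
    return It == Strategies.end() ? nullptr : It->second.get();
  }

private:
  std::map<std::string, std::unique_ptr<GCStrategySketch>> Strategies;
};

// A function refers to its strategy by reference; the string accessors
// remain available as a convenience layer on top.
class FunctionSketch {
public:
  explicit FunctionSketch(ContextSketch &Ctx) : Ctx(Ctx) {}
  bool hasGC() const { return Strategy != nullptr; }
  void setGC(GCStrategySketch &S) { Strategy = &S; }
  GCStrategySketch &getGC() const {
    assert(Strategy && "function has no GC strategy");
    return *Strategy;
  }
  // Leaves the current strategy untouched if the name is unknown.
  bool setGCString(const std::string &Name) {
    GCStrategySketch *S = Ctx.findGCStrategy(Name);
    if (!S)
      return false;
    Strategy = S;
    return true;
  }
  std::string getGCString() const {
    return Strategy ? Strategy->getName() : std::string();
  }

private:
  ContextSketch &Ctx;
  GCStrategySketch *Strategy = nullptr;
};
```

Note that in this sketch two functions naming the same collector get the
very same strategy object, even if they live in different modules. That is
exactly the loss of per-module collector state mentioned in the first item.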
Questions:
- Is providing the ability to generate a custom binary stack map format a
valuable feature? Adapting the new statepoint infrastructure to work
with the existing GCMetadataPrinter classes wouldn't be particularly hard.
- Are there any GCs out there that need gcroot's single stack slot per
value implementation? By default, statepoints may generate a different
stackmap for every safepoint in a function.
- Is using gcroot and allocas to mark pointers as garbage collected
references valuable? (As opposed to using address spaces on the SSA
values themselves?) Long term, should we retain the gcroot marker
intrinsics at all?
Philip
Appendix: The Current Implementation's Key Classes:
GCStrategy - Provides a configurable description of the collector. The
strategy can also override parts of the default GC root lowering
strategy. The concept of such a collector description is very valuable,
but the current implementation could use some cleanup. In particular,
the custom lowering hooks are a bit of a mess.
GCMetadataPrinter - Provides a means to dump a custom binary format
describing each function's safepoints. All safepoints in a function must
share a single root Value to stack slot mapping.
GCModuleInfo/GCFunctionInfo - These contain the metadata which is saved
to enable GCMetadataPrinter.
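The difference between the single root-to-slot mapping that
GCMetadataPrinter requires and the per-safepoint stack maps that
statepoints may produce can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: roots are named by strings
rather than Values, slots are plain integer offsets, and
FunctionWideRootMap, PerSafepointRootMap and tryCollapse are hypothetical
names, not anything in the tree.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only; NOT real LLVM types.
using RootName = std::string; // stands in for a root Value
using StackOffset = int;      // stands in for a stack slot

// gcroot style: one mapping shared by every safepoint in the function.
struct FunctionWideRootMap {
  std::map<RootName, StackOffset> Slots;
  // The safepoint id is ignored: a root lives in the same slot everywhere.
  StackOffset locate(unsigned /*SafepointId*/, const RootName &R) const {
    return Slots.at(R);
  }
};

// statepoint style: each safepoint may carry its own mapping.
struct PerSafepointRootMap {
  std::map<unsigned, std::map<RootName, StackOffset>> Slots;
  StackOffset locate(unsigned SafepointId, const RootName &R) const {
    return Slots.at(SafepointId).at(R);
  }
};

// Attempts to express per-safepoint maps as one function-wide map. This
// succeeds only if every root has the same offset at every safepoint
// where it appears. Out is left unmodified on failure.
bool tryCollapse(const PerSafepointRootMap &In, FunctionWideRootMap &Out) {
  FunctionWideRootMap Result;
  for (const auto &Safepoint : In.Slots) {
    for (const auto &Entry : Safepoint.second) {
      auto Inserted = Result.Slots.insert(Entry);
      if (!Inserted.second && Inserted.first->second != Entry.second)
        return false; // same root, different slot: no single mapping exists
    }
  }
  Out = Result;
  return true;
}
```

A collector that truly needs one slot per value would be one for which
something like tryCollapse has to succeed on every function.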