[PATCH] Statepoint infrastructure for garbage collection
Sean Silva
chisophugis at gmail.com
Thu Oct 9 13:31:25 PDT 2014
+Filip Pizlo
On Wed, Oct 8, 2014 at 2:24 PM, Philip Reames <listmail at philipreames.com>
wrote:
> Hi hfinkel, chandlerc, nicholas, sanjoy, atrick, ributzka, theraven,
>
> The attached patch implements an approach to supporting garbage collection
> in LLVM that has been mentioned on the mailing list a number of times by
> now. There are a couple of issues that need to be addressed before
> submission, but I wanted to get this up to give maximal time for review.
>
> The statepoint intrinsics are intended to enable precise root tracking
> through the compiler so as to support garbage collectors of all types. Our
> testing to date has focused on fully relocating collectors (where pointers
> can change at any safepoint poll, or call site), but the infrastructure
> should support collectors of other styles. The addition of the statepoint
> intrinsics to LLVM should have no impact on the compilation of any program
> which does not contain them. There are no side tables created, no extra
> metadata, and no inhibited optimizations.
>
> A statepoint works by transforming a call site (or safepoint poll site)
> into an explicit relocation operation. It is the frontend's responsibility
> (or eventually the safepoint insertion pass we've developed, but that's not
> part of this patch) to ensure that any live pointer to a GC object is
> correctly added to the statepoint and explicitly relocated. The relocated
> value is just a normal SSA value (as seen by the optimizer), so merges of
> relocated and unrelocated values are just normal phis. The explicit
> relocation operation, the fact that the statepoint is assumed to clobber all
> memory, and the optimizer's standard semantics ensure that the relocations
> flow through IR optimizations correctly.
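
To make the relocation model in the paragraph above concrete, here is a minimal C++ sketch. It is not LLVM code and is not taken from the patch: ToyHeap, GcRef, toy_statepoint and load_after_statepoint are hypothetical names, and a GC reference is modeled as an index into a vector. The point it illustrates is only that the call returns a fresh, relocated value that the caller must use from then on, and that merging the relocated and unrelocated values is an ordinary join.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy heap of integer "objects" (hypothetical, for illustration only).
struct ToyHeap {
    std::vector<int> cells;
};

// A GC reference is modeled as an index into ToyHeap::cells.
using GcRef = std::size_t;

// Hypothetical stand-in for a statepoint: the call may move every object,
// so it takes the live references and returns their relocated values.
// "Collection" here just slides every object up by `shift` cells.
std::vector<GcRef> toy_statepoint(ToyHeap &heap,
                                  const std::vector<GcRef> &live,
                                  std::size_t shift) {
    std::vector<int> moved(heap.cells.size() + shift, 0);
    for (std::size_t i = 0; i < heap.cells.size(); ++i)
        moved[i + shift] = heap.cells[i];
    heap.cells = std::move(moved);

    std::vector<GcRef> relocated;
    for (GcRef ref : live) {
        assert(ref + shift < heap.cells.size());
        relocated.push_back(ref + shift);
    }
    return relocated;
}

// Loads through `obj`, optionally crossing a statepoint first.
int load_after_statepoint(ToyHeap &heap, GcRef obj, bool take_call) {
    GcRef current = obj;  // unrelocated value
    if (take_call) {
        // After the call, only the relocated value may be used.
        current = toy_statepoint(heap, {obj}, 3)[0];
    }
    // Join point: in SSA form this would be a plain phi of the
    // unrelocated and relocated values.
    return heap.cells[current];
}
```

Reading through the original `obj` after the call would hit the wrong cell in this model, which is the kind of mistake the explicit relocation is meant to rule out.
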
>
> During the lowering process, we currently spill aggressively to the stack.
> This is not entirely ideal (and we have plans to do better), but it's
> functional, relatively straightforward, and closely matches the
> implementations of the patchpoint intrinsics. We leverage the existing
> StackMap section format, which is already used by the patchpoint
> intrinsics, to report where pointer values live. Unlike a patchpoint,
> these locations are known (by the backend) to be writeable during the
> call. This enables the garbage collector to transparently read and update
> pointer values if required. We do optimize lowering in certain well-known
> cases (constant pointers, a.k.a. null, being the key one).
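
As a rough illustration of how a relocating collector could consume such location records, here is a self-contained C++ sketch. It is an assumption-laden model, not the StackMap section format and not code from the patch: ToyStackMapRecord, ForwardingTable and relocate_frame are hypothetical, and the frame is modeled as a vector of slots rather than real stack memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified record: which frame slots hold GC pointers
// at one call site. The real StackMap format carries more information.
struct ToyStackMapRecord {
    std::vector<std::size_t> pointer_slots;
};

// Old address -> new address, as a relocating collector might compute.
using ForwardingTable = std::unordered_map<std::uintptr_t, std::uintptr_t>;

// Rewrites every recorded slot in place and returns how many were updated.
// This relies on the slots being writable during the call.
std::size_t relocate_frame(std::vector<std::uintptr_t> &frame,
                           const ToyStackMapRecord &record,
                           const ForwardingTable &forwarding) {
    std::size_t updated = 0;
    for (std::size_t slot : record.pointer_slots) {
        assert(slot < frame.size());
        std::uintptr_t old_addr = frame[slot];
        if (old_addr == 0)
            continue;  // null never moves (assumption of this model)
        auto it = forwarding.find(old_addr);
        if (it != forwarding.end()) {
            frame[slot] = it->second;
            ++updated;
        }
    }
    return updated;
}
```

Slots that the record does not list are left untouched, so non-pointer data in the frame is never misinterpreted as a root.
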
>
> There are a few areas of this patch which could use improvement:
> - The patch needs to be rebased against TOT. It's currently based against a
> roughly 3-week-old snapshot.
> - The intrinsics should probably be renamed to include an "experimental"
> prefix.
> - The usage of Direct and Indirect location types is currently inverted
> as compared to the definition used by patchpoint. This is a simple fix.
> - The test coverage could be improved. Most of the tests we've actually
> been using are built on top of the safepoint insertion mechanism (not
> included here) and our runtime. We need to improve the IR level tests for
> optimizer semantics (i.e. not doing illegal transforms), and lowering.
> There are some minimal tests in place for the lowering of simple
> statepoints.
> - The documentation is "in progress" (to put it kindly).
> - Many functions are missing doxygen comments.
> - There's a hack in place to force the use of RSP+Offset addressing vs
> RBP-Offset addressing for references in the StackMap section. This works
> and shouldn't break anyone else, but should definitely be cleaned up. The
> choice of addressing preference should be up to the runtime.
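
For readers unfamiliar with the two location types mentioned in the list above: in the StackMap documentation, a Direct location denotes the address Reg + Offset itself, while an Indirect location denotes the value stored at [Reg + Offset]. The following C++ sketch models just that distinction. ToyLocKind, ToyLocation, ToyMemory and resolve are hypothetical names, memory is a plain map, and none of this is the patch's actual encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Only the two kinds discussed above; the real format has more.
enum class ToyLocKind { Direct, Indirect };

struct ToyLocation {
    ToyLocKind kind;
    std::int64_t base;    // current value of the base register (e.g. RSP or RBP)
    std::int32_t offset;  // signed offset from the base register
};

// Toy memory: address -> 64-bit value stored there.
using ToyMemory = std::map<std::int64_t, std::int64_t>;

// Direct yields the computed address; Indirect yields what is stored there.
std::int64_t resolve(const ToyLocation &loc, const ToyMemory &mem) {
    std::int64_t addr = loc.base + loc.offset;
    if (loc.kind == ToyLocKind::Direct)
        return addr;
    auto it = mem.find(addr);
    assert(it != mem.end());
    return it->second;
}
```

The same slot can be named from either base register: in a frame where RBP happens to equal RSP + 32, RSP+16 and RBP-16 resolve to the same address, which is why the choice between the two forms can reasonably be left to the runtime.
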
>
> When reviewing, I would greatly appreciate feedback on which issues need
> to be fixed before submission and those which can be addressed afterwards.
> It is my plan to actively maintain and enhance this infrastructure over
> the next few months (and years). It's already been developed out of tree
> for entirely too long (our fault!), and I'd like to move to incremental work
> in tree as quickly as feasible.
>
> Planned enhancements after submission:
> - The ordering of arguments in statepoints is essentially historical cruft
> at this point. I'm open to suggestions on how to make this more
> approachable. Reordering arguments would (preferably) be a post commit
> action.
> - Support for relocatable pointers in callee saved registers over call
> sites. This will require the notion of an explicit relocation pseudo op
> and support for it throughout the backend (particularly the register
> allocator).
> - Optimizations for non-relocating collectors. For example, the clobber
> semantics of the spill slots aren't needed if the collector isn't
> relocating roots.
> - Further optimizations to reduce the cost of spilling around each
> statepoint (when required at all).
> - Support for invokable statepoints.
> - Once this has baked in tree for a while, I plan to delete the existing
> gc_root code. It is unsound, and essentially unused.
>
> In addition to the enhancements to the infrastructure in the currently
> proposed patch, we're also working on a number of follow up changes:
> - Verification passes to confirm that safepoints were inserted in a
> semantically valid way (i.e. no memory access of a value after it has been
> inserted)
> - A transformation pass to convert naive IR to include both safepoint
> polling sites, and statepoints on every non-leaf call. This transformation
> pass can be used at initial IR creation time to simplify the frontend
> authors' work, but is also designed to run on *fully optimized* IR,
> provided the initial IR meets certain (fairly loose) restrictions.
> - A transformation pass to convert normal loads and stores into user
> provided load and store barriers.
> - Further optimizations to reduce the number of safepoints required, and
> improve the infrastructure as a whole.
>
> We've been working on these topics for a while, but the follow on patches
> aren't quite as mature as what's being proposed now. Once these pieces
> stabilize a bit, we plan to upstream them as well. For those who are
> curious, our work on those topics is available here:
> https://github.com/AzulSystems/llvm-late-safepoint-placement
>
> http://reviews.llvm.org/D5683
>
> Files:
> docs/Statepoints.rst
> include/llvm/CodeGen/FunctionLoweringInfo.h
> include/llvm/CodeGen/MachineInstr.h
> include/llvm/CodeGen/StackMaps.h
> include/llvm/IR/Intrinsics.td
> include/llvm/IR/Statepoint.h
> include/llvm/Target/Target.td
> include/llvm/Target/TargetFrameLowering.h
> include/llvm/Target/TargetOpcodes.h
> lib/Analysis/TargetTransformInfo.cpp
> lib/CodeGen/InlineSpiller.cpp
> lib/CodeGen/LocalStackSlotAllocation.cpp
> lib/CodeGen/PrologEpilogInserter.cpp
> lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
> lib/CodeGen/StackMaps.cpp
> lib/CodeGen/TargetLoweringBase.cpp
> lib/IR/CMakeLists.txt
> lib/IR/Function.cpp
> lib/IR/LLVMContext.cpp
> lib/IR/Statepoint.cpp
> lib/IR/Verifier.cpp
> lib/Target/X86/X86FrameLowering.cpp
> lib/Target/X86/X86FrameLowering.h
> lib/Target/X86/X86ISelLowering.cpp
> lib/Target/X86/X86MCInstLower.cpp
> lib/Transforms/InstCombine/InstCombineCalls.cpp
> test/CodeGen/X86/statepoint-call-lowering.ll
> test/CodeGen/X86/statepoint-stack-usage.ll
> test/CodeGen/X86/statepoint-stackmap-format.ll
> test/Verifier/statepoint-non-gc-ptr.ll
> test/Verifier/statepoint.ll
> utils/TableGen/CodeGenTarget.cpp
>