[llvm] r246103 - [docs][Statepoints] More on base pointers
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 16:13:36 PDT 2015
Author: reames
Date: Wed Aug 26 18:13:35 2015
New Revision: 246103
URL: http://llvm.org/viewvc/llvm-project?rev=246103&view=rev
Log:
[docs][Statepoints] More on base pointers
Expand the information on base pointers to include an example, the assumptions a collector is allowed to make, legal optimizations over gc.relocates, and the assumptions made by RewriteStatepointsForGC. This is the result of a recent conversation with folks from LLIC and the confusions that came to light therein.
Modified:
llvm/trunk/docs/Statepoints.rst
Modified: llvm/trunk/docs/Statepoints.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/Statepoints.rst?rev=246103&r1=246102&r2=246103&view=diff
==============================================================================
--- llvm/trunk/docs/Statepoints.rst (original)
+++ llvm/trunk/docs/Statepoints.rst Wed Aug 26 18:13:35 2015
@@ -209,20 +209,49 @@ This example was taken from the tests fo
Base & Derived Pointers
^^^^^^^^^^^^^^^^^^^^^^^
-A base pointer is one which points to the base of an allocation (object). A
-derived pointer is one which is offset from a base pointer by some amount.
-When relocating objects, a garbage collector needs to be able to relocate each
-derived pointer associated with an allocation to the same offset from the new
-address.
-
-Derived pointers fall in to two categories:
- * "Interior derived pointers" remain within the bounds of the allocation
- they're associated with. As a result, the base object can be found at
- runtime provided the bounds of allocations are known to the runtime system.
- * "Exterior derived pointers" are outside the bounds of the associated object;
- they may even fall within *another* allocations address range. As a result,
- there is no way for a garbage collector to determine which allocation they
- are associated with at runtime and compiler support is needed.
+A "base pointer" is one which points to the starting address of an allocation
+(object). A "derived pointer" is one which is offset from a base pointer by
+some amount. When relocating objects, a garbage collector needs to be able
+to relocate each derived pointer associated with an allocation to the same
+offset from the new address.
+
+"Interior derived pointers" remain within the bounds of the allocation
+they're associated with. As a result, the base object can be found at
+runtime provided the bounds of allocations are known to the runtime system.
+
+"Exterior derived pointers" are outside the bounds of the associated object;
+they may even fall within *another* allocation's address range. As a result,
+there is no way for a garbage collector to determine which allocation they
+are associated with at runtime and compiler support is needed.
+
+The ``gc.relocate`` intrinsic supports an explicit operand for describing the
+allocation associated with a derived pointer. This operand is frequently
+referred to as the base operand. Strictly speaking, it does not have to be
+a base pointer, but it does need to lie within the bounds of the associated
+allocation. Some collectors may require that the operand be an actual base
+pointer rather than merely an interior derived pointer. Note that during
+lowering both the base and derived pointer operands are required to be live
+over the associated call safepoint even if the base is otherwise unused
+afterwards.
+
+If we extend our previous example to include a pointless derived pointer,
+we get:
+
+.. code-block:: llvm
+
+ define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
+ gc "statepoint-example" {
+ %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
+ %token = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
+ %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 7)
+ %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 8)
+ %p = getelementptr i8, i8 addrspace(1)* %gep.relocated, i64 -20000
+ ret i8 addrspace(1)* %p
+ }
+
+Note that in this example %p and %obj.relocated are the same address and we
+could replace one with the other, potentially removing the derived pointer
+from the live set at the safepoint entirely.
GC Transitions
^^^^^^^^^^^^^^^^^^
@@ -486,9 +515,14 @@ Despite the typing of this as a generic
by a ``gc.statepoint`` is legal here.
The second argument is an index into the statepoint's list of arguments
-which specifies the base pointer for the pointer being relocated.
+which specifies the allocation for the pointer being relocated.
This index must land within the 'gc parameter' section of the
-statepoint's argument list.
+statepoint's argument list. The associated value must be within the
+object with which the pointer being relocated is associated. The optimizer
+is free to change *which* interior derived pointer is reported, provided that
+it does not replace an actual base pointer with another interior derived
+pointer. Collectors are allowed to rely on the base pointer operand
+remaining an actual base pointer if so constructed.
The third argument is an index into the statepoint's list of arguments
which specifies the (potentially) derived pointer being relocated. It
@@ -631,8 +665,18 @@ non references. Address space 1 is not
This pass can be used as a utility function by a language frontend that doesn't
want to manually reason about liveness, base pointers, or relocation when
constructing IR. As currently implemented, RewriteStatepointsForGC must be
-run after SSA construction (i.e. mem2reg).
+run after SSA construction (i.e. mem2reg).
+RewriteStatepointsForGC will ensure that appropriate base pointers are listed
+for every relocation created. It will do so by duplicating code as needed to
+propagate the base pointer associated with each pointer being relocated to
+the appropriate safepoints. The implementation assumes that the following
+IR constructs produce base pointers: loads from the heap, addresses of global
+variables, function arguments, and function return values. Constant pointers (such
+as null) are also assumed to be base pointers. In practice, this constraint
+can be relaxed to producing interior derived pointers provided the target
+collector can find the associated allocation from an arbitrary interior
+derived pointer.
In practice, RewriteStatepointsForGC can be run much later in the pass
pipeline, after most optimization is already done. This helps to improve
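
The relocation rule from the Base & Derived Pointers hunk can be sketched outside of LLVM. What follows is a minimal Python model and is not part of the patch: the ``(start, size)`` allocation table, the ``forwarding`` map, and the helper names ``find_allocation`` and ``relocate`` are all hypothetical. It reuses the offset of 20000 from the IR example to show why an exterior derived pointer needs the explicit base operand.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: an allocation is a (start address, size in bytes) pair.
Allocation = Tuple[int, int]


def find_allocation(addr: int, allocations: List[Allocation]) -> Optional[Allocation]:
    """Return the allocation whose address range contains addr, if any."""
    for start, size in allocations:
        if start <= addr < start + size:
            return (start, size)
    return None


def relocate(derived: int, base_operand: int,
             allocations: List[Allocation],
             forwarding: Dict[int, int]) -> int:
    """Relocate a derived pointer to the same offset from the moved allocation.

    base_operand only has to lie within the allocation; it does not have to
    be the allocation's starting address.
    """
    alloc = find_allocation(base_operand, allocations)
    if alloc is None:
        raise ValueError("base operand does not lie within any known allocation")
    old_start = alloc[0]
    new_start = forwarding[old_start]
    # Preserve the derived pointer's offset from the start of the allocation.
    return new_start + (derived - old_start)


# Two allocations; the collector moves them to new starting addresses.
allocations = [(1000, 64), (21000, 64)]
forwarding = {1000: 5000, 21000: 9000}

obj = 1000          # base pointer
gep = obj + 20000   # exterior derived pointer, lands inside the second allocation

# A range lookup on the derived pointer alone picks the wrong allocation.
wrong_guess = find_allocation(gep, allocations)

obj_relocated = relocate(obj, obj, allocations, forwarding)
gep_relocated = relocate(gep, obj, allocations, forwarding)
```

In this model ``gep_relocated - 20000`` equals ``obj_relocated``, which mirrors the note in the patch that the two values name the same address. Passing an interior pointer such as ``obj + 8`` as the base operand gives the same result, since only the containing allocation matters here.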