[PATCH] [C API] Wire up new GC intrinsics to the C API

Philip Reames listmail at philipreames.com
Wed Dec 31 11:17:32 PST 2014


I'll happily take the fixes for the documentation, but I specifically do 
not want to commit to the C API at this time.  I am expecting details of 
the arguments to change a bit in the near future and do not want to be 
supporting a legacy C API.  Correct me if I'm wrong, but I believe we 
usually try to maintain backwards compatibility with the C API.

Is there a way to add something to the C API while explicitly excepting 
it from any backwards compatibility guarantees?  I'd be fine with that.
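
To make the question concrete, here is a minimal sketch of one common way a C header can mark part of its surface as exempt: the unstable declarations are only visible when the client defines an opt-in macro. Every name below (`MYLIB_ENABLE_UNSTABLE_API`, `mylib_stable_version`, `mylib_unstable_total_args`) is hypothetical and is not an existing LLVM macro or function; I am not claiming the LLVM C API does or should work this way.

```c
/* Hypothetical sketch: none of these names exist in the LLVM C API. */
#include <assert.h>

/* Stable surface: always declared, covered by compatibility guarantees. */
static int mylib_stable_version(void) { return 1; }

/* A real client would opt in with -DMYLIB_ENABLE_UNSTABLE_API=1 on the
   compiler command line; it is defaulted to 1 here only so the sketch
   compiles on its own. */
#ifndef MYLIB_ENABLE_UNSTABLE_API
#define MYLIB_ENABLE_UNSTABLE_API 1
#endif

#if MYLIB_ENABLE_UNSTABLE_API
/* Unstable surface: signature may change between releases without notice.
   Returns the total number of operands across the three argument groups. */
static unsigned mylib_unstable_total_args(unsigned n_args,
                                          unsigned n_deopt_args,
                                          unsigned n_gc_args) {
  unsigned total = n_args + n_deopt_args + n_gc_args;
  assert(total >= n_args); /* guard against unsigned wrap-around */
  return total;
}
#endif
```

A client that does not opt in never sees the unstable declarations, so a later signature change shows up as a compile error only for code that explicitly asked for the unstable surface.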

Alternatively, if you wanted to wait until after the 3.6 cut date, I'm 
pretty sure we'd have the API hammered out by the time we cut 3.7.  (And 
there's never any backwards compat for unreleased TOT versions.)

p.s. I'm away from work for a few days, I'll take a closer look at the 
doc changes and likely commit them early next week.

Philip

On 12/31/2014 10:33 AM, Ramkumar Ramachandra wrote:
> Hi whitequark, reames,
>
> While at it, fix a few typos in the Statepoints documentation (I
> happened to be reading it).
>
> The ultimate objective is to make these intrinsics accessible from
> OCaml, of course. Currently, I can get away with declare-plus-call, but
> these are pretty.
>
> http://reviews.llvm.org/D6821
>
> Files:
>    docs/Statepoints.rst
>    lib/IR/Core.cpp
>
> Index: docs/Statepoints.rst
> ===================================================================
> --- docs/Statepoints.rst
> +++ docs/Statepoints.rst
> @@ -16,11 +16,11 @@
>   Overview
>   ========
>   
> -To collect dead objects, garbage collectors must be able to identify any references to objects contained within executing code, and, depending on the collector, potentially update them.  The collector does not need this information at all points in code - that would make the problem much harder - but only at well defined points in the execution known as 'safepoints'  For a most collectors, it is sufficient to track at least one copy of each unique pointer value.  However, for a collector which wishes to relocate objects directly reachable from running code, a higher standard is required.
> +To collect dead objects, garbage collectors must be able to identify any references to objects contained within executing code, and, depending on the collector, potentially update them.  The collector does not need this information at all points in code - that would make the problem much harder - but only at well-defined points in the execution known as 'safepoints'.  For most collectors, it is sufficient to track at least one copy of each unique pointer value.  However, for a collector which wishes to relocate objects directly reachable from running code, a higher standard is required.
>   
> -One additional challenge is that the compiler may compute intermediate results ("derived pointers") which point outside of the allocation or even into the middle of another allocation.  The eventual use of this intermediate value must yield an address within the bounds of the allocation, but such "exterior derived pointers" may be visible to the collector.  Given this, a garbage collector can not safely rely on the runtime value of an address to indicate the object it is associated with.  If the garbage collector wishes to move any object, the compiler must provide a mapping for each pointer to an indication of its allocation.
> +One additional challenge is that the compiler may compute intermediate results ("derived pointers") which point outside of the allocation or even into the middle of another allocation.  The eventual use of this intermediate value must yield an address within the bounds of the allocation, but such "exterior derived pointers" may be visible to the collector.  Given this, a garbage collector can not safely rely on the runtime value of an address to indicate the object it is associated with.  If the garbage collector wishes to move any object, the compiler must provide a mapping, for each pointer, to an indication of its allocation.
>   
> -To simplify the interaction between a collector and the compiled code, most garbage collectors are organized in terms of two three abstractions: load barriers, store barriers, and safepoints.
> +To simplify the interaction between a collector and the compiled code, most garbage collectors are organized in terms of three abstractions: load barriers, store barriers, and safepoints.
>   
>   #. A load barrier is a bit of code executed immediately after the machine load instruction, but before any use of the value loaded.  Depending on the collector, such a barrier may be needed for all loads, merely loads of a particular type (in the original source language), or none at all.
>   #. Analogously, a store barrier is a code fragement that runs immediately before the machine store instruction, but after the computation of the value stored.  The most common use of a store barrier is to update a 'card table' in a generational garbage collector.
> @@ -35,7 +35,7 @@
>   #. identify which object each pointer relates to, and
>   #. potentially update each of those copies.
>   
> -This document describes the mechanism by which an LLVM based compiler can provide this information to a language runtime/collector and ensure that all pointers can be read and updated if desired.  The heart of the approach is to construct (or rewrite) the IR in a manner where the possible updates performed by the garbage collector are explicitly visible in the IR.  Doing so requires that we:
> +This document describes the mechanism by which an LLVM based compiler can provide this information to a language runtime/collector, and ensure that all pointers can be read and updated if desired.  The heart of the approach is to construct (or rewrite) the IR in a manner where the possible updates performed by the garbage collector are explicitly visible in the IR.  Doing so requires that we:
>   
>   #. create a new SSA value for each potentially relocated pointer, and ensure that no uses of the original (non relocated) value is reachable after the safepoint,
>   #. specify the relocation in a way which is opaque to the compiler to ensure that the optimizer can not introduce new uses of an unrelocated value after a statepoint. This prevents the optimizer from performing unsound optimizations.
> Index: lib/IR/Core.cpp
> ===================================================================
> --- lib/IR/Core.cpp
> +++ lib/IR/Core.cpp
> @@ -1735,6 +1735,34 @@
>     return (LLVMAttribute)PAL.Raw(AttributeSet::FunctionIndex);
>   }
>   
> +/*--.. GC intrinsics .......................................................--*/
> +
> +LLVMValueRef LLVMBuildGCStatepoint(LLVMBuilderRef B, LLVMValueRef ActualCallee,
> +				   LLVMValueRef *Args, unsigned nrArgs,
> +				   LLVMValueRef *DeoptArgs, unsigned nrDeoptArgs,
> +				   LLVMValueRef *GCArgs, unsigned nrGCArgs,
> +				   const char *Name) {
> +  return wrap(unwrap(B)->CreateGCStatepoint(unwrap(ActualCallee),
> +					    makeArrayRef(unwrap(Args), nrArgs),
> +					    makeArrayRef(unwrap(DeoptArgs), nrDeoptArgs),
> +					    makeArrayRef(unwrap(GCArgs), nrGCArgs),
> +					    Name));
> +}
> +
> +LLVMValueRef LLVMBuildGCResult(LLVMBuilderRef B, LLVMValueRef Statepoint,
> +			       LLVMTypeRef ResultType, const char *Name) {
> +  return wrap(unwrap(B)->CreateGCResult(unwrap<Instruction>(Statepoint),
> +					unwrap(ResultType), Name));
> +}
> +
> +LLVMValueRef LLVMBuildGCRelocate(LLVMBuilderRef B, LLVMValueRef Statepoint,
> +				 int BaseOffset, int DerivedOffset,
> +			       LLVMTypeRef ResultType, const char *Name) {
> +  return wrap(unwrap(B)->CreateGCRelocate(unwrap<Instruction>(Statepoint),
> +					  BaseOffset, DerivedOffset,
> +					  unwrap(ResultType), Name));
> +}
> +
>   /*--.. Operations on parameters ............................................--*/
>   
>   unsigned LLVMCountParams(LLVMValueRef FnRef) {
>



