[llvm-dev] RFC: Strong GC References in LLVM

Mon Jul 11 14:28:51 PDT 2016

ping!

Sanjoy Das wrote:
> This is a proposal to add strong GC reference types to LLVM.
>
> We have some local (downstream) patches that are needed to prevent
> LLVM's optimizer from making transforms that are problematic in the
> presence of a precise relocating GC.  Adding a notion of a strong GC
> reference to LLVM will let us upstream these patches in a principled
> manner, and will act as a measure to avoid new problematic transforms.
>
> The word "strong" here is used as opposed to the weaker pointers that
> LLVM currently has which can be converted to and from integers without
> repercussions (i.e. inttoptr and ptrtoint are nop-like non-sideeffecting
> instructions).
>
> Let me emphasize first that this email is intended to *start* a
> conversation; and all of the points made here are open for discussion
> and debate.  This proposal is informed by only _our_ GC and runtime
> and other runtimes may have different requirements.
>
> # Preamble&  "Problem Statement"
>
> A quick introduction to some GC basics: a precise relocating garbage
> collector needs co-operation from the code generator / compiler to be
> able to scan and relocate objects pointed to from frame local state
> like CPU registers and stack slots.  Specifically, it needs a list of
> the locations of all the GC references present in a stack frame; and
> at "safepoints" it stops "mutator" threads (via some mechanism not
> pertinent here), traverses all of the stack frames and uses the
> aforementioned list to scan and potentially rewrite those references
> to point to the new locations for the same objects.  In addition, some
> garbage collectors need read and write barriers -- these are small
> bits of code that run when reading and writing GC references (distinct
> from memory reordering barriers).  For more details on GC specific
> stuff, I suggest referring to http://www.memorymanagement.org/ and our
> 2014 LLVM-Dev talk http://llvm.org/devmtg/2014-10/#talk4.
>
> Getting LLVM ToT working with a precise relocating GC involves taking
> the following steps:
>
>      Note: GCREF is the type representing a pointer that the GC needs
>        to know about and potentially relocate.  Currently it is `<Ty>
>        addrspace(1)*`.
>
>   1. Generate IR, and represent GC pointer SSA values as values of type
>      GCREF.
>   2. Run opt -O3.
>   3. Run RewriteStatepointForGC.  It rewrites IR values of type GCREF
>      to "normal" pointers that the backend can lower (note -- we don't
>      physically rewrite the llvm::Types to keep things simple and fast).  It
>      does this by making all of the the data flow required to express
>      relocation semantics explicit in the IR.
>   4. Rewrite loads and stores of GCREF type to execute load and store
>      barriers (logically can be part of (3)).
>   5. Lower into machine code.
>
> For this discussion let's treat the internals of (3), (4) and (5) as
> black boxes, and center our discussion on what properties we require
> of the IR after step (2) so that step (3), (4) and (5) can do their
> job.  However, if you think the overall design can be simplified by
> changing (3), (4) or (5), please feel free to point that out. :)
>
> After (2) and before (3), we want to have the following invariant:
>
> '''
> Given any location "L" in the CFG of a function, let the set of all
> values of type GCREF whose defs dominate "L" be "G":
>
>    A. "G" is a superset of the references that need to be reported to
>       the GC if we put a safepoint at "L".  References need to be
>       reported to the GC if the GC may want to scan through them or
>       rewrite them to point to the new locations of the objects they
>       used to point to.
>
>    B. Everything in "G" is a valid GC reference.  This directly implies
>       that all values of type GCREF hold a valid GC reference (since
>       "L" could be the location immediately following the def).
> '''
>
> Few important things to note about "valid GC reference":
>
>   1. There is no fixed notion of a valid GC reference.  What
>      constitutes a valid GC reference can change at arbitrary points in
>      time.
>
>   2. There may not be a runtime bitwise predicate that can reliably
>      distinguish a "valid" GC reference from an invalid one.  The
>      validity of a GC reference is a function of its provenance as well
>      as its bitwise content.
>
>   3. (Related to the previous point) Arbitrary GEPs off of "valid GC
>      references" are also "valid GC references".  This includes out of
>      bounds GEPs (whose bitwise representation can have any value, for
>      all practical intents and purposes).
>
>
> [Aside: We could phrase the invariant in terms of fixed safepoint
>   locations, but I think focusing on this (stronger) invariant actually
>   makes the problem simpler (and no harder).]
>
> The invariant is somewhat vague and relies on white-box knowledge of
> the GC to define what values "need to be reported to the GC" (this
> will become clearer as we see some examples); and the "problem
> statement" is to come up with a precise set of rules governing the
> GCREF type so that the invariants we need fall out.  Today we enforce
> the invariants downstream by `#if 0` ing out code that breaks it.
> However that's fragile; and more importantly, it does not help other
> people interested in precise GC and LLVM.
>
>
>
> # Examples of Bad Transforms
>
> Let's look at some some examples of bad transforms that are legal
> today (in upstream LLVM) with GCREF == "addrspace(1)*".  Right now
> we'll justify their "badness" by invoking ad-hoc GC specific rules on
> a case by case basis:
>
> ## Integer ->  GCREF Conversion
>
> (%value is unused)
>
> %value = load i64, i64* %ptr
> == Transformed to ==>
> %value = load GCREF, GCREF* (bitcast %ptr)
>
> This is a problem since we've introduced a value of type GCREF that
> may not be a valid GC reference at runtime.  If the compiler can prove
> that %value is actually a valid GC reference then this transform is
> valid, but typically it won't be able to except in cases like "%value
> is provably 0" in which case it is better to just fold the load away.
>
> %value = load i64, i64* %ptr
> == Transformed to ==>
> %value = load i64, i64* %ptr
> %value_gc = inttoptr %value to GCREF  (%value_gc unused)
>
> is invalid for the same reason.
>
>
> ## GCREF ->  Integer conversion
>
> Converting a GCREF to an integer is fine if the integer is unused.
> However, cases like these are problematic:
>
> %v0 = load GCREF, GCREF* %ptr_0
> %v1 = load GCREF, GCREF* %ptr_1
> %cmp = icmp eq %v0, %v1
> == Transformed to ==>
> %v0 = load i64, i64* %ptr_0
> %v1 = load i64, i64* %ptr_1
> %cmp = icmp eq %v0, %v1
>
> Since in the second version if "L" is, say, between %v0 and %v1, "G"
> would not include "v0" which contains a GC reference that the GC may
> want to scan or relocate[0].
>
> For similar reasons
>
> %v0 = load GCREF, GCREF* %ptr_0
> %v1 = load GCREF, GCREF* %ptr_0
> %cmp = icmp eq %v0, %v1
> == Transformed to ==>
> %v0 = load GCREF, GCREF* %ptr_0
> %v0.i = ptrtoint %v0
> %v1 = load GCREF, GCREF* %ptr_0
> %v1.i = ptrtoint %v1
> %cmp = icmp eq %v0.i, %v1.i
>
> is also invalid.
>
> ## Round tripping GCREF's through Integers
>
> The following transform is invalid:
>
> %v0 = load GCREF, GCREF* %loc0
> store GCREF %v0, GCREF* %loc1
> == Transformed to ==>
> %v0 = load i64, i64* (bitcast %loc0)
> store i64 %v0, i64* (bitcast %loc1)
>
> because if "L" in the second version is between the load and the
> store, "G" will not include `%v0` even though we need to report it to
> the GC (otherwise we could propagate a stale bitwise representation of
> the GC reference we loaded in `%v0`).
>
> ## Bad types due to hoisting loads out of control flow
>
> This is bad
>
> if (is_reference) {
>    %val = load GCREF, GCREF* %ptr
> }
> == Transformed to ==>
> %val = load GCREF, GCREF* %ptr
> if (is_reference) {
> }
>
> unless the compiler can prove that %val is a GC reference at its new
> location.  Downstream we model the Java type system in LLVM IR to try
> to prove that %val is a GC reference in its new location, but for
> starters we will have to be conservative upstream.
>
>
>
> # Proposed Solution:
>
> We introduce a "new" LLVM type.  I will still refer to it as GCREF
> here, but it may actually still be "<ty>  addrspace(k)*" where k is
> specially noted in the datalayout.
>
> Semantics:
>
>   1. GCREF represents an equivalence class of values (equivalence
>      relation being "points to a fixed semantic object").  The bitwise
>      representation fluctuates constantly outside the compiler's
>      control (the dual of `undef`), but may have invariants (in
>      particular, we'd like to be able to specify alignment, nonnull
>      etc.).  At any given point in time all GCREF instances pointing to
>      the same object have the same bitwise representation (we need this
>      to make `icmp eq` is well-defined).
>
>   2. GCREF instances can only contain a valid gc reference (otherwise
>      they can't meaningfully "fluctuate" among the various possible
>      bitwise representations of a reference).
>
>   3. Converting GCREF to integers is fine in general, but you'll get an
>      arbitrary "snapshot" of the bitwise value that will generally not
>      be meaningful (unless you are colluding with the GC in
>      implementation defined ways).
>
>   4. Converting integers to GCREF is allowed only if source integer is
>      a bitwise representation of a valid GC reference that is not an
>      out of bounds derived reference.  However, this is difficult for
>      the compiler to infer since it typically will have no fundamental
>      knowledge of what bitwise representation can be a valid GC
>      reference.
>
>   5. Operations that use a GCREF-typed value are "atomic" in using the
>      bitwise representation, i.e., loading from a GCREF typed value
>      does not "internally" convert the GCREF to a normal
>      integer-pointer and then use the integer-pointer, since that would
>      mean there is a window in which the integer-pointer can become
>      stale[1].
>
>   6. A GCREF stored to a location in the heap continues to fluctuate,
>      and keeps itself in sync with the right bitwise representation.
>      In a way, there isn't a large distinction between the GC and the
>      heap -- the heap is part of (or managed by) the GC.
>
> I think (6) is the most controversial of the semantics above, but it
> isn't very different from how `undef` stored to the heap remains
> `undef` (i.e. a non-deterministic N-bit value) and a later load can
> recover `undef` instead of getting a normal N-bit value.
>
>
> ## How do we fare with the examples?
>
> Let's take a look some of the examples, and see how the semantics
> specified above help us:
>
> ### Demoting loads / stores from GCREF to integers is invalid (case 1):
>
> %v0 = load GCREF, GCREF* %ptr_0
> %v1 = load GCREF, GCREF* %ptr_1
> %cmp = icmp eq %v0, %v1
> ==>
> %v0 = load i64, i64* %ptr_0
> %v1 = load i64, i64* %ptr_1
> %cmp = icmp eq %v0, %v1
>
> Invalid because the initial IR is comparing two GC fluctuating
> references, while the the transformed IR will compare their bitwise
> snapshot taken at an arbitrary point in time.
>
>
> ### Demoting loads / stores from GCREF to integers is invalid (case 2):
>
> %v0 = load GCREF, GCREF* %loc0
> store GCREF %v0, GCREF* %loc1
> ==>
> %v0 = load i64, i64* (bitcast %loc0)
> store i64 %v0, i64* (bitcast %loc1)
>
> Invalid because there are two semantic differences between the initial
> and final IR:
>
>   1. In the initial IR we store a fluctuating and up-to-date GCREF,
>      whereas in the final IR we store a potentially stale bitwise
>      snapshot of the value.
>   2. In the initial IR, after the store *%loc1 has a fluctuating GC
>      reference, while in the final IR *%loc1 only has a once-valid
>      snapshot of a GC reference (that will no longer fluctuate).
>
>
> ### Basic store forwarding is valid:
>
> store  GCREF %v0, GCREF* %loc0
> %v1 = load GCREF, GCREF* %loc0
> ==>
> store GCREF %v0, GCREF* %loc0
> %v1 = %v0
>
> The store forwarding is valid since we stored a GCREF that
> semantically points to object O (say), and loaded back a GCREF
> pointing to the same object O.
>
>
> ### Hoisting inttoptr is (generally) invalid:
>
>    if (<condition>) {
>      %val_gc = ptrtoint %val to GCREF
>    }
> ==>
>    %val_gc = ptrtoint %val to GCREF
>    if (<condition>) {}
>
> is invalid since we do not know that `%val` is a valid bitwise
> representation of a GCREF at the point in time we do the `ptrtoint`
> (i.e.<<condition>>  could be controlling whether `%val` is a valid
> bitwise representation of a GCREF).
>
> In theory this isn't different from loads, divisions or any other
> potentially trapping operation, but I think a lot of LLVM directly
> encodes the fact that `CastInst` s are obviously speculatable, so
> maybe Integer ->  GCREF casts should be outlawed at the type system
> level, and be done via an intrinsic?
>
>
> ### Store forwarding between integers and GCREFs is valid
>
> store i64 %val, i64* %loc
> %val = load GCREF, GCREF* (bitcast %loc)
> ==>
> store i64 %val, i64* %loc
> %val = inttoptr %val to GCREF
>
> is valid.
>
> Both the initial and the final program assume that %val is a valid
> bitwise representation of a GC reference.
>
> (Note: it may not be valid to speculate the inttoptr instruction above
> control flow).
>
> ### Load forwarding between integers and GCREFs is invalid
>
> %val_i = load i64, i64* %loc
> %val_p = load GCREF, GCREF* (bitcast %loc)
> ==>
> %val_i = load i64, i64* %loc
> %val_p = inttoptr %val_i to GCREF
>
> is invalid if *%loc contains a GC reference.  Given the model that we
> have, the first program loads some bitwise representation of
> fluctuating a GC reference, and then loads the same GC reference as a
> GCREF; whereas the second program tries to convert a potentially stale
> bitwise representation of a GC reference into a fluctuating GCREF
> (which is not allowed).
>
> ## Implementation Issues&  Impact to LLVM
>
> This is a proposed path forward to implementing GCREF in LLVM.  The
> caveat mentioned at the start of the email (everything is open for
> discussion) especially applies here.
>
> ### Changes to IR to represent GCREF
>
> We change `datalayout` to denote pointers of a specific address space
> as GCREFs.  We add verifier rules forbidding `inttoptr` conversion of
> integers into GCREFs and add an intrinsic (that can be control
> dependent) that converts integers to GCREFs.
>
> ### Changes to the optimizer
>
> We change the optimizer to
>
>   - Not speculate loads of GCREF type
>   - Not change uses of GCREF types to uses of non-GCREF types
>
> Locally we already have changes fixing most of the places that'd need
> to be fixed for the above to hold.
>
> ### Lowering
>
> We do not teach LLC about GCREF's at all (in fact, we can change it to
> assert that the datalayout does not contain a `gc` directive).  We
> make the RewriteStatepointForGC pass rewrite the IR to make
> relocations explicit (so that no more GCREF-like semantics are
> necessary) and remove the GC directive from the Module's datalayout.
>
>
> Thoughts?
>
> -- Sanjoy
>
>
> Footnotes:
>
> [0]:
> To see why this is bad assume to begin with %ptr_0 and %ptr_1 both
> contain a GC reference pointing to object X, currently at address
> 0x500.
>
> Now the following happens
>
> %v0 = load i64, i64* %ptr_0
> %v0 is 0x500
> <<  safepoint>>  GC moves X to 0x600, and update *%ptr_1
> %v1 = load i64, i64* %ptr_1
> %v1 = 0x600
> %cmp = icmp eq %v0, %v1
> %cmp = false, when it should be true
>
> If %v0 was "reported to the GC" in<<safepoint>>  then it would have
> been updated in place to 0x600, and %cmp would evaluate to true, as
> expected.
>
>
> [1]:
> In reality, this is handled by _not_ putting safepoints in problematic
> locations.  For instance, if lowering a CompareAndSwap in some weird
> architecture would involve doing:
>
>    // CAS(From, To, Location)
>    Location ^=<<  Constant>>
>
>    // The GC cannot parse the state at this location "L", since we've
>    // done some arbitrary arithmetic on Location.
>
>    ... Do some magic with Location, To, From
>
> then we _cannot_ put a safepoint at location "L".