[llvm-dev] RFC: Strong GC References in LLVM

Tue Jul 12 15:29:37 PDT 2016

> On Jul 11, 2016, at 11:21 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> 
> InstCombine today will transform
> 
>  %v = load i32*, i32** %ptr0
>  store i32* %v, i32** %ptr1
> 
> into
> 
>  %1 = bitcast i32** %ptr0 to i64*
>  %v1 = load i64, i64* %1, align 8
>  %2 = bitcast i32** %ptr1 to i64*
>  store i64 %v1, i64* %2, align 8

I don’t remember why this is important. I often don't know with InstCombine whether something is needed to expose IR optimization or an ISEL pattern (which should really be somehow denoted and only run in a later pass).

> >
> >> ## GCREF ->  Integer conversion
> >>
> >> Converting a GCREF to an integer is fine if the integer is unused.
> >> However, cases like these are problematic:
> >>
> >> %v0 = load GCREF, GCREF* %ptr_0
> >> %v1 = load GCREF, GCREF* %ptr_1
> >> %cmp = icmp eq %v0, %v1
> >> == Transformed to ==>
> >> %v0 = load i64, i64* %ptr_0
> >> %v1 = load i64, i64* %ptr_1
> >> %cmp = icmp eq %v0, %v1
> >>
> >> Since in the second version if "L" is, say, between %v0 and %v1, "G"
> >> would not include "v0" which contains a GC reference that the GC may
> >> want to scan or relocate[0].
> >>
> >> For similar reasons
> >>
> >> %v0 = load GCREF, GCREF* %ptr_0
> >> %v1 = load GCREF, GCREF* %ptr_0
> >> %cmp = icmp eq %v0, %v1
> >> == Transformed to ==>
> >> %v0 = load GCREF, GCREF* %ptr_0
> >> %v0.i = ptrtoint %v0
> >> %v1 = load GCREF, GCREF* %ptr_0
> >> %v1.i = ptrtoint %v1
> >> %cmp = icmp eq %v0.i, %v1.i
> >>
> >> is also invalid.
> >
> > Converting any GCREF to an integer is BAD, period. A single benign use could float below a safepoint. Imagine if you expanded card marks early before step #3!
> 
> Here I'm trying to constrain the optimizer -- if the frontend wants to
> insert ptrtoints and knows what it is doing (because it has GC
> specific knowledge), then that should be fine.

Sure, if the frontend has some way to guarantee that statepoint insertion can’t happen between ptrtoint and its uses. But IR passes should never generate a ptrtoint from a GCRref, regardless of how it is used. You want very simple rules for pass writers that are also verifiable. If the frontend wants to use ptrtoint on a GCRef it could use an instrinsic for that purpose that is exempt from the GCRef verifier.

> >> ## Bad types due to hoisting loads out of control flow
> >>
> >> This is bad
> >>
> >> if (is_reference) {
> >>   %val = load GCREF, GCREF* %ptr
> >> }
> >> == Transformed to ==>
> >> %val = load GCREF, GCREF* %ptr
> >> if (is_reference) {
> >> }
> >>
> >> unless the compiler can prove that %val is a GC reference at its new
> >> location.  Downstream we model the Java type system in LLVM IR to try
> >> to prove that %val is a GC reference in its new location, but for
> >> starters we will have to be conservative upstream.
> >
> > This is really abstract. Why would the load be hoisted if %ptr is not dereferenceable? Why would it be dereferenceable if GCRef is not valid?
> >
> > I think this has to do with type checks. You could have a valid object, but the GCRef field may only be valid under some type check.
> 
> Yes, in practice the "is_reference" check will be a type check.  You
> can end up with this from Java code that looks like:
> 
> class A {
>  long longField;  // Offset 16, say
> }
> 
> class B {
>  Object objField;  // Offset 16 also
> }
> 
> void f(A a) {
>  Object o = a;
> 
>  // Since a is of type A, we know that it is dereferenceable up to
>  // 16+8 == 24 bytes, but it does not mean that we can speculate
>  // the load of objField.
> 
>  // This is kind of silly as written, but this does happen after
>  // inlining.
> 
>  if (o instanceof B) {
>    // This is dead code, and we will often DCE it away, but we're
>    // not allowed to rely on that happening (before hoisting) for
>    // correctness.
>    Object obj = ((B) o).objField;
>    print(obj.hashCode());
>  }
> }
> 
> > It is definitely important to speculate these loads, but I agree that it needs to be done with special logic that understands GCRef.
> 
> > In particular, you can speculate one such load, but never a dependent load.
> 
> We can't typically speculate dependent loads, but that's because we
> can't (typically) prove dependent loads as loading from non-null
> locations.  E.g. in
> 
> class A {
>  Object o;
> }
> 
> class B {
>  A a;
>  void g() {
>    if (cond) {
>      this.a.o.hashCode();
>    }
>  }
> }
> 
> We could speculate the load of .o if this.a could be proved to be
> non-null.
> 
> > You could solve the dominance problem by introducing some garbage value on the ‘else’ path. But then your simple invariants don’t hold.
> 
> Not sure what you mean by "simple invariants don’t hold", but in the
> example above "void f(A a) {", putting junk on the else path does not
> help.

For that to work, you would have to know where potential statepoints might be, which is why it breaks the invariant that GCRefs are always valid, and why I proposed only doing this within the safepoint insertion pass.

> > This is optimization is probably best done contained in step #3, rewriting.
> 
> I don't think that's a good idea -- we'll have to replicate LICM
> inside RewriteStatepointsForGC, for instance, when we really want it
> to run early.  Without LICM of gc references we can't get rid of range
> checks in simple loops like
> 
>  for (int i = 0; i < this.a.length; i++)
>    this.a[i] = 0;

LICM isn’t speculation, unless its looking at the dereferenceable attribute and the loop has conditions.

I thought your proposed solution was to *never* speculate loads of GCRefs, which is strictly worse than what I’m suggesting.

> > Alternatively, you somehow make the dereferenceable attribute apply to individual fields. Hasn't some sort of “assume dereferenceable” intrinsic been proposed? You would still left with the problem that you want to speculate past one level of type check.
> 
> That's not sufficient for the "void f(A a) {" example -- the field at
> offset 16 is dereferenceable for "a", but it can't be loaded as a
> GCREF.

I think there is an important design question here, because I’ve also heard grumbling about LLVM’s lack of condition dependence tracking outside of the GCRef context.

+CC Daniel Berlin

If you can come up with an IR design for tracking your GCRef load depedencies (usually on specific null and type checks), then that could solve some of the other difficult problems that others are wrestling with (sorry I don’t have a pointer to the discussions). This probably means inserting some cast-to-dereferenceable instruction on the address generation def-use chain.

Otherwise, you have to either impose some subtle rules on IR optimization passes, or make all GCRef loads intrinsics, which is also a maintenance burden.

> >> # Proposed Solution:
> >>
> >> We introduce a "new" LLVM type.  I will still refer to it as GCREF
> >> here, but it may actually still be "<ty>  addrspace(k)*" where k is
> >> specially noted in the datalayout.
> >
> > Sounds good to make this target-configurable.
> >
> >> Semantics:
> >>
> >> 1. GCREF represents an equivalence class of values (equivalence
> >>     relation being "points to a fixed semantic object").  The bitwise
> >>     representation fluctuates constantly outside the compiler's
> >>     control (the dual of `undef`), but may have invariants (in
> >>     particular, we'd like to be able to specify alignment, nonnull
> >>     etc.).  At any given point in time all GCREF instances pointing to
> >>     the same object have the same bitwise representation (we need this
> >>     to make `icmp eq` is well-defined).
> >
> > OK.
> >
> >> 2. GCREF instances can only contain a valid gc reference (otherwise
> >>     they can't meaningfully "fluctuate" among the various possible
> >>     bitwise representations of a reference).
> >
> > Lost me.
> 
> The "runtime" comes in and modifies values of type GCREF to contain
> different bitwise representations pointing to the same logical object.
> It can't do this rewriting if a GCREF value contains some random
> bitwise pattern, since that won't point to a valid object on the heap.

Ok, but you also said the derived GCRef can be any bit pattern due to out-of-bounds addressing. I think there is a constraint that the offset from the GC root can be determined via a straightforward analysis.

> 
> >
> >> 4. Converting integers to GCREF is allowed only if source integer is
> >>     a bitwise representation of a valid GC reference that is not an
> >>     out of bounds derived reference.  However, this is difficult for
> >>     the compiler to infer since it typically will have no fundamental
> >>     knowledge of what bitwise representation can be a valid GC
> >>     reference.
> >
> > I don’t know about this. It seems like the integer’s lifetime could be extended across a safepoint.
> 
> I've explicitly not talked about safepoints at all here -- GCREF
> values always have to hold "valid" references, and what a valid
> reference is changes at arbitrary points in time (and at this layer
> there are no "safepoints" denoting the only places where the
> definition of a valid reference can change).  This means, in general,
> you can not for sure say that an integer contains a bitwise value that
> is a valid GC reference, and even if you have one at a certain point
> in time that it will stay valid in the next instant.

Well, it would be much much easier to understand and verify if inttoptr were simply illegal. It’s confusing because the integer is defined somewhere else, yet it needs to be a valid GCRef at the point of use and you don’t want to allow GCRefs to live in integer values.

As with ptrtoint, this seems like it would be better as a separate intrinsic if some hypothetical frontend needs it.

If this is allowed, doesn’t the integer need to be a GCRoot so a simple analysis can determine the offset?

> 
> >
> >> ### Hoisting inttoptr is (generally) invalid:
> >>
> >>   if (<condition>) {
> >>     %val_gc = ptrtoint %val to GCREF
> >>   }
> >> ==>
> >>   %val_gc = ptrtoint %val to GCREF
> >>   if (<condition>) {}
> >>
> >> is invalid since we do not know that `%val` is a valid bitwise
> >> representation of a GCREF at the point in time we do the `ptrtoint`
> >> (i.e.<<condition>>  could be controlling whether `%val` is a valid
> >> bitwise representation of a GCREF).
> >>
> >> In theory this isn't different from loads, divisions or any other
> >> potentially trapping operation, but I think a lot of LLVM directly
> >> encodes the fact that `CastInst` s are obviously speculatable, so
> >> maybe Integer ->  GCREF casts should be outlawed at the type system
> >> level, and be done via an intrinsic?
> >
> 
> > Is this supposed to be 'inttoptr’? I don’t see how this can work at all, since the integer’s lifetime can extend beyond a safepoint.
> 
> Do you mean the initial example?  That is fine if <condition> is
> always false at runtime.

The title of the example says “inttoptr”…

> > Do you really need to support constant GCRefs? Cliff Click said it was a bad decision.
> 
> Typically we don't need inttoptr to GCREFs, but other runtimes might.
> I'm happy to not support it for now.

Again, this may be better served with a gcref_to_int intrinsic that cannot be hoisted.

> >> ### Load forwarding between integers and GCREFs is invalid
> >>
> >> %val_i = load i64, i64* %loc
> >> %val_p = load GCREF, GCREF* (bitcast %loc)
> >> ==>
> >> %val_i = load i64, i64* %loc
> >> %val_p = inttoptr %val_i to GCREF
> >>
> >> is invalid if *%loc contains a GC reference.  Given the model that we
> >> have, the first program loads some bitwise representation of
> >> fluctuating a GC reference, and then loads the same GC reference as a
> >> GCREF; whereas the second program tries to convert a potentially stale
> >> bitwise representation of a GC reference into a fluctuating GCREF
> >> (which is not allowed).
> >
> > I think the original code is unspecified, maybe undefined at the point of %val_i’s use.
> 
> I'd say that would depend on what use.  For instance, say we have:
> 
> class A {
>  long lField;  // Offset 16
> }
> 
> 
> class B {
>  Object oField;  // Offset 16
> }
> 
> void h(B b) {
>  Object o = b;
>  hC = b.oField.hashCode();
>  if (o instanceof A) {
>    print(((A)o).lField + 20);
>  }
> }
> 
> it is reasonable to optimize "h" to:
> 
> void h(i8* b) {
>  GCREF o = *(b + 16)
>  long l  = *(b + 16)
>  long l2 = l + 20;
>  if (is_instance_of(b, A_class)) {
>    print(l2)
>  }
> }
> 
> Above the use of l in "l + 20" has to be fine, and not UB.

Good example.

Of course, it could be just another instance of a “no inttoptr/ptrtoint” rule.

> >> ## Implementation Issues&  Impact to LLVM
> >>
> >> This is a proposed path forward to implementing GCREF in LLVM.  The
> >> caveat mentioned at the start of the email (everything is open for
> >> discussion) especially applies here.
> >>
> >> ### Changes to IR to represent GCREF
> >>
> >> We change `datalayout` to denote pointers of a specific address space
> >> as GCREFs.  We add verifier rules forbidding `inttoptr` conversion of
> >> integers into GCREFs and add an intrinsic (that can be control
> >> dependent) that converts integers to GCREFs.

This intrinsic sounds like what I’m suggesting, except that I’m suggesting that the frontend can hypothetically generate the intrinsic if needed, and rather than prohibiting `inttoptr/ptrtoint` conversion, you outright inhibit it’s existence, which is trivial to verify.

> > Sounds good to me. If you really need Int->GCRef conversion, then it seems like it needs to be an intrinsic with side effects, contrary to many of your examples.
> 
> Yes.  The other solution is to make `inttoptr` a side effect depending
> on the result type, but that sounds like a bad idea.

I think you want side effects on the Int->GCRef intrinsic (hoisting it is illegal, sinking it below calls is illegal), and don’t want to use inttoptr at all.

> > Or are you thinking that this intrinsic is only generated after lowering GCRef stores in step #4?
> 
> > What about store barriers? Do they need an intrinsic or just `ptrtoint %gcref to i64`?
> 
> As long as stores are correctly typed (i.e. a store of GCREF type has
> not been converted to a store of type i64, like instcombine does
> today), I think it is sufficient to late-rewrite stores into using
> a store barrier based on the `gc_ref_to_int` intrinsic that we'll add.
> 
> 
> >> ### Changes to the optimizer
> >>
> >> We change the optimizer to
> >>
> >> - Not speculate loads of GCREF type
> >> - Not change uses of GCREF types to uses of non-GCREF types
> >>
> >> Locally we already have changes fixing most of the places that'd need
> >> to be fixed for the above to hold.
> >
> > You may be answering this elsewhere in the thread, but are we sure the GC load shouldn’t be an intrinsic?
> 
> That's what others are leaning towards.  I'm somewhat hesitant to do
> that since it will involved changing almost every place that handles
> LoadInst to also handle the gc_load intrinsic (and would be a bad
> thing for the codebase long term), but I'm not 100% opposed to it if
> that's what we need to move forward.

I think this would be just another workaround for LLVM’s lack of dependence tracking on loads, which is not to say it’s wrong thing to do, but just becomes a matter of evaluating the engineering/maintenance burden.

> >> ### Lowering
> >>
> >> We do not teach LLC about GCREF's at all (in fact, we can change it to
> >> assert that the datalayout does not contain a `gc` directive).  We
> >> make the RewriteStatepointForGC pass rewrite the IR to make
> >> relocations explicit (so that no more GCREF-like semantics are
> >> necessary) and remove the GC directive from the Module's datalayout.
> >
> > Ok. Is datalayout normally supposed to be used to convey IR lowering semantics? Or should it be immutable per target?
> >
> > During statepoint lowering, I guess the IR does not change in any locally recognizable way. Pointer types are all still addresspace(1) right? Changing the datalayout to change the meaning of those types makes sense to me if it’s ok with everyone else.
> 
> I see what you're getting at here -- changing the datalayout does seem
> a little hacky.  Maybe there is an efficient way to remap types
> (without deleting and re-creating every instruction def'ing and using
> GCREFs)?

I really don’t know that the "right thing to do" is. I supposed if no one objects to changing datalayout then it’s the simplest option.

-Andy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160712/8afe564e/attachment-0001.html>