<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Mon, Jul 11, 2016 at 3:44 PM Eli Friedman <<a href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">On Mon, Jul 11, 2016 at 2:28 PM, Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">ping!<br>
<br>
Sanjoy Das wrote:<br></blockquote></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_1813619088283792144h5">
# Proposed Solution:<br>
<br>
We introduce a "new" LLVM type. I will still refer to it as GCREF<br>
here, but it may actually still be "&lt;ty&gt; addrspace(k)*" where k is<br>
specially noted in the datalayout.<br>
<br>
Semantics:<br>
<br>
1. GCREF represents an equivalence class of values (equivalence<br>
relation being "points to a fixed semantic object"). The bitwise<br>
representation fluctuates constantly outside the compiler's<br>
control (the dual of `undef`), but may have invariants (in<br>
particular, we'd like to be able to specify alignment, nonnull<br>
etc.). At any given point in time all GCREF instances pointing to<br>
the same object have the same bitwise representation (we need this<br>
to make `icmp eq` well-defined).<br>
<br>
2. GCREF instances can only contain a valid gc reference (otherwise<br>
they can't meaningfully "fluctuate" among the various possible<br>
bitwise representations of a reference).<br>
<br>
3. Converting GCREF to integers is fine in general, but you'll get an<br>
arbitrary "snapshot" of the bitwise value that will generally not<br>
be meaningful (unless you are colluding with the GC in<br>
implementation defined ways).<br>
<br>
4. Converting integers to GCREF is allowed only if the source integer is<br>
a bitwise representation of a valid GC reference that is not an<br>
out-of-bounds derived reference. However, this is difficult for<br>
the compiler to infer since it typically will have no fundamental<br>
knowledge of what bitwise representation can be a valid GC<br>
reference.<br>
<br>
5. Operations that use a GCREF-typed value are "atomic" in using the<br>
bitwise representation, i.e., loading from a GCREF typed value<br>
does not "internally" convert the GCREF to a normal<br>
integer-pointer and then use the integer-pointer, since that would<br>
mean there is a window in which the integer-pointer can become<br>
stale[1].<br>
<br>
6. A GCREF stored to a location in the heap continues to fluctuate,<br>
and keeps itself in sync with the right bitwise representation.<br>
In a way, there isn't a large distinction between the GC and the<br>
heap -- the heap is part of (or managed by) the GC.<br>
<br>
I think (6) is the most controversial of the semantics above, but it<br>
isn't very different from how `undef` stored to the heap remains<br>
`undef` (i.e. a non-deterministic N-bit value) and a later load can<br>
recover `undef` instead of getting a normal N-bit value.<br></div></div></blockquote></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I'm not really convinced that the GCREF type is really necessary... consider an alternate model:<br><br></div><div>1. A GCREF is never loaded into a register; it's either on the heap, or in an alloca.<br></div><div>2. Add an intrinsic gcref.copy which copies a gcref between two allocas.<br></div><div>3. Add intrinsics gcref.load_gcref(GCREF*, GCREF*, offset) and
gcref.store_gcref(GCREF*, GCREF*, offset, value) which load and store a gcref through a gcref.<br>4. Add intrinsics gcref.load_value(GCREF*, offset) and
gcref.store_value(GCREF*, offset, value) which load and store normal
values through a gcref.<br></div><div>5. The statepoint lowering pass gets rid of the allocas.<br></div><div><br></div>Keeping GCREFs exclusively in memory means the LLVM optimizer will handle them conservatively, but correctly.<br><br></div><div class="gmail_quote">I guess the problem with this is precisely that the LLVM optimizer will handle them conservatively... but on the flip side, I think you're going to end up chasing down weird problems forever if a "load" from an alloca has side-effects.<br></div></div></div></blockquote><div><br></div><div>I think everything but this last weird aspect we already get from address spaces.</div><div><br></div><div>I misread the proposal originally and didn't understand that the problem was loading from an alloca *holding* the GC pointer, and thus it was a normal and boring load that somehow has to have side-effects.</div><div><br></div><div>I fundamentally think that we can't do that. I can see several ways to make the result work without that.</div><div><br></div><div>- Teach the statepoint rewriting to handle hoisted loads in some way (haven't thought too much about how feasible this is)</div><div>- Tell LLVM that the load has this weird control dependence with some mechanism (make it a special gc load intrinsic, or a volatile load, or ....)</div><div><br></div><div>-Chandler</div></div></div>