[PATCH] Statepoint infrastructure for garbage collection

Thu Oct 16 14:51:58 PDT 2014

On 10/15/2014 02:52 PM, Philip Reames wrote:
> Kevin,
>
> Let me try to answer the point you're getting at.  In doing so, I want 
> to explicitly separate the statepoint intrinsics which are currently 
> up for review, and the future late safepoint placement. The statepoint 
> intrinsics have value separate from the late safepoint placement 
> approach, and I want to justify them on their own merits.
>
> The basic problem we're trying to solve with these intrinsics is 
> supporting fully relocating collectors.  By definition, such a 
> collector needs to be precise w.r.t. root tracking.  Even worse, we 
> need to ensure that *all copies* of a pointer are updated.  It is not 
> acceptable to make two copies of a pointer, update one of them, then 
> use the other for a memory access.
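To make the "two copies" hazard concrete, here is a minimal Python sketch of a moving collector. It is a toy model only: `MovingHeap`, `collect`, and the address values are hypothetical names invented for illustration, not anything in LLVM or in the patch. The collector updates only the root slots it is told about, so an unreported copy is left dangling.

```python
class MovingHeap:
    """Toy heap whose collector moves every surviving object to a new address."""

    def __init__(self):
        self.objects = {}       # address -> payload
        self.next_addr = 0x1000

    def alloc(self, payload):
        addr = self.next_addr
        self.next_addr += 0x10
        self.objects[addr] = payload
        return addr

    def load(self, addr):
        # Raises KeyError when addr no longer names a live object.
        return self.objects[addr]

    def collect(self, roots):
        """Move every object named in `roots`, rewriting each root slot in place."""
        forwarding = {}
        survivors = {}
        for old in roots:
            if old not in forwarding:
                new = self.next_addr
                self.next_addr += 0x10
                survivors[new] = self.objects[old]
                forwarding[old] = new
        for i, old in enumerate(roots):
            roots[i] = forwarding[old]
        self.objects = survivors


heap = MovingHeap()
p = heap.alloc("payload")
q = p                 # a second copy of the same pointer
roots = [p]           # only one of the two copies is reported to the collector
heap.collect(roots)
p = roots[0]          # the reported copy was updated

try:
    heap.load(q)      # the unreported copy still holds the old address
    stale_copy_failed = False
except KeyError:
    stale_copy_failed = True
```

In this model the access through `p` succeeds and the access through `q` fails. A real collector would not fail so politely; it would read whatever now lives at the old address.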
>
> If the compiler is allowed to introduce derived pointers (i.e. pointer 
> valued temporaries created by the compiler which point somewhere 
> within an object, or outside it, but associated with it), we also need 
> to track which *object* each *pointer* to be updated is associated 
> with.  This is required to safely update the pointers.
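A small Python sketch of why the base object has to be known. The addresses, sizes, and the helpers `relocate_derived` and `guess_base_by_address` are assumptions made up for this illustration. A pointer one past the end of object A is numerically equal to the start of an adjacent object B, so the address alone cannot say which object the pointer belongs to.

```python
def relocate_derived(derived, old_base, new_base):
    """Recompute a derived pointer by preserving its offset from the base."""
    return new_base + (derived - old_base)


def guess_base_by_address(ptr, layout):
    """Naive alternative: pick the object whose address range contains ptr."""
    for base, size in layout.items():
        if base <= ptr < base + size:
            return base
    return None


# Two adjacent 64-byte objects before the collection (hypothetical layout).
old_a, old_b = 0x1000, 0x1040
layout = {old_a: 64, old_b: 64}

# Where the collector moved each object.
new_a, new_b = 0x8000, 0x9000
forwarding = {old_a: new_a, old_b: new_b}

interior = old_a + 24       # points inside A
one_past_a = old_a + 64     # derived from A, but numerically equal to B's start

# With the base recorded explicitly, both pointers relocate correctly.
new_interior = relocate_derived(interior, old_a, new_a)
new_one_past = relocate_derived(one_past_a, old_a, new_a)

# Without it, the one-past-the-end pointer is attributed to the wrong object.
guessed_base = guess_base_by_address(one_past_a, layout)
wrong_one_past = relocate_derived(one_past_a, guessed_base, forwarding[guessed_base])
```

The guessed relocation lands at the start of the moved B rather than one past the end of the moved A, which is why the association has to be tracked rather than inferred.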
>
> For the sake of argument, let's say our frontend does safepoint 
> insertion.
>
> There are a couple of approaches which seem like they might work. Let's 
> explore each in turn:
> - We could use patchpoints to record all the values needed for the GC 
> stack map.  This mostly works, but requires that the patchpoint not be 
> marked readonly or readnone (to prevent illegal reorderings).  That 
> could be a usage convention.  The real problem is that the compiler is 
> still free to introduce multiple *copies* of an SSA value over the 
> patchpoint.  (This is completely legal under SSA semantics.)  When it 
> does so, it creates a situation where the gc could fail to update a 
> pointer which will then be dereferenced. That's a bug.  Worth stating 
> explicitly, I believe the patchpoint scheme would be sufficient *if 
> you do not ever relocate a root*.
> - We could use gc.root.  gc.root defines the allocas, but does not 
> define the call format, or any of the mechanisms to ensure proper 
> relocation.  As such, it *by itself* is not viable.  Also, gc.root 
> inherently assumes every value will have a stack slot. Without *heavy* 
> reengineering, there's no way to have a gc pointer in a callee saved 
> register over a call site.  This is an unfortunate limitation.  Any 
> call representation without explicit relocation suffers from the same 
> bug as the patchpoint scheme.
> - We could combine gc.root allocas and patchpoints.  This essentially 
> combines the flaws (no gc pointers in callee saved registers over 
> calls, and missed copies), with no benefit.
>
> The statepoint intrinsics are basically the patchpoint option above, 
> but with relocation made explicit in the IR.  While it's still legal 
> for the optimizer to create a copy of the value feeding a statepoint, 
> that's now okay.  By construction, there can be no use of the original 
> SSA value (and thus the copy) after the statepoint. Instead, the 
> explicitly relocated value is used.
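The "by construction" argument can be modelled in a few lines of Python. This is only an analogy under stated assumptions: `statepoint`, `move`, and `caller` are hypothetical names, and the function below is not the signature of the actual intrinsics. The point is that the safepoint hands back new values, and everything after it uses those rather than the values that went in.

```python
def statepoint(call, live_ptrs, move):
    """Run `call` at a safepoint, then return its result and the relocated pointers.

    `move` stands in for the collector: it maps an old address to its new one.
    """
    result = call()
    relocated = tuple(move(p) for p in live_ptrs)
    return result, relocated


# Hypothetical collector decision: the object at 0x1000 moves to 0x8000.
forwarding = {0x1000: 0x8000}


def move(ptr):
    return forwarding.get(ptr, ptr)


def caller(obj):
    copy = obj  # a copy of the value feeding the statepoint is harmless
    _, (obj_relocated,) = statepoint(lambda: None, [copy], move)
    # From here on only obj_relocated is used; obj and copy are dead,
    # so there is no stale value left to dereference.
    return obj_relocated


new_obj = caller(0x1000)
```

However many copies exist before the call, none of them is live after it, so none of them can be left behind.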
>
> To summarize: We need (something like) statepoints for correctness of 
> fully relocating collectors.
>
> (The points I'm making here are somewhat subtle.  If it would help to 
> have IR examples here, ask.  I'm deferring writing them because it's 
> time consuming.)
I need to withdraw this part of my comments.  After further reflection 
and discussion offline, I was reminded that you can implement full 
relocation semantics with gcroot.  The part about patchpoints stands, 
but the gcroot comments are inaccurate.

I need to leave early today, but I plan to respond tomorrow with a more 
complete analysis of the tradeoffs between gcroots and statepoints.  
Sorry for the confusion.
>
>
> Other advantages of the statepoint approach:
>
> The gc.relocate intrinsics (part of the statepoint proposal) also 
> makes it explicit in the IR what the base object of each pointer to be 
> relocated is.  This isn't *required* (you could encode the same 
> information in the arguments of the statepoint), but making it 
> explicit is much cleaner.
>
> The explicit relocation notation has the potential to be extended 
> into the backend.  With some register allocator changes (not part of 
> this patch!), we could support gc pointers in callee saved registers.  
> This is possible with the (incorrect) patchpoint scheme.  It is 
> possible, but *hard*, with the gc.root scheme.
>
> The posted patch includes a couple of small optimizations (e.g. null 
> forwarding) that help performance, but could (probably) be implemented 
> on top of another scheme.  We have a number of planned optimizations 
> on the statepoint mechanism.
>
>
> Now, let me finally bring up late safepoint placement. The only real 
> impact on this patch is that, to date, we have only focused on the 
> *correctness* of a statepoint passing through the optimizer.  We have 
> not attempted to teach the optimizer about how to leverage one or 
> perform optimizations over one.  There's room for improvement here 
> (e.g. not completely blocking inlining), but we prefer to approach 
> this problem by simply inserting them late.   You could instead choose 
> to insert them at generation time, and teach the optimizer about their 
> semantics.  That *strategy choice* is independent of the 
> representation chosen, provided that representation is *correct*.
>
> Yours,
> Philip
>
> On 10/14/2014 07:01 PM, Kevin Modzelewski wrote:
>> I think a change like this might be more compelling if you could give 
>> more detail on how it would actually help (I can't find the detail 
>> I'm looking for in your blog posts).  It seems like the value of this 
>> patch is that it will work with late safepoint placement, but it'd be 
>> nice to see some examples of cases where late safepoint placement 
>> gives you something that early safepoint placement (i.e. by the 
>> frontend) doesn't.  It kind of feels like either approach will work 
>> well with only non-gc values, and neither approach will be able to do 
>> much optimization when you do function calls.  I'm not trying to 
>> claim that that's necessarily true, but it'd be easier to understand 
>> your point if there was some example IR.
>>
>> http://reviews.llvm.org/D5683
>>
>>
>