[llvm-dev] [GC / Statepoints] Collector supports only base pointers as stack roots

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Mon Jan 4 14:38:28 PST 2016



On 01/04/2016 02:30 PM, Manuel Jacob wrote:
> On 2016-01-04 18:27, Philip Reames wrote:
>> Fundamentally, the optimizer expects to be able to introduce derived
>> pointers (both interior and exterior) in the form of GEPs.  This is
>> unavoidable.
> Sure.
>
>> If your collector is non-relocating, you can simply ignore the derived
>> portion of the relocation record.  We report both the base pointer and
>> the derived pointer for each derived pointer.  If you take all the
>> reported base pointers, then unique the set, you should have all of
>> the objects you need to report as roots.
> Sure.
>
>> If your collector is relocating, you'll need to give more information.
>>  In particular, I'm not sure how you can implement a relocating
>> collector which doesn't know about derived pointers without some
>> really messy hacks.
> Can you explain how a collector could possibly handle derived 
> pointers?  In the meantime I found a solution, but I'm not sure it 
> works in the general case and whether it qualifies as a "really messy 
> hack": The collector, for each base, relocates the base pointer, 
> computes an offset between the old and the new address, and applies 
> that offset to all derived pointers from this base.
I feel like we may be talking past each other here.  Let me start fresh.

To handle derived pointers, a GC needs to update the derived pointer to 
point to the same offset within the relocated object.  So if the 
object is originally at address X and is relocated to address Y, a 
derived pointer which is X+C needs to become Y+C.  Different derived 
pointers within the same allocation may have distinct values of 'C' 
(i.e. their offset from the base of the object).

(If I read your comment right, this is exactly what you've implemented, 
just phrased differently.)
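
As a rough illustration of both cases (non-relocating and relocating), 
here is a minimal C++ sketch.  Everything in it is hypothetical: the 
RelocRecord struct, the forwarding map, and the function names are 
assumptions made for the example, not the statepoint stack map format 
or any LLVM API.  It only shows the arithmetic: unique the bases to 
get the roots, and turn X+C into Y+C when an object moves.

```cpp
#include <cstdint>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical record: one (base, derived) pair reported at a safepoint.
// For a plain base pointer, derived == base.
struct RelocRecord {
    std::uintptr_t base;
    std::uintptr_t derived;
};

// Non-relocating case: ignore the derived half and unique the bases.
std::set<std::uintptr_t> uniqueBases(const std::vector<RelocRecord>& records) {
    std::set<std::uintptr_t> roots;
    for (const RelocRecord& r : records)
        roots.insert(r.base);
    return roots;
}

// Relocating case: 'forwarding' maps old base address X to new address Y
// (an assumed piece of collector state).  Each derived pointer X+C
// becomes Y+C.  Unsigned wraparound keeps this correct for exterior
// pointers that lie below the base.
void relocateRecords(
    std::vector<RelocRecord>& records,
    const std::unordered_map<std::uintptr_t, std::uintptr_t>& forwarding) {
    for (RelocRecord& r : records) {
        auto it = forwarding.find(r.base);
        if (it == forwarding.end())
            continue;  // object did not move
        const std::uintptr_t offset = r.derived - r.base;  // C, taken before the base changes
        r.base = it->second;
        r.derived = it->second + offset;
    }
}
```

Note that the offset is computed per record, so two derived pointers 
into the same object keep their own distinct values of C.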

If the GC itself doesn't know how to do this, the compiler can choose to 
essentially do the computation of C explicitly and rematerialize the 
derived value after the safepoint.  This isn't great, but it is 
correct.  So, you'd end up with something along the lines of:
%der1 = gep %X, %C
%offset = (uint64_t)%der1 - (uint64_t)%X  // which happens to constant fold to C in this case
(%Y) = safepoint(%X)
%der2 = gep %Y, %offset
use(%der2)

We heuristically choose to rematerialize derived pointers with small 
constant offsets.  Extending the remat code to handle all derived 
pointers was what I meant by a "messy hack".
>
> -Manuel
>
>> Philip
>>
>> On 12/30/2015 06:51 PM, Manuel Jacob wrote:
>>> Hi,
>>>
>>> My collector supports only base pointers as stack roots.  This 
>>> wasn't a problem until I tried to run some optimizations before 
>>> RS4GC, which introduced (interior) derived pointers. The statepoint 
>>> documentation mentions that these collectors exist, but doesn't 
>>> mention whether and how this is currently supported.  What could I 
>>> do to make it work?
>>>
>>> -Manuel


