[LLVMdev] Improving Garbage Collection

Wed Jul 20 12:39:35 PDT 2011

On Wed, Jul 20, 2011 at 11:20 AM, Andrew Trick <atrick at apple.com> wrote:

> On Jul 19, 2011, at 10:06 AM, Peter Lawrence wrote:
>
> Talin,
>           how about having the front-end generate an llvm.safe.point()
> intrinsic call at
> the desired safe points,  and having the addresses of the GC roots (at that
> point,
> can vary from call to call) be the parameters (with noescape attribute) to
> the intrinsic,
>
> IIUC currently the GC roots are tagged, and all analysis and transform
> optimizations
> have to special case these tagged objects, but if instead their addresses
> were taken at
> the safe points no special casing would have to be done -- all analysis and
> transform
> optimizations already know how to deal with objects whose address is taken,
>
> and since llvm does already have a "noescape" (not sure that's the correct
> name?)
> attribute for parameters, these addresses won't be misinterpreted by any
> alias
> analysis either, and llvm is free to go ahead and keep these values in
> registers
> between safe points  -- you can stop asking how to allow GC roots as SSA
> values,
> any traditional load-store optimization pass will do it for you for free.
>
> *** without you having to insert explicit load and store instructions, and
> having to
> somehow mark them as non-delete-able, or always omit optimization passes
> that
> you would otherwise like to have enabled ***
>
> and also without you having to store NULL into a safe point to end its
> lifetime.
> and I would suggest eliminating the gcroot() intrinsic as it's information
> content
> would be redundant.
>
>
> thoughts, comments ???
>
>
> -Peter Lawrence.
>
>
> It is helpful to think of the stack/register map generation in two
> phases: before and after identifying the location of heap
> pointers. Say PrePtrMap and PostPtrMap. We don't need to know stack
> offsets and register names at that point, but do we need a 1-1 mapping
> from pointer values to identifiable physical locations. I deliberately
> avoid calling these values roots here, because we may have multiple
> live pointers derived from an object and multiple copies of the same
> pointer. GC only needs to see one of these values to trace roots, but
> each of these still needs its own entry in the stack/register map for
> a moving collector.
>
> PrePtrMap, we need a type system to precisely tag all values known to
> contain valid heap pointers. No potentially uninitialized/undefined
> values allowed here. IntPtr/PtrInt casts become control dependent.
>
> PostPtrMap, we need an IR that represents mapped pointers as live-in
> to safepoints, *and* safepoints need to be defined as clobbering all
> locations that may contain a pointer. For example, if we have a
> register map, then we can no longer move and add instruction across a
> call if it may operate on a pointer. Obviously, it's easier to avoid
> invalidating a stack map, but fundamentally the same problem. No
> amount of spilling can bail you out without updating the map.
>
> The current gcroot solution works around the lack of LLVM type system
> support for heap pointers by effectively mapping pointers very early
> (in the front end), reloading roots after safepoints (so we only need
> one map entry per root), and relying on the rules that allows callees
> to write their caller's stack under certain circumstances (someone
> needs to explain these rules to me--is it only possible when an alloca
> pointer is taken?).
>
> Your proposal is exactly the same in terms of when the heap pointers
> are identified. That leaves all of the LLVM optimizer and codegen
> running in PostPtrMap mode. The problem is that LLVM is free to make
> copies of pointers and optimize across call sites without knowing how to
> update the map.
>
> The most efficient way to support GC is to move identification of
> pointer locations as late as possible. Optimization across safepoints
> needs to be effectively disabled after that point.
>
> -Andy
>
>
Achievement unlocked: People who are smarter than me and more knowledgeable
about the LLVM backends are arguing over the details of how GC ought to work
:)

> On Jul 18, 2011, at 10:53 PM, Talin wrote:
>
> On Mon, Jul 18, 2011 at 11:00 AM, Peter Lawrence <
> peterl95124 at sbcglobal.net> wrote:
>
>> Talin,
>>           do you identify safe-points in the current or proposed llvm
>> scheme, and if so how,
>> or are they implicit as being at all call sites (which begs the question
>> what about leaves
>> in the call tree, how does GC get started at all in that case).
>>
>> The LLVM linker has a feature where you can specify what kind of safe
> points your collector requires - the options are Loop, Return, PreCall and
> PostCall. You can also override this behavior and examine each instruction
> and return a boolean indicating whether it is or isn't a safe point.
>
> Currently I only have function calls as safe points, although I may
> eventually enable loops as well. As far as leaf functions go, consider that
> the call to allocate memory is also a safe point - and if a function doesn't
> allocate any memory then we don't care if the GC is involved or not.
>
> One complication with the current scheme is that the frontend has to have a
> sense of where the safe points are going to be. Because the current scheme
> requires the frontend to insert additional loads and stores around safe
> points (for spilling register values to memory so they can be traced), the
> frontend has to be able to guess which function call might be a safe point -
> but it can't know for sure due to the fact that optimization and inlining
> (which happens much later) may cause the removal of the actual call
> instruction. The safe but inefficient approach is to insert the extra loads
> and stores around every call instruction.
>
> Peter Lawrence.
>>
>
> --
> -- Talin
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110720/0f2a24a1/attachment.html>