[LLVMdev] Code for late safepoint placement available

Tue Jun 10 11:15:14 PDT 2014

On Thu, Jun 5, 2014 at 9:45 AM, Philip Reames <listmail at philipreames.com>
wrote:

> Thanks for the comments and for taking a look.
>
>
> On 06/05/2014 02:19 AM, David Chisnall wrote:
>
>> Hi Philip,
>>
>> The first thing that I notice on looking at the code is the lack of
>> comments.  For example, about the only comment that I see in include
>> /llvm/IR/Statepoint.h is a note telling me that a class is only intended to
>> be used on the stack.  Doxygen comments and, for a feature like this, some
>> high-level overview of how it's intended to be used are essential.
>>
> Agreed.  This is mandatory before upstreaming.  We've gone through a
> couple of implementations and before sharing I tried to go through and rip
> out the actively misleading comments.  Unfortunately, that didn't leave
> much left.  :(
>
>  We currently use address spaces on our architecture to differentiate
>> between traditional pointers and fat pointers (both of which are supported
>> on our system in hardware).  It would be nice, rather than hard-coding an
>> address space, to have a property on DataLayout that would tell you if a
>> specific address space contained GC pointers.  We could then designate an
>> address space that was both GCable and contained fat pointers, and another
>> for non-GCable fat pointers.
>>
> Making this configurable is definitely desirable.  I like your idea of
> including a list of address spaces in data layout which contain GC pointers
> and making that query-able.  That sounds quite reasonable.  Unless anyone
> makes another proposal, I'll run with this.

Does it need to be a list, or could it be a range? It the set of GC address
spaces was contiguous over some range of integers, then you'd only need to
store a min / max value, which might make the implementation simpler.

>
>
>
>> The tests don't look to be in a sensible format for LLVM - in particular,
>> LLVM tests should not depend on clang.  It would be better, I think, to
>> commit the IR generated from your Python script than the script itself
>> (although putting it in utils might be helpful for people to regenerate
>> them).
>>
> By the point we upstream this, those particular tests should probably just
> be removed.  When we were iterating on possible designs, it was useful not
> having the tests in IR form.  At this point, those tests have lost most of
> their value.  Creating a set of IR tests which use address spaces properly
> and cover all the same corner cases is probably the right approach.
>
> Philip
>
>
>> On 4 Jun 2014, at 17:35, Philip Reames <listmail at philipreames.com> wrote:
>>
>>  As I've mentioned on the mailing list a couple of times over the last
>>> few months, we've been working on an approach for supporting precise fully
>>> relocating garbage collection in LLVM.  I am happy to announce that we now
>>> have a version of the code available for public view and discussion.
>>>
>>> https://github.com/AzulSystems/llvm-late-safepoint-placement
>>>
>>> Our goal is to eventually see this merged into the LLVM tree.  There's a
>>> fair amount of cleanup that needs to happen before that point, but we are
>>> actively working towards that eventual goal.
>>>
>>> Please note that there are a couple of known issues with the current
>>> version (see the README).  This is best considered a proof of concept
>>> implementation and is not yet ready for production use.  We will be
>>> addressing the remaining issues over the next few weeks and will be sharing
>>> updates as they occur.
>>>
>>> In the meantime, I'd like to get the discussion started on how these
>>> changes will eventually land in tree.  Part of the reason for sharing the
>>> code in an early state is to be able to build a history of working in the
>>> open, and to to able to merge minor fixes into the main LLVM repository
>>> before trying to upstream the core changes.  We are aware this is a fairly
>>> major change set and are happy to work within the community process in that
>>> regard.
>>>
>>> I've included a list of specific questions I know we'd like to get
>>> feedback on, but general comments or questions are also very welcome.
>>>
>>> Open Topics:
>>>         • How should we factor the core GC support for review?  Our
>>> current intent is to separate logically distinct pieces, and share each
>>> layer one at a time.  (e.g. first infrastructure enhancements, then
>>> intrinsics and codegen support, then verifiers, then safepoint insertion
>>> passes)  Is this the right approach?
>>>         • How configurable does the GC support need to be for inclusion
>>> in LLVM?  Currently, we expect the frontend to mark GC pointers using
>>> address spaces.  Do we need to support alternate mechanisms?  If so, what
>>> interface should this take?
>>>         • How should we approach removing the existing partial support
>>> for garbage collection? (gcroot)  Do we want to support both going forward?
>>>  Do we need to provide a forward migration path in bitcode?  Given the
>>> usage is generally though MCJIT, we would prefer we simply deprecate the
>>> existing gcroot support and target it for complete removal a couple of
>>> releases down the road..
>>>         • What programmatic interface should we present at the IR level
>>> and where should it live?  We're moving towards a CallSite like interface
>>> for statepoints, gc_relocates, and gc_results call sites.  Is this the
>>> right approach?  If so, should it live in the IR subtree, or Support?
>>>  (Note: The current code is only about 40% migrated to the new interface.)
>>>         • To support invokable calls with safepoints, we need to make
>>> the statepoint intrinsic invokable.  This is new for intrinsics in LLVM.
>>>  Is there any reason that InvokeInst must be a subclass of CallInst?
>>> (rather than a view over either calls or invokes like CallSite)  Would
>>> changes to support invokable intrinsics be accepted upstream?  Alternate
>>> approaches are welcome.
>>>         • Is the concept of an abstract VM state something LLVM should
>>> know about?  If so, how should it be represented?  We're actively exploring
>>> this topic, but don't have strong opinions on the topic yet.
>>>         • Our statepoint shares a lot in the way of implementation and
>>> semantics with patchpoint and stackmap.  Is it better to submit new
>>> intrinsics, or try to identify a single intrinsic which could represent
>>> both?  Our current feeling is to keep them separate semantically, but share
>>> implementation where possible.
>>>
>>> Yours,
>>> Philip (& team)
>>>
>>> p.s. Sanjoy, one of my co-workers,  will be helping to answer questions
>>> as they arise.
>>>
>>> p.p.s. For those wondering why the current gcroot mechanism isn't
>>> sufficient, I covered that in a previous blog post:
>>> [1] http://www.philipreames.com/Blog/2014/02/21/why-not-use-gcroot/
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140610/a4c83f15/attachment.html>