[llvm-dev] RFC: alloca -- specify address space for allocation

Fri Aug 28 13:50:08 PDT 2015

Inline

> -----Original Message-----
> From: Philip Reames [mailto:listmail at philipreames.com]
> Sent: Friday, August 28, 2015 9:38 AM
> To: Swaroop Sridhar <Swaroop.Sridhar at microsoft.com>; llvm-dev <llvm-
> dev at lists.llvm.org>; Sanjoy Das <sanjoy at playingwithpointers.com>
> Cc: Joseph Tremoulet <jotrem at microsoft.com>; Andy Ayers
> <andya at microsoft.com>; Russell Hadley <rhadley at microsoft.com>
> Subject: Re: RFC: alloca -- specify address space for allocation
> 
> On 08/27/2015 04:24 PM, Swaroop Sridhar wrote:
> > Inline.
> >
> > From: Philip Reames [mailto:listmail at philipreames.com]
> > Sent: Thursday, August 27, 2015 11:01 AM
> > To: Swaroop Sridhar <Swaroop.Sridhar at microsoft.com>; llvm-dev
> > <llvm-dev at lists.llvm.org>; Sanjoy Das <sanjoy at playingwithpointers.com>
> > Cc: Joseph Tremoulet <jotrem at microsoft.com>; Andy Ayers
> > <andya at microsoft.com>; Russell Hadley <rhadley at microsoft.com>
> > Subject: Re: RFC: alloca -- specify address space for allocation
> >
> > *trimmed for length*
> >> I'm not directly opposed to this proposal, but I'm not really in support of it
> either.
> >> I think there are a number of smaller engineering changes which could be
> made to RewriteStatepointsForGC to address this issue.
> >> I am not convinced we need to allow addrspace on allocas to solve that
> problem.
> >> More generally, I'm a bit bothered by how you're asserting that a pointer to
> a stack based object is the same as a managed pointer into the heap.
> >> They share some properties - the GC needs to know about them and
> mark through them - but they're also moderately different as well - stack
> based
> >>   objects do not get relocated, heap ones do.  Given these differences, it's
> not even entirely clear to me that these two classes of pointers should be
> treated the same.
> >> In particular, I don't see why RewriteStatepointsForGC needs to insert
> explicit relocations for stack based objects.
> >> That makes no sense to me.
> > Yes pointers to the stack are different in that they are not relocated or
> collected.
> > However, CLR's "managed pointers" are defined as a super-type of
> > pointers-to-the GC-heap and pointers-to-unmanaged-memory.
> > These managed pointers should be reported to the GC. They will be
> > updated during a collection if the referenced object is relocated.
>>
>
> Rather than explaining what your expectations are, can you explain why?
> For example, I'm assuming here that you need to report stack based objects
> exclusively for determining liveness of objects they might contain pointers to
> (i.e. identifying roots).  I'm assuming that the actual stack allocation does not
> get any special magic handling by the GC other than marking through it.
> Correct?

I'm not sure what explanation you're asking for. But I think there are a few different aspects here:
(1) GC-Pointers living in the stack: All pointers to heap objects (including pointers to the middle of objects) held in stack slots and in registers must be reported for seeding liveness computation. This part is not CLR specific.
(2) Pointers to stack slots: Pointers to stack locations that are typed as (gc) managed-pointers need not be reported, because the GC does not handle them.
(3) Managed pointers (which could point to the heap or elsewhere) must be reported if we cannot definitively establish that they don't point to the heap.

> > Wrt the RewriteStatepointsForGC phase:
> > If we know that a particular managed pointer is definitely coming from an
> alloca, it is OK to not report it, and not insert relocations.
> > However, sometimes we cannot tell this unambiguously, if a managed
> pointer comes from a PHI(alloca pointer, heap pointer).
> This is a sound point.  So the conservatively correct thing to do is to relocate
> all SSA pointers which are either heap objects or stack objects (e.g. all
> managed pointers in your terms).  

Yes, relocating and reporting all managed pointers is conservatively correct.
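
To make the ambiguous case concrete, here is a minimal sketch of a managed pointer that merges a stack address with a heap pointer. The function and value names (@select_ptr, @use, %heap, %loc) are hypothetical and chosen only for illustration; the use of addrspace(1) for managed pointers follows the example later in this mail.

  declare void @use(i8 addrspace(1)*)

  define void @select_ptr(i1 %c, i8 addrspace(1)* %heap) {
  entry:
    %loc = alloca i8
    br i1 %c, label %stack, label %merge

  stack:
    ; Stack address typed as a managed pointer.
    %s = addrspacecast i8* %loc to i8 addrspace(1)*
    br label %merge

  merge:
    ; %p may be a stack address or a heap pointer, depending on %c.
    %p = phi i8 addrspace(1)* [ %s, %stack ], [ %heap, %entry ]
    call void @use(i8 addrspace(1)* %p)
    ret void
  }

At a safepoint in the merge block, %p cannot be classified statically, so it has to be reported and relocated like any other managed pointer.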

> An inference pass which proved that a
> particular SSA value was based on an alloca (address of a stack object) would
> not need to be relocated, but would need to be tracked for liveness
> purposes?  Right now, we really don't have that distinction (live vs needing
> relocation) within RewriteStatepointsForGC.  Do we need to add it?

If the inference pass can prove that a particular SSA value comes from an alloca, 
we can skip reporting and relocating the alloca pointer -- as an optimization.
As an orthogonal issue -- if the stack location (pointed to) contains any gc-pointers within it, 
then those locations will still need to be considered for reporting/relocation.
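
A small sketch of that orthogonal case, again with hypothetical names (@may_gc, @f, %slot, %obj): the address of the slot comes straight from an alloca, but the slot holds a gc-pointer that is live across a call that may trigger a collection.

  declare void @may_gc()

  define void @f(i8 addrspace(1)* %obj) {
  entry:
    ; The slot itself lives on the stack and is never moved by the GC.
    %slot = alloca i8 addrspace(1)*
    store i8 addrspace(1)* %obj, i8 addrspace(1)** %slot
    call void @may_gc()
    ; The value reloaded here must reflect any relocation of %obj.
    %v = load i8 addrspace(1)*, i8 addrspace(1)** %slot
    ret void
  }

Here the pointer %slot need not be reported, but the gc-pointer stored in it does.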

> (Also, I've been expecting to see patches from you fixing bugs in RSForGC
> around alloca handling.  You're using it in ways I never designed for, so I'm a bit
> surprised not to have seen these.  Have you audited the code and tested the
> cases you care about?  Having concrete test cases in tree would make a lot of
> this discussion easier.)

The RSForGC phase already tracks the liveness of alloca pointers.

For example:
  %loc0 = alloca i8
  %0 = addrspacecast i8* %loc0 to i8 addrspace(1)*, !dbg !10
  %safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()*@CORINFO_HELP_POLL_GC::JitHelper, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %0)

I had to make some changes in StatepointLowering to work around some special handling of alloca pointers.
I have not posted them for review yet, because I want to get more test coverage first (with more MSIL functions).
But I can certainly check in the test cases (and also prepare the change for review) if that makes the discussion easier.

Swaroop.