[llvm-dev] RFC: Strong GC References in LLVM
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Mon Jul 18 11:54:51 PDT 2016
I think it is time to start getting more concrete here. As a starting
point, I want to send out for review (roughly) the following changes:
- Add a "gc" address space to the datalayout string
- Start implementing the non-controversial rules (i.e. everything
  except the bits that initiated the "nospeculate" attribute)
  - No pointer <-> integer casts for GC address spaces to begin with
  - Add an intrinsic (with control dependence) to
    convert GCrefs -> integers (we need this for GC load/store
  - Disable some of the problematic "cast by round tripping through
    memory" type optimizations for loads and stores that are of GC
The things above are things we know we need, and even if all we do is
implement those, we will be in a better position overall.
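
As a rough illustration of the "no pointer <-> integer casts for GC
address spaces" rule above, here is a minimal, self-contained sketch.
Everything in it (ToyDataLayout, ToyCast, isCastAllowed) is a
hypothetical stand-in made up for this example and is not LLVM API; it
only assumes that the datalayout can somehow answer "is addrspace(k)
a GC address space?".

```cpp
#include <cassert>
#include <set>

// Toy stand-ins, NOT LLVM classes: just enough to state the rule.
enum class CastKind { PtrToInt, IntToPtr, BitCast };

struct ToyDataLayout {
  // Hypothetical: the set of address spaces marked "gc" in the datalayout.
  std::set<unsigned> gcAddrSpaces;
  bool isGCAddrSpace(unsigned as) const { return gcAddrSpaces.count(as) != 0; }
};

struct ToyCast {
  CastKind kind;
  unsigned pointerAddrSpace; // address space of the pointer operand or result
};

// Returns true if the cast is allowed under the proposed rule: ptrtoint and
// inttoptr are rejected when the pointer lives in a GC address space.
// Only that one rule is modeled here; nothing else from the list is.
bool isCastAllowed(const ToyDataLayout &dl, const ToyCast &c) {
  bool isPtrIntCast =
      c.kind == CastKind::PtrToInt || c.kind == CastKind::IntToPtr;
  return !(isPtrIntCast && dl.isGCAddrSpace(c.pointerAddrSpace));
}

// Usage example, assuming addrspace(1) is the one marked "gc".
void castRuleExamples() {
  ToyDataLayout dl;
  dl.gcAddrSpaces.insert(1);
  assert(!isCastAllowed(dl, ToyCast{CastKind::PtrToInt, 1})); // rejected
  assert(isCastAllowed(dl, ToyCast{CastKind::PtrToInt, 0}));  // non-GC: fine
}
```

In a real implementation a check like this would presumably live in the
IR verifier and in the transforms that introduce such casts.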
One thing I want a design opinion on (already discussed on IRC): I'm
planning to phrase RewriteStatepointsForGC (a ModulePass) as a pass that
"implements" GC references "in terms of" normal pointers. One way to
do this is to rewrite each def and user of GC refs to use a normal
pointer, but that's unnecessary data structure churn, so I was
wondering if we can instead flip the meaning of what a GC ref is by
modifying the datalayout? RewriteStatepointsForGC can then be
seen as changing IR that can be lowered to run only on a "machine"
that directly supports GC pointers to IR that can be lowered to run on
machines that don't. That is, RewriteStatepointsForGC will change IR from
"No explicit relocations, addrspace(k) is marked as 'gc' in the
datalayout" to "All relocations explicit, addrspace(k) is not marked
specially in the datalayout".
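
To make the "flip the meaning" idea concrete, here is a toy model of
the before and after states described above. ToyModule and
toyRewriteStatepointsForGC are hypothetical names invented for this
sketch, not the actual pass or any LLVM API, and the "gc" marking is
modeled as a plain set since its real datalayout syntax is not decided.

```cpp
#include <cassert>
#include <set>

// Toy model of the two module states, NOT real LLVM data structures.
struct ToyModule {
  // Hypothetical: address spaces marked "gc" in the datalayout.
  std::set<unsigned> gcAddrSpaces;
  // True once every relocation is explicit in the IR.
  bool relocationsExplicit = false;
};

// Sketch of the proposed contract: the input has addrspace(k) marked "gc"
// and no explicit relocations; the output has all relocations explicit and
// addrspace(k) no longer marked specially.
void toyRewriteStatepointsForGC(ToyModule &m, unsigned k) {
  assert(m.gcAddrSpaces.count(k) != 0 &&
         "addrspace(k) must be marked 'gc' before the rewrite");
  // The real pass makes relocations explicit at safepoints
  // (gc.statepoint / gc.relocate); that work is elided in this toy.
  m.relocationsExplicit = true;
  // The "flip": pointers keep their type, only the marking changes.
  m.gcAddrSpaces.erase(k);
}
```

Note that no def or user of a GC ref is touched in this model; only the
module-level marking changes, which is exactly the part that is
contentious for an optimization pass.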
However, Chandler had some (strong?) reservations on IRC about
modifying datalayout in an optimization, in the face of which I have a
few alternatives:
- Have RewriteStatepointsForGC rewrite defs and users of GC
  references to use a "normal" pointer type. I'm a little hesitant
  to do this since it seems wasteful (no evidence yet that it will
  matter), and may complicate keeping side data structures correct in
  the face of mass invalidations.
- Represent the gc address space in something other than the
  datalayout that we all can agree is fair game to be modified by a
  ModulePass. Not a great option since datalayout seems the most
  natural place to put the "gc-ref-addrspace" information.
- Don't do anything, i.e. RewriteStatepointsForGC does what it does
  today: it rewrites pointers of addrspace(1) (or addrspace(k) for
  some k) to be explicit but does not change the meaning of
  "addrspace(k)". I'm hesitant to do this because then I can't
  concisely answer "what does RewriteStatepointsForGC do?".
I want to see what others think about this, but in the absence of any
specific opinion here I'll go with the first option (and consider
using mutateType if things turn out to be too slow).
In parallel with all this, I'll try to come up with a concrete notion
of what the nospeculate attributes on loads and function calls will
look like, how they would interact with optimizations like mem2reg, etc.
I'll consider potential interactions with
https://reviews.llvm.org/D20116 "Add speculatable function attribute"
and generally just kick it around to see if the idea holds up and
gives us all of the constraints we need.