[llvm-dev] RFC: Strong GC References in LLVM

Thu Jul 14 16:48:23 PDT 2016

Hi all,

It looks like the key controversial point is the bit about the extra
control dependence on loads and stores[0].  Generally the consensus is
that (please chime in if you think otherwise) it is not reasonable to
make the safety (or semantics) of a load instruction depend on the
type it is loading.  Introducing such a thing would involve changing
the IR semantics in a fundamental way, and therefore has a high bar
for acceptance.

Here is a summary of the alternative solutions that were proposed here
and on IRC (thanks Chandler, Andy, Eli!):

  1. Model loads and stores of GC references as intrinsic calls: add
     llvm.gc_load, llvm.gc_store intrinsics, and optimize them as loads
     and stores whenever appropriate and legal.

  2. Introduce a flag on loads and stores that either
       a. Denotes a "gc_safety" control dependence.
       b. Denotes a "blackbox_safety" control dependence.  In this case
          we will probably have some optional metadata on loads and
          stores to indicate that the control dependence is actually on
          GC safety.

     As a starting point, LLVM will conservatively not speculate such
     loads and stores; and will leave open the potential to upstream
     logic that will have a more precise sense of when these loads and
     stores are safe to speculate.

  3. Introduce a general way to denote control dependence on loads and
     stores.  This can be helpful to LLVM in general, and will let us
     basically implement a more precise version of (2).

# Tradeoffs

(1) is the easiest to implement initially, but I think it will be bad
for LLVM in the long term -- every place that looks at loads and
stores will now have to look at this additional set of intrinsics too.
This won't be terribly complex, but it will be noisy.
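
To make the "noisy" part concrete, here is a rough sketch of what a
typical "is this a load/store?" query would turn into under (1).  This
is a toy model, not LLVM's real class hierarchy: `ToyInst`, `Opcode`,
`isPlainLoad`, `isLoadLike` and `isStoreLike` are names made up for
illustration, and only the intrinsic names `llvm.gc_load` and
`llvm.gc_store` come from the proposal above.

```cpp
#include <string>

// Toy stand-in for an IR instruction (hypothetical, not llvm::Instruction).
enum class Opcode { Load, Store, Call, Other };

struct ToyInst {
  Opcode Op;
  std::string Callee; // only meaningful when Op == Opcode::Call
};

// What a pass asks today: one opcode check.
bool isPlainLoad(const ToyInst &I) { return I.Op == Opcode::Load; }

// What the same pass would have to ask under option (1): the opcode
// check plus a match on the proposed intrinsic.
bool isLoadLike(const ToyInst &I) {
  if (I.Op == Opcode::Load)
    return true;
  return I.Op == Opcode::Call && I.Callee == "llvm.gc_load";
}

// Same story for stores.
bool isStoreLike(const ToyInst &I) {
  if (I.Op == Opcode::Store)
    return true;
  return I.Op == Opcode::Call && I.Callee == "llvm.gc_store";
}
```

Any call site that keeps using the `isPlainLoad`-style check would
simply treat a GC load as an opaque call, which is correct but loses
the optimization.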

I personally like (2) the most.  It fits in well with the current
framework: since most (all?) load/store speculation has to do some
sort of safety check anyway, we can fold the "gc_safety" or
"blackbox_safety" logic into those safety checks.  In practice maybe
we can even factor this into one of the `isSimple`- or
`isUnordered`-like checks.
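
As a rough sketch of what folding the flag into the existing checks
might look like (a toy model only: `ToyMemOp`, `HasGCSafetyDep` and
`isSafeToSpeculate` are hypothetical names, and the `isSimple` /
`isUnordered` bodies below only approximate what the real predicates
check):

```cpp
// Simplified atomic orderings; the real enum has more members.
enum class Ordering { NotAtomic, Unordered, Monotonic, SeqCst };

// Toy stand-in for a load or store (hypothetical, not llvm::LoadInst).
struct ToyMemOp {
  bool IsVolatile;
  Ordering Order;
  bool HasGCSafetyDep; // hypothetical flag from option (2a)
};

// Approximation of the existing check: neither atomic nor volatile.
bool isSimple(const ToyMemOp &M) {
  return !M.IsVolatile && M.Order == Ordering::NotAtomic;
}

// Approximation of the existing check: at most unordered, not volatile.
bool isUnordered(const ToyMemOp &M) {
  return !M.IsVolatile && (M.Order == Ordering::NotAtomic ||
                           M.Order == Ordering::Unordered);
}

// Conservative starting point: never speculate an operation that
// carries the flag.  More precise logic could later replace the last
// conjunct.
bool isSafeToSpeculate(const ToyMemOp &M) {
  return isUnordered(M) && !M.HasGCSafetyDep;
}
```

Code that already goes through a check like this picks up the new
restriction for free; only code that inspects loads and stores without
any safety check would need to be audited.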

(3) is probably the cleanest solution, but I'm worried about its scope
-- given that this will be a large investment, I'm worried I'll spin my
wheels on this for a long time and ultimately realize that it isn't
really the right fix.

Thoughts?
-- Sanjoy

[0]: I may have (incorrectly) mentioned otherwise on IRC, but we need
to model safety properties of stores as well, to avoid transforms
like:

   %x = malloc()  ;; known thread local
   if (cond_0) {
     store GCREF %val to %x
   }
   if (cond_1) {
     store i64 %val to %x
   }

to

   %x = malloc()  ;; known thread local
   if (cond_0 || cond_1) {
     store GCREF %val to %x  ;; This "speculative" store is bad
     if (cond_1) {
       store i64 %val to %x
     }
   }