[llvm-dev] RFC: Strong GC References in LLVM

Mon Jul 11 14:46:06 PDT 2016

My high-level comment is that I'm not really sure we need a new type here.
I'm curious whether we can make non-zero-address-space pointers have the
semantics you need in a conservative model.

The reason I say this is that the requirements you have here actually are
very similar to what I would expect other users of custom address spaces to
want. While they (and you probably) would have domain *specific*
optimizations you would like to perform, the generic optimizer needs to
keep its hands off to avoid breaking invariants. Specifically looking at
your examples:

On Fri, Jun 24, 2016 at 12:23 AM Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> # Examples of Bad Transforms
>
> ## Integer -> GCREF Conversion
>
> (%value is unused)
>
> %value = load i64, i64* %ptr
> == Transformed to ==>
> %value = load GCREF, GCREF* (bitcast %ptr)
>

I think this should definitely be invalid for non-zero address spaces.
Getting a non-zero address space out-of-thin-air is nuts (provided it's not
a CSE-based transform that proves equivalence as you indicate in your more
detailed writeup).

> %value = load i64, i64* %ptr
> == Transformed to ==>
> %value = load i64, i64* %ptr
> %value_gc = inttoptr %value to GCREF  (%value_gc unused)
>

And here as well. We shouldn't (IMO) be adding out-of-thin-air address
spaces, and definitely not inttoptr operations on them!!!

> ## GCREF -> Integer conversion
>

I thought non-zero address spaces already made this flat out impossible?
This is really not OK for many users of address spaces...

> ## Round tripping GCREF's through Integers
>

Same as above. I thought many non-zero address spaces don't have
meaningful integer representations? This doesn't seem an unreasonable
requirement to me honestly...

> ## Bad types due to hoisting loads out of control flow
>
> This is bad
>
> if (is_reference) {
>   %val = load GCREF, GCREF* %ptr
> }
> == Transformed to ==>
> %val = load GCREF, GCREF* %ptr
> if (is_reference) {
> }
>
> unless the compiler can prove that %val is a GC reference at its new
> location.  Downstream we model the Java type system in LLVM IR to try
> to prove that %val is a GC reference in its new location, but for
> starters we will have to be conservative upstream.
>

I think this is the interesting thing. It is essentially saying that loads
of a non-zero address space pointer are control dependent, which I don't
think is necessarily true with our current definitions of address spaces,
but I think this might be a very reasonable desire.

If anything, we might want something in the datalayout that identifies
whether particular address spaces can be speculatively loaded from. This is
not terribly dissimilar from the memory scopes proposal which essentially
wants to provide some encoding of basic transformations to non-zero address
spaces that are allowed while keeping everything else conservative by
default.
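
To make the shape of that idea concrete, here is a rough sketch of a
per-address-space permission table that is conservative unless someone opts
in. Everything in it is hypothetical: AddrSpacePolicy, AddrSpacePolicies,
optIn, lookup and maySpeculateLoad are names made up for illustration, not
existing LLVM APIs or datalayout syntax, and the two flags are just the two
properties discussed in this thread.

```cpp
#include <unordered_map>

// Hypothetical permissions for one address space (not an LLVM type).
struct AddrSpacePolicy {
  bool allowSpeculativeLoad;  // may a load be hoisted above its guard?
  bool allowIntegerCasts;     // are inttoptr/ptrtoint meaningful here?
};

// Hypothetical table: non-zero address spaces get nothing unless opted in.
class AddrSpacePolicies {
public:
  // Explicit opt-in, e.g. by a GPU target that wants generic transforms.
  // Address space 0 is not configurable in this sketch.
  void optIn(unsigned addrSpace, AddrSpacePolicy policy) {
    if (addrSpace != 0)
      table_[addrSpace] = policy;
  }

  AddrSpacePolicy lookup(unsigned addrSpace) const {
    // Assumption: address space 0 keeps today's generic behaviour.
    if (addrSpace == 0)
      return AddrSpacePolicy{true, true};
    auto it = table_.find(addrSpace);
    // Conservative default: no speculation, no integer casts.
    return it == table_.end() ? AddrSpacePolicy{false, false} : it->second;
  }

private:
  std::unordered_map<unsigned, AddrSpacePolicy> table_;
};

// The question a load-hoisting transform would ask before moving a load
// out of control flow. knownSafeToLoad stands in for whatever the optimizer
// already has to prove today (e.g. dereferenceability).
bool maySpeculateLoad(const AddrSpacePolicies &policies, unsigned addrSpace,
                      bool knownSafeToLoad) {
  return knownSafeToLoad && policies.lookup(addrSpace).allowSpeculativeLoad;
}
```

With a table like this, a GC frontend would simply never opt its address
space in, so the hoisting in the last example would be rejected, while a
GPU target could call something like optIn(3, {true, false}) to get
speculation back without also permitting integer casts.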

Personally, I'd be happy to make the default even more conservative to
address the GC use case and add a mechanism to opt out of that for GPUs and
other users that want very generic transforms to continue applying.

So I'm curious whether this seems reasonable to you (and others). To
summarize: make address spaces sufficiently conservative by default to
satisfy the requirements for GC pointers, and add "opt-in" mechanisms for
generic transforms that existing users of non-zero address spaces actually
desire.

Alternatively, I'm interested in any examples that more firmly illustrate
the reasons why address spaces are fundamentally the wrong *model* for GC
pointers.

-Chandler