[PATCH] D103492: [RS4GC] Treat inttoptr as base pointer

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 4 14:35:46 PDT 2021


reames added a comment.

In D103492#2799986 <https://reviews.llvm.org/D103492#2799986>, @tolziplohu wrote:

> I think RS4GC should make sure that the address space it uses has been specified non-integral and give an error if it hasn't, as this seems like an easy trap to fall into.

Good idea, post a patch and it'd be a quick LGTM.

>> This is a really really hard design problem. I'm happy to discuss this offline, and I'm open to having a rule for handling inttoptr with explicit calls outs on the UB, but I'm worried you are misunderstanding the problem and are going to find yourself down a much deeper rabbit hole than you expect.
>
> It is possible I'm misunderstanding, so here's how I see the problem: converting a GC pointer to an integer and then to a pointer again is UB and bad, because the collector can't see it in between. However, many languages require pretending an integer is a pointer. In these cases, it's not actually a valid pointer, and it's not a problem if the GC can't see it; the problem is, it's hard to make that work without accidentally making it possible to hide a real pointer as an integer. I see three main options for dealing with this:
>
> 1. Do nothing, leave it as-is. In this case, the way to do something like this is to hide the `inttoptr` along with an `addrspacecast` to the non-integral address space in a `noinline` function without GC. This works, but is bad for optimization and performance in general, since an `inttoptr` should ideally be a no-op.
> 2. Allow `inttoptr` (and `addrspacecast`, which is currently allowed in release builds but not debug builds) in RS4GC, with a note that accessing memory through a pointer created by `inttoptr` is invalid (a linter to catch simple cases of that might be a good idea?). This is ideal for optimization and performance, since it lowers to a no-op and many optimization passes can see through it, but it's easier for frontend authors to miss and create invalid casts.
> 3. Add conversion intrinsics, something like `declare i8 addrspace(1)* @llvm.gc.inttoptr(i64)`. This is harder for frontend authors to screw up, since to find the intrinsic they'll be looking at the docs which will specify that the returned pointer isn't a valid pointer. From a performance perspective, it's in between the other two options, because while it lowers to a no-op, it's opaque to optimizations.
>
> The third option sounds most attractive to me, now that I better understand the risks of the second option. Does my evaluation of the problem sound accurate?

All of this depends a lot on the language.

A few quick examples:

- You could spell a "pin" operator which prevents an object from being moved, and an "unpin" that releases.  Within that dynamic scope, it's fine to loose copies of the pointer, provided at least one isn't lost.  In this world, the conversion operator only exists from reference to integer, not the other way around.  Such a conversion can be hoisted no further than the containing pin operation.
- For that same language with a non-relocating GC, you can let the conversion float above the pin.
- A language could allow a "create object from raw memory" mechanism, with all the invariants about a) having valid object state, and b) having the creation object being a *move* of the raw pointer.  (e.g. the raw pointer can't survive the creation.)  In this model, we'd need the optimizer never to look back through the conversion operator.  (e.g. a transform which did convert(p)->f to p->f would be wrong.)

I have yet to find/come up with a sufficiently general model to describe all reasonable variants.  At the moment, I'd suggest your variant (1), but as you note, that impacts optimization.  Are there concrete examples you're trying to optimize?

Warning: I'm really bad at responding promptly to long design discussions in email.  I strongly advise setting up a time to talk offline.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103492/new/

https://reviews.llvm.org/D103492



More information about the llvm-commits mailing list