[PATCH] D104547: [langref] attempt to clarify semantics of inttoptr/ptrtoint for non-integral types

Mon Jun 28 16:27:03 PDT 2021

reames added a comment.

In D104547#2836163 <https://reviews.llvm.org/D104547#2836163>, @loladiro wrote:

> So, I missed the original discussion of this, but I'm somewhat concerned about this direction. We very heavily rely on non-integral pointers in Julia. I'm concerned that removing the verifier rules will allow non-sound inttoptr/ptrtoint transformations to sneak in undetected. In general, I'm ok with the semantics proposed in this revision, but I would be much happier if they were disallowed entirely. Perhaps it is time to allow specifying more extensive address space attributes in the IR. For example, there is the constant discussion of whether geps may be commuted across addrspacecasts, which is just very end-user dependent. And now we have this possible distinction between "hard" and "soft" non-integral pointers. I think just letting frontends specify more precise semantics for their non-integral pointers may help alleviate them being pulled in so many different directions. Lastly, this may not be for the revision, but I take it the main motivation for this change is to be able to compute offsets between pointers to the same underlying object? I'm sympathetic to that use case, but could we just have a version of `sub` that did that directly? These offsets are well defined and stable, whereas allowing ptrtoint to give incompletely defined volatile answers seems very likely to wreak havoc.

I wish we had well-defined semantics for address space casts.  If we did, we could achieve reasonably well-defined semantics for NI conversion via addrspacecast + ptrtoint of an integral pointer.  We don't.  Worse, I see little evidence of progress in that direction.

On the pointer subtraction question, I'm increasingly seeing that as a valuable IR construct for optimization quality reasons.  It doesn't really solve anything for NI types though, so this is mostly a red herring.

The two root issues which caused me to relax the rules were:

1. Every actual user I know of - with maybe you as an exception - had removed the verifier rules.  As such, we were specifying a language that literally no one used or tested.
2. Dead code.  The optimizer is generally free to emit arbitrarily broken code along dynamically dead paths.  As a specific example, if a non-integral pointer's base is proven null (which is zero, even for NI pointers), we can have an NI pointer with a provably integral value.  This can happen both directly, and indirectly via LIV and constant ranges.  I don't remember the exact test case, but we did see real cases where the optimizer was exploiting undef in a way which produced a ptrtoint along a dead path, and it was really hard to argue that we shouldn't.

I'm happy to hear proposals for other approaches, but I really don't see any which are reasonably pragmatic.

https://reviews.llvm.org/D104547