[PATCH] D104268: [ptr_provenance] Introduce optional ptr_provenance operand to load/store

Jeroen Dobbelaere via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 21 14:14:41 PST 2022


jeroen.dobbelaere added a comment.

Hi Ralph,

thanks for the feedback !

Greetings,

Jeroen



================
Comment at: llvm/docs/LangRef.rst:645-647
+provenance is decoupled from the actual pointer computation. As computations
+are not needed to track the origin of a pointer, those can be omitted for the
+``ptr_provenance`` operand. Dependencies on ``PHI`` and ``select`` instructions
----------------
RalfJung wrote:
> This is confusing, what does it mean that "computations can be omitted"? Computations *that do not alter provenance* can be omitted, sure -- for example `getelementptr`. But other operations could affect provenance and those are still relevant. (E.g., passing a pointer to a function as a `noalias` argument gives it a fresh distinct provenance, so such a computation/operation cannot be omitted even on the provenance path.)
> 
> Provenance is an inherent part of a value of pointer type, as far as correctness arguments for LLVM analyses and transformations go. Trying to treat it as any less real than the directly observable bits and bytes will only lead to trouble such as https://github.com/llvm/llvm-project/issues/34577.
> This is confusing, what does it mean that "computations can be omitted"? 

This should indeed be: `Computations that **do not change provenance** can be omitted.`


================
Comment at: llvm/docs/LangRef.rst:651-652
+
+Alias analysis can make use of both, the computed pointer value and the
+provenance to come up with alias conclusions.
+
----------------
RalfJung wrote:
> In particular this sounds just wrong, if `ptr_provenance` is merely sugar for a "combine address with provenance" intrinsic: if `ptr_provenance` is present, it would be a bug for alias analysis to make any conclusions based on the provenance of the pointer argument.
Hmm. The wording seems to allow a misinterpretation. You cannot choose if the 'provenance' is that of the pointer computation or if it is the decoupled one. If the provenance is decoupled, that is what you need to use.



================
Comment at: llvm/docs/LangRef.rst:9988-9991
+The optional ``ptr_provenance`` argument, when present, specifies a separate
+pointer provenance path for the ``pointer`` operand of the ``load`` instruction.
+See :ref:`Pointer Provenance<_ptr_provenance>`. When it is not present, the
+``pointer`` operand can be used for the pointer provenance.
----------------
RalfJung wrote:
> RalfJung wrote:
> > Rust and C also have a notion of "pointer provenance", but it is subtly difference: provenance in those languages is something that flows with pointer values. IOW, a value of pointer type consists of some address in memory, plus some "provenance" metadata. LLVM also has that kind of provenance, it is needed e.g. to explain some of the behavior of getelementptr. (A pointer returned by getelementptr without inbounds can go out-of-bounds but must not be used to access memory outside the bounds of the allocation it started with. In other words, the pointer "remembers" the allocation it is associated with in some way that is separate from the integer address it points to.)
> > 
> > Is it a good idea to use the same term for this slightly different concept? A load operation already receives provenance from its pointer argument, and now with this patchset it *also* receives something else, but also called provenance, via a separate argument.
> (replying to my own comment after realizing there was prior discussion on that topic here -- sorry, I am completely lost in all this "patch stack" business)
> 
> >  So the ptr_provenance is just an optimized representation of the intrinsic, that makes sense.
> 
> Okay so sounds like this is supposed to completely override the provenance that comes with the pointer itself? Does that mean that it is now legal to do something like 
> - use `getelementptr %ptr1` to compute some `%ptr1a` that is address-equal to some other `%ptr2`
> - `load %ptr1a ptr_provenance %ptr2_provenance`
> 
> This would usually be UB if `ptr2` points to a different allocation, since ptr1a is "based on" the wrong allocation.
> 
> In other words, when `ptr_provenance` is present, only the address of the pointer argument matters, and for everything else (including things like which allocation this pointer was "derived from"), only `ptr_provenance` matters?
> In other words, when `ptr_provenance` is present, only the address of the pointer argument matters, and for everything else (including things like which allocation this pointer was "derived from"), only `ptr_provenance` matters?
Yes. That is indeed the case.

Note that this is an extension from the original goal of the `ptr_provenance`. The original goal was to provide a path where restrict annotations were added so that those would not clobber the pointer value. In that world, the pointer value and `ptr_provenance` would come together again at some point. The extension allows to completely separate the pointer value and the `ptr_provenance`. This provides a solution to handle this kind of optimizations in a correct way.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104268/new/

https://reviews.llvm.org/D104268



More information about the llvm-commits mailing list