[llvm] [IR] Introduce captures attribute (PR #116990)

Alexander Richardson via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 21 08:02:14 PST 2024


================
@@ -3318,10 +3348,104 @@ Pointer Capture
 ---------------
 
 Given a function call and a pointer that is passed as an argument or stored in
-the memory before the call, a pointer is *captured* by the call if it makes a
-copy of any part of the pointer that outlives the call.
-To be precise, a pointer is captured if one or more of the following conditions
-hold:
+memory before the call, the call may capture two components of the pointer:
+
+  * The address of the pointer, which is its integral value. This also includes
+    parts of the address or any information about the address, including the
+    fact that it does not equal one specific value.
+  * The provenance of the pointer, which is the ability to perform memory
+    accesses through the pointer, in the sense of the :ref:`pointer aliasing
+    rules <pointeraliasing>`. We further distinguish whether only read accesses
+    are allowed, or both reads and writes.
+
+For example, the following function captures the address of ``%a``, because
+it is compared to a pointer, leaking information about the identity of the
+pointer:
+
+.. code-block:: llvm
+
+    @glb = global i8 0
+
+    define i1 @f(ptr %a) {
+      %c = icmp eq ptr %a, @glb
+      ret i1 %c
+    }
+
+The function does not capture the provenance of the pointer, because the
+``icmp`` instruction only operates on the pointer address. The following
+function captures both the address and provenance of the pointer, as both
+may be read from ``@glb`` after the function returns:
+
+.. code-block:: llvm
+
+    @glb = global ptr null
+
+    define void @f(ptr %a) {
+      store ptr %a, ptr @glb
+      ret void
+    }
+
+The following function captures *neither* the address nor the provenance of
+the pointer:
+
+.. code-block:: llvm
+
+    define i32 @f(ptr %a) {
+      %v = load i32, ptr %a
+      ret i32 %v
+    }
+
+While address capture includes uses of the address within the body of the
+function, provenance capture refers exclusively to the ability to perform
+accesses *after* the function returns. Memory accesses within the function
+itself are not considered pointer captures.
+
+Comparison of a pointer with a null pointer is generally also considered an
+address capture. As an exception, if the pointer is known to be either null
+or in bounds of an allocated object, it is not considered an address capture.
+As such, the following example does not capture the pointer argument due to
+the presence of the ``dereferenceable_or_null`` attribute:
+
+.. code-block:: llvm
+
+    define i1 @f(ptr dereferenceable_or_null(4) %a) {
+      %c = icmp eq ptr %a, null
+      ret i1 %c
+    }
+
+We can further say that the capture only occurs through a specific location.
+In the following example, the pointer (both address and provenance) is captured
+through the return value only:
+
+.. code-block:: llvm
+
+    define ptr @f(ptr %a) {
+      %gep = getelementptr i8, ptr %a, i64 4
+      ret ptr %gep
+    }
+
+However, we always consider direct inspection of the pointer address
+(e.g. using ``ptrtoint``) to be location-independent. The following example
----------------
arichardson wrote:

Does this mean `ptrtoint` unconditionally captures provenance? I'd like to be able to avoid this for CHERI where it would only be an address capture.

But I guess we avoid this by making sure to always use our local `llvm.cheri.cap.address.get` intrinsic instead of `ptrtoint` (although ptrtoint is more helpful for optimizations/known bits).
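
To make the question concrete, here is a minimal sketch of the two forms being contrasted. The function names are hypothetical, and the intrinsic's mangled name and signature (`.i64` suffix, `addrspace(200)` argument) are assumptions about the CHERI fork rather than anything stated in this thread; the intrinsic does not exist in upstream LLVM. Whether the first form would count as an address-only capture is exactly the open question above.

```llvm
; Form 1: plain ptrtoint, which exposes the integral address of %a.
define i64 @addr_via_ptrtoint(ptr %a) {
  %i = ptrtoint ptr %a to i64
  ret i64 %i
}

; Form 2: CHERI-specific intrinsic that reads only the address.
; ASSUMPTION: name mangling and signature are illustrative only.
declare i64 @llvm.cheri.cap.address.get.i64(ptr addrspace(200))

define i64 @addr_via_intrinsic(ptr addrspace(200) %a) {
  %i = call i64 @llvm.cheri.cap.address.get.i64(ptr addrspace(200) %a)
  ret i64 %i
}
```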

https://github.com/llvm/llvm-project/pull/116990

