[llvm] LangRef: allocated objects can grow (PR #141338)

Sat May 24 02:08:30 PDT 2025

llvmbot wrote:




@llvm/pr-subscribers-llvm-ir

Author: Ralf Jung (RalfJung)

<details>
<summary>Changes</summary>

Based on discussion with @nikic. Also Cc @nunoplopes.

This enables the (reasonably common) pattern of using `mmap` to reserve, but not actually map, a wide range of pages, and then adding more pages only as memory is actually needed. Effectively, that region of memory is one big allocated object for LLVM, but crucially, that allocated object *changes its size*.
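
For readers unfamiliar with the pattern, here is a minimal sketch of one common variant on POSIX systems: reserve address space with `PROT_NONE`, then make a growing prefix accessible with `mprotect`. The `region` type and the `region_reserve`/`region_grow` helpers are hypothetical names made up for this illustration; real allocators differ in detail (for example, some re-`mmap` with `MAP_FIXED` instead of calling `mprotect`).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: one reserved address range of which only a
 * prefix is accessible. From LLVM's point of view this would be a single
 * allocated object whose size grows over time. */
typedef struct {
    unsigned char *base; /* start of the reserved range */
    size_t reserved;     /* maximal size, fixed at reservation time */
    size_t committed;    /* current size; only ever increases */
} region;

/* Round n up to a multiple of the page size (overflow is ignored here). */
static size_t round_up_to_page(size_t n) {
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = ps > 0 ? (size_t)ps : 4096;
    return (n + page - 1) / page * page;
}

/* Reserve address space without making any of it accessible. */
static int region_reserve(region *r, size_t max_size) {
    size_t len = round_up_to_page(max_size);
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->reserved = len;
    r->committed = 0;
    return 0;
}

/* Grow the accessible prefix to at least new_size bytes.
 * Shrinking and growing past the reservation are both rejected. */
static int region_grow(region *r, size_t new_size) {
    size_t target = round_up_to_page(new_size);
    if (target < r->committed || target > r->reserved)
        return -1;
    if (target == r->committed)
        return 0;
    if (mprotect(r->base + r->committed, target - r->committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->committed = target;
    return 0;
}
```

Here `reserved` plays the role of the object's maximal size and `committed` its current size; the helper refuses to shrink, matching the restriction proposed in this patch.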

Having an allocated object grow seems entirely compatible with what LLVM optimizations assume, *except* that when LLVM sees an `alloca` or similar instruction, it will assume that a pointer that has been offset via `getelementptr inbounds` by more than the size of the allocated object cannot alias that `alloca`. But for allocated objects that are created e.g. by `mmap`, where LLVM does not know their size, this cannot happen anyway.

The other main point to be concerned about is having a `getelementptr inbounds` that is moved up across an operation that grows an allocated object: this should be legal as `getelementptr` is freely reorderable. We achieve that by saying that for allocated objects that change their size, "inbounds" means "inbounds of their maximal size", not "inbounds of their current size".
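
To make the distinction concrete, here is a toy model in C (not LLVM code; `object_model` and both predicates are hypothetical names). It separates the two questions: whether an offset is "inbounds" is judged against the maximal size, while whether an access is valid depends on the current size. The second predicate is my reading of what follows from the memory not being mapped yet, and the model only covers non-negative byte offsets from the start of the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a growable allocated object. */
typedef struct {
    size_t current_size; /* bytes accessible right now */
    size_t max_size;     /* largest size the object will ever have */
} object_model;

/* "inbounds" for a growable object: compare against the maximal size.
 * Pointing one past the end is still in bounds, hence <=. */
static bool offset_is_inbounds(const object_model *o, size_t offset) {
    return offset <= o->max_size;
}

/* An actual load or store of len bytes needs memory that exists now. */
static bool access_is_valid(const object_model *o, size_t offset, size_t len) {
    return offset <= o->current_size && len <= o->current_size - offset;
}
```

With `current_size = 4096` and `max_size = 1 << 20`, offset 8192 is already in bounds even though an access there is not valid yet. Growing the object later changes only the second answer, which is why moving the address computation above the growing operation does not change its result.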

It would be nice to also allow shrinking allocations (e.g. by `munmap`ing pages at the end), but that is trickier. Consider an example like this:
- load 4 bytes from `ptr`
- call some function
- load 1 byte from `ptr`

Right now, LLVM could argue that since `ptr` clearly has not been deallocated, there must be at least 4 bytes of dereferenceable memory behind `ptr` after the call. If allocations can shrink, this kind of reasoning is no longer valid. I don't know if LLVM actually does reasoning like that -- I think it should not, since I think it should be possible to have allocations that shrink -- but to remain conservative I am not proposing that as part of this patch.
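
The three-step example above could look like this at the source level. This is only an illustrative sketch: `load_call_load` and `counting_callee` are hypothetical names, and the callee here is a trivial stand-in for a function the optimizer cannot see into.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for "some function"; in the scenario above it would be opaque
 * to the optimizer and might, hypothetically, shrink the allocation. */
static int calls;
static void counting_callee(void) { calls++; }

static uint32_t load_call_load(const unsigned char *ptr, void (*callee)(void)) {
    uint32_t first;
    memcpy(&first, ptr, sizeof first); /* load 4 bytes from ptr */
    callee();                          /* call some function */
    unsigned char second = ptr[0];     /* load 1 byte from ptr */
    return first + second;
}
```

If allocations can never shrink, the first load shows that bytes 0 to 3 behind `ptr` remain accessible after the call, as long as the object is not deallocated. If the callee were allowed to remove the tail of the allocation, only the byte actually loaded afterwards could be relied on.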

---
Full diff: https://github.com/llvm/llvm-project/pull/141338.diff


1 File Affected:

- (modified) llvm/docs/LangRef.rst (+10) 


``````````diff

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 343ca743c74f8..adc38154e7161 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3327,6 +3327,14 @@ behavior is undefined:
 -  the size of all allocated objects must be non-negative and not exceed the
    largest signed integer that fits into the index type.
 
+Allocated objects that are created with operations recognized by LLVM (such as
+:ref:`alloca <i_alloca>`, heap allocation functions marked as such, and global
+variables) may *not* change their size. However, allocated objects can also be
+created by means not recognized by LLVM, e.g. by directly calling ``mmap``.
+Those allocated objects are allowed to grow, as long as they always satisfy the
+properties described above. Currently, allocated objects are not permitted to
+ever shrink, nor can they have holes.
+
 .. _objectlifetime:
 
 Object Lifetime
@@ -11870,6 +11878,8 @@ if the ``getelementptr`` has any non-zero indices, the following rules apply:
    :ref:`based <pointeraliasing>` on. This means that it points into that
    allocated object, or to its end. Note that the object does not have to be
    live anymore; being in-bounds of a deallocated object is sufficient.
+   If the allocated object can grow, then the relevant size for being *in
+   bounds* is the maximal size the object will ever have, not its current size.
  * During the successive addition of offsets to the address, the resulting
    pointer must remain *in bounds* of the allocated object at each step.
 

``````````

</details>


https://github.com/llvm/llvm-project/pull/141338