[llvm] LangRef: allocated objects can grow (PR #141338)
via llvm-commits
llvm-commits at lists.llvm.org
Sat May 24 02:08:30 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-llvm-ir
Author: Ralf Jung (RalfJung)
<details>
<summary>Changes</summary>
Based on discussion with @nikic. Also Cc @nunoplopes.
This enables the (reasonably common) pattern of using `mmap` to reserve but not actually map a wide range of pages, and then only adding in more pages as memory is actually needed. Effectively, that region of memory is one big allocated object for LLVM, but crucially, that allocated object *changes its size*.
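A minimal sketch of that reserve-then-grow pattern in C, assuming a POSIX system where an anonymous `PROT_NONE` mapping serves as the reservation and `mprotect` makes more pages accessible. The `region` type and the `region_reserve` / `region_grow` helpers are hypothetical names for illustration, not part of this patch or of LLVM:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: one reserved address range whose accessible
   prefix only ever grows. */
typedef struct {
    unsigned char *base;
    size_t reserved;  /* bytes of address space reserved up front */
    size_t committed; /* bytes currently readable and writable */
} region;

/* Reserve address space without making any of it accessible yet. */
int region_reserve(region *r, size_t reserved) {
    void *p = mmap(NULL, reserved, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->reserved = reserved;
    r->committed = 0;
    return 0;
}

/* Grow the accessible prefix to new_size bytes. new_size must be a multiple
   of the page size; shrinking and growing past the reservation are refused. */
int region_grow(region *r, size_t new_size) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (new_size % (size_t)page != 0)
        return -1;
    if (new_size < r->committed || new_size > r->reserved)
        return -1;
    if (new_size == r->committed)
        return 0;
    if (mprotect(r->base + r->committed, new_size - r->committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->committed = new_size;
    return 0;
}
```

In this sketch the accessible prefix `[base, base + committed)` plays the role of the growing allocated object, and `region_grow` deliberately rejects any request that would shrink it.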
Having an allocated object grow seems entirely compatible with what LLVM optimizations assume, *except* that when LLVM sees an `alloca` or similar instruction, it will assume that a pointer that has been offset via `getelementptr inbounds` by more than the size of the allocated object cannot alias that `alloca`. But for allocated objects that are created e.g. by `mmap`, where LLVM does not know their size, this cannot happen anyway.
The other main point to be concerned about is having a `getelementptr inbounds` that is moved up across an operation that grows an allocated object: this should be legal as `getelementptr` is freely reorderable. We achieve that by saying that for allocated objects that change their size, "inbounds" means "inbounds of their maximal size", not "inbounds of their current size".
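As a rough source-level analogy (a sketch only, assuming POSIX `mmap`/`mprotect`; `early_pointer_demo` is a hypothetical name), the following C function computes an address inside the region *before* the region has grown to cover it, and only dereferences it afterwards. Under the "maximal size" reading described above, the early address computation is in bounds even though the memory is not yet accessible at that point:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the byte read back through a pointer that was computed before the
   region grew, or -1 on failure. */
int early_pointer_demo(void) {
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t page = (size_t)page_size;
    size_t reserved = 8 * page;

    unsigned char *base = mmap(NULL, reserved, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Address computed while none of the region is accessible yet. */
    unsigned char *late = base + 3 * page;

    /* Grow the accessible prefix so that it now covers `late`. */
    if (mprotect(base, 4 * page, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, reserved);
        return -1;
    }

    *late = 99; /* the access itself happens only after the growth */
    int value = *late;
    munmap(base, reserved);
    return value;
}
```

Only the address computation is hoisted above the growth here; the store and load still happen after the pages have become accessible.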
It would be nice to also allow shrinking allocations (e.g. by `munmap`ing pages at the end), but that is trickier. Consider an example like this:
- load 4 bytes from `ptr`
- call some function
- load 1 byte from `ptr`
Right now, LLVM could argue that since `ptr` clearly has not been deallocated, there must be at least 4 bytes of dereferenceable memory behind `ptr` after the call. If allocations can shrink, this kind of reasoning is no longer valid. I don't know if LLVM actually does reasoning like that -- I think it should not, since I think it should be possible to have allocations that shrink -- but to remain conservative I am not proposing that as part of this patch.
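The load/call/load sequence above can be sketched in C as follows. The names `load_call_load` and `noop` are hypothetical, and the sketch does not claim that LLVM performs any particular transformation on such code; it only makes the shape of the example concrete:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for "some function" whose body the optimizer
   might not be able to see. */
void noop(void) {}

uint32_t load_call_load(const unsigned char *ptr, void (*opaque)(void)) {
    assert(ptr != NULL && opaque != NULL);

    uint32_t word;
    memcpy(&word, ptr, 4);       /* load 4 bytes from ptr */

    opaque();                    /* call some function */

    unsigned char byte = ptr[0]; /* load 1 byte from ptr */
    return word + byte;
}
```

If allocated objects can never shrink, all 4 bytes behind `ptr` are still dereferenceable after the call as long as the object has not been deallocated. If `opaque` were allowed to shrink the object to a single byte, only the 1-byte load would remain valid.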
---
Full diff: https://github.com/llvm/llvm-project/pull/141338.diff
1 Files Affected:
- (modified) llvm/docs/LangRef.rst (+10)
``````````diff
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 343ca743c74f8..adc38154e7161 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3327,6 +3327,14 @@ behavior is undefined:
- the size of all allocated objects must be non-negative and not exceed the
largest signed integer that fits into the index type.
+Allocated objects that are created with operations recognized by LLVM (such as
+:ref:`alloca <i_alloca>`, heap allocation functions marked as such, and global
+variables) may *not* change their size. However, allocated objects can also be
+created by means not recognized by LLVM, e.g. by directly calling ``mmap``.
+Those allocated objects are allowed to grow, as long as they always satisfy the
+properties described above. Currently, allocated objects are not permitted to
+ever shrink, nor can they have holes.
+
.. _objectlifetime:
Object Lifetime
@@ -11870,6 +11878,8 @@ if the ``getelementptr`` has any non-zero indices, the following rules apply:
:ref:`based <pointeraliasing>` on. This means that it points into that
allocated object, or to its end. Note that the object does not have to be
live anymore; being in-bounds of a deallocated object is sufficient.
+ If the allocated object can grow, then the relevant size for being *in
+ bounds* is the maximal size the object will ever have, not its current size.
* During the successive addition of offsets to the address, the resulting
pointer must remain *in bounds* of the allocated object at each step.
``````````
</details>
https://github.com/llvm/llvm-project/pull/141338