[llvm] LangRef: allocated objects can grow (PR #141338)
Ralf Jung via llvm-commits
llvm-commits at lists.llvm.org
Wed May 28 04:39:09 PDT 2025
================
@@ -3327,6 +3327,19 @@ behavior is undefined:
- the size of all allocated objects must be non-negative and not exceed the
largest signed integer that fits into the index type.
+Allocated objects that are created with operations recognized by LLVM (such as
+:ref:`alloca <i_alloca>`, heap allocation functions marked as such, and global
+variables) may *not* change their size. (``realloc``-style operations do not
+change the size of an existing allocated object; instead, they create a new
+allocated object. Even if the object is at the same location as the old one, old
+pointers cannot be used to access this new object.) However, allocated objects
+can also be created by means not recognized by LLVM, e.g. by directly calling
+``mmap``. Those allocated objects are allowed to grow to the right (i.e.,
+keeping the same base address, but increasing their size) while maintaining the
+validity of existing pointers, as long as they always satisfy the properties
+described above. Currently, allocated objects are not permitted to grow to the
+left or to shrink, nor can they have holes.
----------------
RalfJung wrote:
> Given the restrictions, the compiler can't tell where the "beginning" of the object is, so I'm not sure forbidding growth to the left has any meaningful effect.
It does have one effect: it simplifies specifying when `getelementptr inbounds` is allowed for this allocated object. In particular, if we combine this with shrinking an object, we could otherwise end up in a situation like:
- let's say an allocation starts at `addr` and ends at `end`, covering 1/3 of the address space
- at moment A, it is okay to `inbounds` offset a pointer by 1/3 of the address space to the right from `addr`
- then we shrink the allocation from the left, keeping only its last page
- then we grow the allocation to the right
- at moment B, it is okay to `inbounds` offset a pointer by 1/3 of the address space to the right from `end`
But pointer offsets are meant to be freely reorderable, so we could move the two offsets next to each other. We can also combine adjacent `inbounds` offsets, preserving `inbounds`. But now we somehow have an `inbounds` offset covering 2/3 of the address space, which should be impossible, since no allocated object may be larger than the largest signed integer that fits into the index type!
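
The arithmetic behind this scenario can be checked with a small toy model. This is only a sketch: it assumes a 64-bit index type, and the names `INDEX_BITS`, `MAX_OBJECT_SIZE`, and `fits_in_one_object` are made up for illustration and are not LLVM APIs.

```python
# Toy model of the scenario above, assuming a 64-bit index type (assumption).
INDEX_BITS = 64
ADDRESS_SPACE = 2 ** INDEX_BITS

# LangRef rule quoted in the diff: the size of an allocated object must not
# exceed the largest signed integer that fits into the index type.
MAX_OBJECT_SIZE = 2 ** (INDEX_BITS - 1) - 1


def fits_in_one_object(offset: int) -> bool:
    """Hypothetical helper: could an offset of this many bytes stay inside
    a single allocated object of legal size?"""
    return 0 <= offset <= MAX_OBJECT_SIZE


third = ADDRESS_SPACE // 3

offset_a = third                # moment A: from `addr` to `end`
offset_b = third                # moment B: from `end` to the new end
combined = offset_a + offset_b  # the two offsets merged into one

print(fits_in_one_object(offset_a))  # True
print(fits_in_one_object(offset_b))  # True
print(fits_in_one_object(combined))  # False
```

Each offset is fine on its own, but the merged offset is larger than any legal allocated object, which is the contradiction described above.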
> I'm not sure what a "hole" is, in this context. I don't think we require that all bytes of an object have to be dereferenceable. It might make sense to forbid overlapping live objects, though.
By "hole" I mean e.g. `munmap`ing a page from the middle of an otherwise contiguous range of allocated pages, while still considering the entire thing to be a single allocation. That seems like it would open a can of worms, e.g. optimizations could no longer assume things like:
- load from `%ptr`
- let `%ptr2` be `%ptr` offset by 8k bytes (without `inbounds`)
- load from `%ptr2`
- Now we know the range between the two pointers is dereferenceable
Also, what exactly does `getelementptr inbounds` mean in the presence of holes?
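
To make the load example above concrete, here is a rough sketch that models an allocation as a set of mapped pages. It assumes a 4096-byte page size, and `PAGE`, `load_ok`, and `range_dereferenceable` are hypothetical names used only for this illustration.

```python
# Toy model: an allocation is represented by the set of its mapped page numbers.
PAGE = 4096  # assumed page size


def load_ok(mapped_pages: set, addr: int) -> bool:
    """A one-byte load succeeds if the page containing `addr` is mapped."""
    return addr // PAGE in mapped_pages


def range_dereferenceable(mapped_pages: set, start: int, end: int) -> bool:
    """True if every page touched by the half-open range [start, end) is mapped."""
    first, last = start // PAGE, (end - 1) // PAGE
    return all(p in mapped_pages for p in range(first, last + 1))


ptr = 0
ptr2 = ptr + 8192  # `%ptr` offset by 8k bytes

contiguous = {0, 1, 2}  # three adjacent pages, no hole
with_hole = {0, 2}      # the middle page has been unmapped

for pages in (contiguous, with_hole):
    both_loads = load_ok(pages, ptr) and load_ok(pages, ptr2)
    whole_range = range_dereferenceable(pages, ptr, ptr2 + 1)
    print(both_loads, whole_range)
```

In the contiguous case both loads succeed and the whole range is dereferenceable. With the hole, both loads still succeed but the range in between is not dereferenceable, so the inference in the list above would no longer hold.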
https://github.com/llvm/llvm-project/pull/141338