[llvm-dev] [RFC] Adding support for marking allocator functions in LLVM IR

Augie Fackler via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 5 14:32:31 PST 2022

Hi everyone! I’m working on making the Rust compiler able to track
LLVM HEAD more closely, and as part of that we need to obviate a patch[0]
that teaches LLVM about some Rust allocator implementation details. This
proposal is the product of many conversations and a couple of failed
attempts at simpler implementations.



Rust uses LLVM for codegen, and has its own allocator functions. In order
for LLVM to correctly optimize out allocations we have to tell the
optimizer about the allocation/deallocation functions used by Rust.

Languages supported by Clang, such as C and C++, have stable symbol names
for their allocation functions, which are hardcoded in LLVM[1][2].
Unfortunately, this strategy does not work for Rust, where developers don't
want to commit to a particular symbol name and calling convention yet.
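
To make the limitation concrete, here is a minimal sketch of name-based recognition. The table and the `isKnownAllocFn` helper are hypothetical and are not LLVM's actual data structures; only the symbol names themselves are real. A symbol such as `__rust_alloc` is invisible to this kind of lookup unless the table is patched.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Hypothetical, simplified stand-in for a hardcoded table of allocator
// symbols. The names are real C/C++ symbols, but this is not LLVM's table.
constexpr std::array<std::string_view, 4> kKnownAllocFns = {
    "malloc",
    "calloc",
    "_Znwm",  // operator new(unsigned long), Itanium mangling
    "_Znam",  // operator new[](unsigned long), Itanium mangling
};

// Returns true only if `symbol` is one of the names baked into the table.
constexpr bool isKnownAllocFn(std::string_view symbol) {
  for (std::string_view known : kKnownAllocFns)
    if (known == symbol)
      return true;
  return false;
}

// Small self-check showing the gap for a language with its own allocator.
inline void checkNameLookup() {
  assert(isKnownAllocFn("malloc"));
  assert(!isKnownAllocFn("__rust_alloc"));  // unknown unless the table is patched
}
```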



We add two attributes to LLVM IR:

 * `allocator(FAMILY)`: Marks a function as part of an allocator family,
named by the “primary” allocation function (e.g. `allocator("malloc")`,
`allocator("_Znwm")`, or `allocator("__rust_alloc")`).

 * `releaseptr(idx)`: Indicates that the function releases the pointer that
is its `idx`-th argument.

These attributes, combined with the existing `allocsize(n[, m])` attribute,
let us annotate alloc-, realloc-, and free-type functions in LLVM IR, which
relieves Rust of the need to carry a patch to describe its allocator
functions to LLVM’s optimizer. Some example IR of what this might look like:

; Function Attrs: nounwind ssp
define i8* @test5(i32 %n) #4 {
  %0 = tail call noalias dereferenceable_or_null(20) i8* @malloc(i32 20) #8
  %1 = load i8*, i8** @s, align 8
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* noundef nonnull align 1 dereferenceable(10) %0, i8* noundef nonnull align 1 dereferenceable(10) %1, i32 10, i1 false) #0
  ret i8* %0
}

attributes #8 = { nounwind allocsize(0) "allocator"="malloc" }

Similarly, the call `free(foo)` would get the attributes
`"allocator"="malloc" releaseptr(1)` and `realloc(foo, N)` gets
`"allocator"="malloc" releaseptr(1) allocsize(1)`. Note that the
`releaseptr(n)` attribute is 1-indexed in my current draft to avoid issues
with storing zero values in attributes. I’m very open to suggestions to
change that; this just seemed like a better solution than adding
getters/setters everywhere to increment/decrement a value.
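
As a rough illustration of how these three annotations could be read together, the sketch below models them in plain C++. Everything here (`AllocAttrs`, `AllocKind`, `classify`, `releasedArgIndex`) is hypothetical and is not an LLVM API; it assumes that a `releaseptr` value of 0 means "attribute absent", which is one possible reading of the 1-indexed encoding, and that `allocsize` stays 0-indexed as it is in LLVM today.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of the proposed call-site attributes; not an LLVM API.
struct AllocAttrs {
  std::string family;                    // "allocator"="<family>"
  unsigned releasePtr;                   // releaseptr(n), 1-indexed; 0 = absent (assumption)
  std::optional<unsigned> allocSizeArg;  // allocsize(n), 0-indexed
};

enum class AllocKind { Other, Alloc, Free, Realloc };

// Classifies a call purely from its attributes, without looking at its name.
AllocKind classify(const AllocAttrs &a) {
  if (a.family.empty())
    return AllocKind::Other;
  const bool releases = a.releasePtr != 0;
  const bool allocates = a.allocSizeArg.has_value();
  if (releases && allocates)
    return AllocKind::Realloc;
  if (releases)
    return AllocKind::Free;
  if (allocates)
    return AllocKind::Alloc;
  return AllocKind::Other;
}

// Converts the 1-indexed releaseptr value into a 0-indexed argument position.
std::optional<unsigned> releasedArgIndex(const AllocAttrs &a) {
  if (a.releasePtr == 0)
    return std::nullopt;
  return a.releasePtr - 1;
}

// The three calls from the text: malloc(20), free(foo), realloc(foo, N).
const AllocAttrs kMallocCall{"malloc", 0u, 0u};
const AllocAttrs kFreeCall{"malloc", 1u, std::nullopt};
const AllocAttrs kReallocCall{"malloc", 1u, 1u};
```

With this model, `classify(kReallocCall)` yields `AllocKind::Realloc` and `releasedArgIndex(kFreeCall)` yields argument position 0, which is the increment/decrement bookkeeping that the 1-indexed encoding pushes onto consumers.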



In addition to the benefits for Rust, the LLVM optimizer could also be
improved to not optimize away defects like


  auto *foo = new Thing();



which would then correctly crash instead of silently “working” until
something actually uses the allocation. Similarly, there’s a potential
defect when only one side of an overridden `operator new` and
`operator delete` is visible to the optimizer and inlineable, which can look
indistinguishable from the above after inlining.
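
One way the family name could feed into such a check is sketched below. This assumes the defect in question is a pointer released through a function from a different family than the one that allocated it; that reading, and the `mayElidePair` helper, are assumptions for illustration and not part of the proposal or of LLVM.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical check: an allocation and the call that releases it are only
// candidates for removal as a pair when both carry the same, non-empty
// allocator family. A mismatched pair is left in place so the bug stays
// observable at run time.
constexpr bool mayElidePair(std::string_view allocFamily,
                            std::string_view freeFamily) {
  return !allocFamily.empty() && allocFamily == freeFamily;
}

// Small self-check with one matched and one mismatched pair.
inline void checkFamilyMatching() {
  assert(mayElidePair("malloc", "malloc"));
  assert(!mayElidePair("_Znwm", "malloc"));  // allocated by new, released by free
}
```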

This also probably opens the door to fixing issues like
https://bugs.llvm.org/show_bug.cgi?id=49022 caused by overloading the
`builtin` annotation on allocator functions, but I’m unlikely to continue
in that direction.

What do people think?





