[llvm-dev] Realloca ?

Mon Sep 21 02:57:14 PDT 2020

On 19/09/2020 17:26, james faure via llvm-dev wrote:
 > Ideally, I want a guarantee that successive alloca's will be laid out 
consecutive in memory.
 >
 > The context is that I need to buffer streams in a stack-machine; I 
don't know the buffer size in advance and may need to grow it. It's of 
little consequence to me that the buffer needs to be written backwards 
to match the stack's growing direction.
 >
 > I noticed that conceptually, llvm.stacksave + gep + llvm.stackrestore 
is a possible solution, but that is probably undefined behavior.?

Do you actually need a contiguous buffer?  You can place alloca 
instructions in a loop to allocate more stack space and chain together 
allocations together in a linked list - they won't be freed until either 
an llvm.stackrestore call or until the function returns.  Note that, in 
general, allocating things of unbounded size on the stack is not a good 
idea because the stack is of finite size and much smaller than the heap. 
  You might be better off implementing a bump allocator.

An IR-level feature for growing a single stack allocation would be 
tricky to implement.  You'd need to guarantee a single growable alloca 
per function (it would have to be the last allocated object) and that, 
in turn, would prevent inlining of a function that contained a growable 
allocation into another that provided one.

You'd also be placing a restriction on the back end that this alloca 
must be the last thing allocated on the stack.  Depending on the 
addressing modes used for the stack, this may be a problem for spills, 
so it would need to be carefully enabled for every architecture.

You already point out the problem with stack growth direction: it's not 
a very generally useful feature because it depends on consumers that 
want the end pointer to be stable and to grow a buffer at the beginning. 
  Growing a buffer at the beginning would require updating all GEPs 
derived from PHI nodes that could point to the old or new beginning, so 
you'd need to ensure a canonical form where the instruction following 
the alloca / realloca was a GEP to the end and every subsequent GEP came 
from the latter, not the former.

LLVM's alias analysis knowns that two allocas can't alias.  If you're 
not suing inbounds GEPs, you may be able to get away with the stacksave 
GEP stackrestore trick, but it's undefined behaviour to refer to an 
alloca after it has been deallocated and so you're relying on the 
optimiser not figuring out that this is what you're doing.  You're also 
relying on the back end lowering in that order (there are no guarantees 
made on stack frame layout by the back end and vulnerability mitigation 
techniques may break your expectations in fun ways).

David