[LLVMdev] Stack implementation

Mon Jul 25 09:58:47 PDT 2011

Piotr Kaleta <piotrek.kaleta at gmail.com> writes:

> I'm translating the source of stack-based virtual machine into LLVM IR and
> my plan is to implement  the stack in LLVM IR (using alloca/load/store) in
> order to emulate the VM's stack and then use the optimization phase
> "mem2reg". Therefore I'm going to have a stack pointer that points to the
> top of my stack. I'm curious whether I will have to implement that in a
> linked-list fashion, i.e. each stack node would contain the pointer to the
> previous node as well as the current value of the top of the stack or maybe
> alloca instruction works in a way that the subsequent calls to it allocates
> the contiguous memory area and I could get rid of the "pointer to previous
> thing" and only count the number of calls to alloca instruction.
>
> Another thing is that there might be a better (less expensive or easier) way
> of doing the whole stack -> SSA transformation which I didn't came up with,
> but someone else did?

My compiler also emits instructions for a virtual stack-based
machine. The LLVM backend is implemented as a translator from those
instructions to LLVM IR.

In principle, creating an alloca for every value that is pushed into the
stack would work: when the VM should push a value to the stack, you
create the alloca and store the value into it; then you load the alloca
when the virtual machine should pop the value from the stack. You don't
need linked lists, just a stack (!) data structure that emulates the
state of the VM, where you push and pop alloca's as the VM does with the
real values. In essence, your translator would work like an interpreter
over the VM instructions that, instead of working with real data, works
with the alloca's where the real values will be stored when the LLVM
code is executed. Apart from that, you don't need to care about the
alloca's anymore.

The above method should work but has the inconvenience of generating
lots of alloca/store/load instructions. That's is fine if you don't care
too much about compilation time or readability of the unoptimized LLVM
IR code. My translator uses general LLVM values instead of just allocas,
so code like this:

a := (b + c) * d

that would translate to this with the alloca method:

(b, c and d are values already on the stack)

# pop the alloca for b
# pop the alloca for c
# generate LLVM load (for b)
# generate LLVM load (for c)
# generate LLVM instruction for b + c
# create LLVM alloca for result (let's call it T)
# push the alloca
# generate the LLVM store for T into the alloca
# pop the alloca for T
# pop the alloca for d
# generate LLVM load (for T)
# generate LLVM load (for d)
# generate LLVM instruction for T * d
# create LLVM alloca for result (let's call it U)
# push the alloca
# generate the LLVM store for U into the alloca
# pop
# generate LLVM load for U
# generate LLVM store for U into memory reserved for variable `a'

And with the general LLVM instruction method:

# pop (for b)
# pop (for c)
# generate LLVM instruction for (b + c)
# push the resulting LLVM instruction
# pop (for the previous result)
# pop (for d)
# generate LLVM instruction for T * d
# generate LLVM instruction for storing the previous result into the
  memory reserver for variable `a'

If any `b', `c', `d' are variables, you need to detect that and generate
the corresponding `load'. Note that `push' and `pop' are operations on
the stack of your translator, not LLVM instructions.

The key here is to realize that any LLVM instruction that yields a value
(imul, add, etc) *is* a LLVM Value object, the same as the result of a
`load' instruction.

HTH.