[LLVMdev] What is on the LLVM horizon for truly relocatable JITted code?

Mon Feb 16 13:47:49 PST 2015

On 16 Feb 2015, at 19:47, Christian Schafmeister <chris.schaf at verizon.net> wrote:
> 
> I’ve written a Common Lisp compiler (currently called Clasp:  https://github.com/drmeister/clasp) in C++ that uses LLVM as the backend and interoperates with C++.  It uses copying garbage collection via the Memory Pool System (MPS) garbage collector by Ravenbrook.  This garbage collector is precise on the heap and conservative on the stack.
> 
> Currently I JIT code to wherever LLVM drops the code and it remains fixed in memory.  This causes some problems for implementing a dynamic language like Common Lisp because CL considers data and code to be equivalent.
> 
> I’d like to move the code into the MPS managed memory and be able to apply copying garbage collection to it.  Is this possible?   Will it ever be possible?

I'm not sure what this actually buys you.  There are a few reasons why you don't want to treat compiled code and data in the same way:

- You want code to be executable but not writeable

- Code doesn't typically follow the infant mortality hypothesis.  Short-lived code (things passed to eval and so on) can be special-cased: put it in a short-lived allocation, cache the source, and recompile it if it persists.  That is more efficient than trying to move code around.

- Code can't be scanned for roots in the same way as data, because immediate pointer values may be materialised across multiple instructions (this is not important if your code only ever refers to globals via arguments).

- The set of reachable objects from a piece of compiled code never changes over its entire lifetime.

- You should never have to deal with interior pointers to code, except return addresses on the stack.
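
To make the first point concrete, here is a minimal sketch of the usual write-then-protect sequence, assuming a POSIX system with mmap and mprotect.  The helper name publish_code is hypothetical and not part of LLVM; it only shows that the mapping is writeable while the bytes are copied in and is switched to read+execute before anything could run it.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not an LLVM API): copy finished machine code into a
// fresh anonymous mapping, then drop write permission so the pages end up
// readable and executable but never writeable and executable at once.
// Returns nullptr on failure; on success *mapped_size holds the mapping size.
void *publish_code(const std::vector<unsigned char> &bytes,
                   std::size_t *mapped_size) {
    if (bytes.empty()) return nullptr;

    // Round the request up to a whole number of pages.
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t size = ((bytes.size() + page - 1) / page) * page;

    // Step 1: map the pages read+write so the code can be copied in.
    void *mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    std::memcpy(mem, bytes.data(), bytes.size());

    // Step 2: flip the pages to read+execute before handing them out.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return nullptr;
    }
    *mapped_size = size;
    return mem;
}
```

A moving collector would have to undo and redo the second step every time it copied code, which is one reason to keep code out of the ordinary data heap.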

For these reasons, most modern systems that do run-time code generation have one or more special regions for code, often with copies of the pointers stored alongside.  There's nothing stopping you from relocating a closure object that refers to a function with your data GC, and using the deallocation event to tell the RTDyldMemoryManager that it may recycle the memory.  As the existing infrastructure permits you to relocate code to its final position after initial code generation, you can pick a suitably sized block from your free list.
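
As a rough sketch of the free-list bookkeeping described above (not how any LLVM memory manager is actually implemented), the class below tracks free byte ranges inside a single code region.  The name CodeRegionFreeList and its methods are hypothetical; the wiring to RTDyldMemoryManager and to the GC's deallocation event is assumed and not shown.  It uses first-fit allocation and merges adjacent free ranges on release.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical bookkeeping for one code region.  It never touches memory;
// offsets are relative to the base address of the region.
class CodeRegionFreeList {
public:
    explicit CodeRegionFreeList(std::size_t region_size) {
        if (region_size > 0) free_[0] = region_size;
    }

    // First-fit: find a free range of at least `size` bytes and store its
    // offset in *offset_out.  Returns false if nothing fits.
    bool allocate(std::size_t size, std::size_t *offset_out) {
        if (size == 0) return false;
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second < size) continue;
            const std::size_t offset = it->first;
            const std::size_t remaining = it->second - size;
            free_.erase(it);
            if (remaining > 0) free_[offset + size] = remaining;
            *offset_out = offset;
            return true;
        }
        return false;
    }

    // Intended to be called when the data GC reports that the closure
    // owning this code has died.  Merges with neighbouring free ranges.
    void release(std::size_t offset, std::size_t size) {
        if (size == 0) return;
        auto next = free_.lower_bound(offset);
        if (next != free_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == offset) {
                offset = prev->first;
                size += prev->second;
                free_.erase(prev);  // map erase leaves `next` valid
            }
        }
        if (next != free_.end() && offset + size == next->first) {
            size += next->second;
            free_.erase(next);
        }
        free_[offset] = size;
    }

    // Total number of free bytes in the region.
    std::size_t free_bytes() const {
        std::size_t total = 0;
        for (const auto &range : free_) total += range.second;
        return total;
    }

private:
    std::map<std::size_t, std::size_t> free_;  // offset -> length
};
```

Because the code is emitted elsewhere first and only then relocated to its final address, its size is known before allocate is called, so the request can be exact.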

David