[LLVMdev] a life-cycle question for MCJIT

Thu Jun 4 16:15:00 PDT 2015

Context:

We use MCJIT to generate machine code in our LLVM-based JIT compiler.
The code generation process has roughly 5 steps:

 0. Generate and optimize LLVM IR.
 1. Call generateCodeForModule on the output of (0) to translate LLVM
    IR to machine code.
 2. Figure out the final locations for the code and data generated by
    MCJIT using an allocator specific to our runtime.  Make
    mapSectionAddress calls to convey this information to MCJIT.
 3. Call finalizeObject() to apply relocations.
 4. Copy over the relocated code to buffers allocated by our custom
    allocator.

The problem:

After running step (1) we may, in rare cases, decide that the
generated code is not usable by our runtime [*], and we have to
"abort" the compile.  However, step (1) populates
Dyld->ExternalSymbolRelocations (and possibly other similar data
structures) with the set of pending relocations, and the
ExecutionEngine interface provides no way of cleaning this up without
running (3).  Since we use a single long-lived instance of MCJIT per
compiler thread, when we abort the compile after running (1) and
before running (3), these relocations stay pending, get applied
during future compiles, and cause problems.
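
To make the failure mode concrete, here is a minimal toy model of it.
This is not LLVM code: ToyJIT, ToyRelocation and every method name
below are made up for illustration, and the only property modelled is
that step (1) queues relocations which nothing but step (3) clears.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one pending relocation (not an LLVM type).
struct ToyRelocation {
    std::string module;  // module whose code needs the fixup
    std::string symbol;  // external symbol the fixup refers to
};

// Hypothetical stand-in for a long-lived JIT instance (not MCJIT itself).
class ToyJIT {
public:
    // Analogue of step (1): emitting code queues relocations.
    void generateCode(const std::string &module,
                      const std::vector<std::string> &externals) {
        for (const auto &sym : externals)
            pending_.push_back({module, sym});
    }

    // Analogue of step (3): hand back everything queued and clear the queue.
    std::vector<ToyRelocation> finalize() {
        std::vector<ToyRelocation> applied;
        applied.swap(pending_);
        return applied;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::vector<ToyRelocation> pending_;
};

// Compile "A" is aborted after step (1); compile "B" then runs to
// completion on the same instance. Returns how many of A's relocations
// were applied as part of B's finalize.
std::size_t staleRelocationsAfterAbort() {
    ToyJIT jit;
    jit.generateCode("A", {"foo", "bar"});  // step (1), then abort
    assert(jit.pendingCount() == 2);        // nothing cleaned up A's entries

    jit.generateCode("B", {"baz"});         // next compile, same instance
    std::size_t stale = 0;
    for (const auto &r : jit.finalize())
        if (r.module == "A")
            ++stale;
    return stale;
}
```

In this model, both of A's relocations are still queued when B is
finalized, which is the analogue of the problem described above.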

To get around this issue, is it reasonable to add a hook to the
ExecutionEngine interface that resets the state of an MCJIT instance
(and the RuntimeDyld it contains) from what it is after (1) back to
what it was before (1)?
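
As a rough sketch of how such a hook could be used from our side, here
is a toy model in which an aborted compile rolls back automatically.
Nothing here is an existing LLVM API: ToyLinker, resetPending and
CompileGuard are hypothetical names, and the sketch assumes only one
compile is in flight per instance at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the relocation-tracking part of the JIT.
class ToyLinker {
public:
    // Analogue of step (1) queuing a relocation.
    void addPending(const std::string &symbol) { pending_.push_back(symbol); }

    // The proposed hook (hypothetical): drop queued relocations
    // without applying them.
    void resetPending() { pending_.clear(); }

    // Analogue of step (3): apply and clear; returns how many were applied.
    std::size_t finalize() {
        std::size_t applied = pending_.size();
        pending_.clear();
        return applied;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::vector<std::string> pending_;
};

// Scope guard: unless commit() is called, leaving the scope invokes the
// reset hook, so an aborted compile leaves nothing behind.
class CompileGuard {
public:
    explicit CompileGuard(ToyLinker &linker) : linker_(linker) {}
    ~CompileGuard() {
        if (!committed_)
            linker_.resetPending();
    }
    CompileGuard(const CompileGuard &) = delete;
    CompileGuard &operator=(const CompileGuard &) = delete;

    void commit() { committed_ = true; }

private:
    ToyLinker &linker_;
    bool committed_ = false;
};

// First compile aborts after step (1); second compile completes.
// Returns the number of relocations applied by the second compile.
std::size_t relocationsAppliedAfterCleanAbort() {
    ToyLinker linker;
    {
        CompileGuard guard(linker);
        linker.addPending("foo");
        linker.addPending("bar");
        // Abort: guard is never committed, so its destructor resets.
    }
    assert(linker.pendingCount() == 0);

    CompileGuard guard(linker);
    linker.addPending("baz");
    std::size_t applied = linker.finalize();
    guard.commit();
    return applied;
}
```

With the reset in place, the second compile in this model applies only
its own relocation.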

A second potential solution is to "pretend" to run through steps (2)
and (3) to have MCJIT and RuntimeDyld clear their internal state; but
I'd prefer not to go this route if it can be avoided.

[*]: Why this happens is not important to this discussion; it is
sufficient to note that (a) we cannot reliably predict this before
running step (1) and (b) there are no simple tweaks that will prevent
this from happening.

-- Sanjoy