[LLVMdev] MCJIT versus getLazyBitcodeModule?

Tue Jan 21 09:51:44 PST 2014

Hi Larry,

I'm pretty sure MCJIT won't do what you need without some changes to the way you're doing things.

When MCJIT compiles a Module, it compiles the entire Module and tries to resolve any and all undefined symbols.  I'm not familiar with getLazyBitcodeModule, but at a glance (and cross-referencing your comments below) it seems that it tries to add GlobalValues to a Module as they are needed.  MCJIT doesn't let you modify Modules once it has compiled them, so that's not going to work.  Even if we built some scheme into MCJIT to materialize things before it compiled a Module, it would end up materializing everything, so that wouldn't help you.
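To make the whole-Module behavior concrete, here is a toy model in plain C++. It is only a sketch of the idea described above and does not use any LLVM API: ToyModule, ToyFunction, unresolvedAfterCompile, materializeAll and makeExample are hypothetical names. It assumes that a function whose body has not been materialized yet looks like an undefined symbol at compile time.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a lazily loaded module: every function is declared,
// but a body is present only once it has been "materialized".
struct ToyFunction {
    bool materialized = false;          // has the body been read in yet?
    std::vector<std::string> callees;   // names this function calls
};

struct ToyModule {
    std::map<std::string, ToyFunction> functions;
};

// Whole-module compile: every callee of an emitted body must itself have a
// body right now, because nothing can be added to the module afterwards.
std::set<std::string> unresolvedAfterCompile(const ToyModule &m) {
    std::set<std::string> missing;
    for (const auto &entry : m.functions) {
        if (!entry.second.materialized) continue;  // no body, nothing emitted
        for (const auto &callee : entry.second.callees) {
            auto it = m.functions.find(callee);
            if (it == m.functions.end() || !it->second.materialized)
                missing.insert(callee);
        }
    }
    return missing;
}

// The blunt fix: force every body in before compiling (loses the laziness).
void materializeAll(ToyModule &m) {
    for (auto &entry : m.functions) entry.second.materialized = true;
}

// Hypothetical example: dynamic code "jit_entry" calls precompiled "foo".
ToyModule makeExample() {
    ToyModule m;
    m.functions["jit_entry"].materialized = true;
    m.functions["jit_entry"].callees.push_back("foo");
    m.functions["foo"];  // declared only, body not yet materialized
    m.functions["bar"];  // declared only, never called
    return m;
}

void example() {
    ToyModule m = makeExample();
    assert(unresolvedAfterCompile(m).count("foo") == 1);  // lazy: unresolved
    materializeAll(m);
    assert(unresolvedAfterCompile(m).empty());            // eager: resolved
}
```

In this model the only way to make the compile succeed is to pull in every body, which is exactly the cost the lazy loader was meant to avoid.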

You have a few options.

1. You can continue to load the pre-existing bitcode with getLazyBitcodeModule, and then emit your dynamic code into a separate Module which gets linked against the "lazy" Module before it is handed off to MCJIT.

2. You can use MCJIT's object caching mechanism to load a fully pre-compiled version of your bitcode.  Again you'd need to have your dynamic code in a separate Module, but in this case MCJIT would take care of the linking.  If you know the target architecture ahead of time you can install the cached object with your application.  If not, you'd need to take the large compilation hit once.  After that it should be fairly fast.  The downside is that you'd potentially have a lot more code loaded into memory than you needed.

3. You can break the pre-compiled code into smaller chunks and compile them into an archive file.  MCJIT recently added the ability to link against archive files.  This would give you control over the granularity at which pieces of your pre-compiled code get loaded while also giving you the speed of the cached object file solution.  The trade-off is that for this solution you do need to know the target architecture ahead of time.
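As a rough illustration of the granularity idea in option 3, the sketch below models an archive as a list of chunks and works out which chunks would have to be loaded for a given set of requested symbols. It is plain C++ and makes no claim about how MCJIT's archive support is implemented; ArchiveMember, membersToLoad and exampleArchive are hypothetical names, and the selection rule (load a member when it defines a wanted symbol, then repeat until nothing changes) is an assumption borrowed from how static linkers usually treat archives.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One pre-compiled chunk: the symbols it defines and the ones it still needs.
struct ArchiveMember {
    std::set<std::string> defines;
    std::set<std::string> needs;
};

// Return the indices of the members needed to satisfy `wanted`, following
// references between members until no new member is pulled in.
std::set<std::size_t> membersToLoad(const std::vector<ArchiveMember> &archive,
                                    std::set<std::string> wanted) {
    std::set<std::size_t> loaded;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < archive.size(); ++i) {
            if (loaded.count(i)) continue;
            bool hit = false;
            for (const auto &sym : archive[i].defines)
                if (wanted.count(sym)) { hit = true; break; }
            if (!hit) continue;
            loaded.insert(i);
            // Anything this member needs becomes wanted as well.
            wanted.insert(archive[i].needs.begin(), archive[i].needs.end());
            changed = true;
        }
    }
    return loaded;
}

// Hypothetical archive: member 0 defines foo (needs helper),
// member 1 defines helper, member 2 defines bar and baz.
std::vector<ArchiveMember> exampleArchive() {
    ArchiveMember a, b, c;
    a.defines.insert("foo");
    a.needs.insert("helper");
    b.defines.insert("helper");
    c.defines.insert("bar");
    c.defines.insert("baz");
    std::vector<ArchiveMember> archive;
    archive.push_back(a);
    archive.push_back(b);
    archive.push_back(c);
    return archive;
}

void example() {
    std::set<std::string> roots;
    roots.insert("foo");
    std::set<std::size_t> loaded = membersToLoad(exampleArchive(), roots);
    assert(loaded.size() == 2);     // members 0 and 1 only
    assert(loaded.count(2) == 0);   // the bar/baz chunk stays unloaded
}
```

The smaller the chunks, the less unused code gets loaded, at the price of more members to manage when the archive is built.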

Hope this helps.

-Andy

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Larry Gritz
Sent: Monday, January 20, 2014 11:29 AM
To: LLVM Developers Mailing List
Subject: [LLVMdev] MCJIT versus getLazyBitcodeModule?

I'm having a problem with MCJIT (in LLVM 3.3 and 3.4): it's not resolving mangled symbols in precompiled bitcode the same way the old JIT does. It's possible that it's just my misunderstanding. Maybe somebody can spot my problem, or identify it as an MCJIT bug.

Here's my situation, in a nutshell:

* I am assembling IR and JITing in my app. The IR may potentially make calls to a large body of code that I precompile to bitcode using "clang++ -S --emit-llvm", then create a .cpp file containing the bitcode, which is compiled into my app.

* Before JITing the dynamic code, my app initializes the Module like this:

    llvm::MemoryBuffer* buf =
        llvm::MemoryBuffer::getMemBuffer (llvm::StringRef(bitcode, bitcode_size), name);
    llvm::Module *m = llvm::getLazyBitcodeModule (buf, context(), err);

  where bitcode is a big char array holding the precompiled bitcode. The idea is to 
  "seed" the module with that precompiled bitcode so that any calls I inserted into the IR
  will work properly.

* When I JIT, I just refer to functions in the bitcode like "foo", if that's what I called it in the original .cpp file that was turned into bitcode.

* Traditionally, I have created a JIT execution engine like this:

    m_llvm_exec = llvm::ExecutionEngine::createJIT (module(), err,
                                    jitmm(), llvm::CodeGenOpt::Default,
                                    /*AllocateGVsWithCode*/ false);

All has worked fine; this system has seen heavy production use for a couple of years now.

Now I'm trying to make this codebase work with MCJIT, and I've run into some trouble.  Here's how I'm setting up the ExecutionEngine for the MCJIT case:

    m_llvm_exec = llvm::EngineBuilder(module())
                            .setEngineKind(llvm::EngineKind::JIT)
                            .setErrorStr(err)
                            .setJITMemoryManager(jitmm())
                            .setOptLevel(llvm::CodeGenOpt::Default)
                            .setUseMCJIT(USE_MCJIT)
                            .create();

USE_MCJIT is 1 when I'm building the code to use MCJIT. I'm initializing the buffer and seeding it with the precompiled bitcode in the same way as always, as outlined above.

The basic problem is that it's not finding the symbols in that bitcode.  I get an error message back like this:

	Program used external function '_foo' which could not be resolved!

So it seems that it's an issue of whether or not the underscore prefix is included when looking up the function from the module, and old JIT and MCJIT disagree.
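A minimal sketch of the kind of naming mismatch suspected here, in plain C++ with no LLVM API. It assumes one side looks names up as written in the IR ("foo") while the other asks for the linker-level name with a global prefix ("_foo", as used on Mach-O targets). SymbolTable, addGlobalPrefix and resolve are hypothetical names, and the sketch only illustrates the prefix difference; it does not establish that the prefix is the root cause of the error above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy symbol table keyed by the names used in the IR (no platform prefix).
typedef std::map<std::string, void *> SymbolTable;

// Prepend the platform's global-symbol prefix, e.g. '_' on Mach-O.
// Pass '\0' for targets that use no prefix.
std::string addGlobalPrefix(const std::string &irName, char prefix) {
    return prefix ? std::string(1, prefix) + irName : irName;
}

// Look the name up as given; if that fails and the name carries the
// prefix, retry with the prefix stripped.
void *resolve(const SymbolTable &table, const std::string &name, char prefix) {
    SymbolTable::const_iterator it = table.find(name);
    if (it == table.end() && prefix && !name.empty() && name[0] == prefix)
        it = table.find(name.substr(1));
    return it == table.end() ? nullptr : it->second;
}

void example() {
    static int fooImpl = 0;              // stands in for the code of "foo"
    SymbolTable table;
    table["foo"] = &fooImpl;

    std::string asked = addGlobalPrefix("foo", '_');     // "_foo"
    assert(resolve(table, asked, '\0') == nullptr);      // prefix-unaware: miss
    assert(resolve(table, asked, '_') == &fooImpl);      // prefix-aware: hit
}
```

A prefix-unaware lookup of "_foo" misses even though "foo" is present, which matches the shape of the error message.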

Furthermore, if I change the creation of the module from using llvm::getLazyBitcodeModule to this:

    llvm::Module *m = llvm::ParseBitcodeFile (buf, context(), err);

it works just fine.  But of course, I'd really like to deserialize this bitcode file lazily.  It contains a ton of functions potentially called by my IR, but any given bit of code that I'm JITing only uses a tiny subset.  Using the lazy option greatly reduces the JIT overhead (10-20x!), so it's considered fairly critical for our app.

So, in short:

   old JIT + ParseBitcodeFile = works
   old JIT + getLazyBitcodeModule = works
   MCJIT + ParseBitcodeFile = works
   MCJIT + getLazyBitcodeModule = BROKEN

Does anybody have advice? Thanks in advance for any help.

--
Larry Gritz
lg at larrygritz.com

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev