[LLVMdev] MCJIT versus getLazyBitcodeModule?

Gaël Thomas gael.thomas at lip6.fr
Tue Jan 21 13:17:10 PST 2014


Hi Larry,

Inlining from remote modules with MCJIT is not so easy, but it is
possible (at least it works for me). I have been working on this problem
for two days (from an end-user perspective, I'm not an LLVM developer :)).
Since it may help you (and other people), I'll explain what I have done
(this mail may be too long for the mailing list, sorry!).

So, basically, inlining from other modules (runtime module included)
is possible in MCJIT. The solution is maybe a little bit ugly... Just
to explain what I do and my problems, I'm involved in the development
of vmkit (a library to build virtual machines). I have to inline
runtime functions defined in c++ to achieve good performance (for
example the type checker for j3, the Java virtual machine developed
with vmkit). I think that your problem is not so far from mine (I also
reload my own bitcode when I start vmkit).

So, here is the picture (I can also send you my LLVM pass or other
relevant code if you need them). It may help as a starting point. I
wrote the inlining pass today, so it may still be buggy :).

Basically, I have two kinds of modules: a module that contains the
runtime functions (defined in C++) and the other modules, which contain
the functions that I have to JIT compile. To simplify, let's say that I
have only one module to JIT. In the jit-module, I want to call functions
defined in the runtime-module. I thus have three problems to solve:
* The verifier does not like when you call a function defined in the
runtime module directly from the jit module (it prevents external
references to other modules). So, I have to avoid this as much as
possible.
* The jited module has to find the llvm code of the runtime functions
for inlining
* When a function is not inlined, you have to provide the address of
the function to MCJIT (I use dlsym for that purpose).

What I do:
- MCJIT only manages the jit-module (the runtime-module is not
associated with MCJIT through addModule)
- When I have to call a runtime function from the jit-module, I declare
an external reference to the function in the jit-module. Something
like:

llvm::Function* orig = runtimeModule->getFunction("my-function");
llvm::Function* copy =
  (llvm::Function*)jitModule->getOrInsertFunction(orig->getName(),
                                                  orig->getFunctionType());

This step is not mandatory, as you will see below (but I have not
tested a direct use of remote references).

- Then I use an LLVM pass (a FunctionPass). For each function, I
visit every call site. If the call site targets a function that
does not have a definition (i.e., a runtime function), I look up the
original llvm::Function*. I use something like this:

  bool FunctionInliner::runOnFunction(llvm::Function& function) {
    bool Changed = false;

    for (llvm::Function::iterator bit=function.begin();
         bit!=function.end(); bit++) {
      llvm::BasicBlock* bb = bit;

      for(llvm::BasicBlock::iterator it=bb->begin(); it!=bb->end();) {
        /* advance before touching insn: inlining may erase it */
        llvm::Instruction *insn = it++;

        /* only calls and invokes are candidates */
        if (insn->getOpcode() != llvm::Instruction::Call &&
            insn->getOpcode() != llvm::Instruction::Invoke) {
          continue;
        }

        llvm::CallSite  call(insn);
        llvm::Function* callee = call.getCalledFunction();

        if(!callee) /* indirect call: nothing to inline */
          continue;

        if(callee->isDeclaration()) { /* maybe a foreign function? */
          llvm::Function* original =
            runtimeModule->getFunction(callee->getName());
          if(original) {
            /* if you use lazybitcode..., don't forget to materialize
               the original here with */
            original->Materialize();

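At this step, you can directly inline your code if you want to
systematically inline code:
            /* assumption: the call site targets `original` at this point,
               e.g. after call.setCalledFunction(original); InlineFunction
               returns false when the callee is only a declaration */
            llvm::InlineFunctionInfo ifi(0);
            bool isInlined = llvm::InlineFunction(call, ifi, false);
            Changed |= isInlined;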

Or, if you don't want to always inline the code, you can guard the
inlining with the result of the inline cost analysis:
   llvm::InlineCostAnalysis  costAnalysis;
   /* 42 is the threshold */
   llvm::InlineCost cost = costAnalysis.getInlineCost(call, 42);
   if(cost.isAlways() || (!cost.isNever() && cost)) {
     /* inlining goes here */
   }

After this step, you have a problem: the inlined function can itself
contain calls to runtime functions. So, at this point, it's ugly
because I have a function that potentially contains external
references... What I do is simply re-explore the code with
    if(isInlined) {
       it = bb->begin();
       continue;
    }

and for each callee whose defining module is not the jitModule, I
replace the call with a local call. Something like this:

        if(callee->getParent() != function.getParent()) {
          /* declare the callee in the jit-module and call that instead */
          llvm::Function* local =
            (llvm::Function*)function.getParent()->getOrInsertFunction(
              callee->getName(), callee->getFunctionType());
          callee->replaceAllUsesWith(local);
          Changed = true;
        }

After this step, you will have a module that only contains local
references and that contains your preferred runtime code inlined.
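
To make the "re-explore after each inlining" idea concrete outside of
LLVM, here is a toy model in plain C++. It is only a sketch: the types
Block and Runtime and the function inlineAll are made up for this
illustration, instructions are just strings, and "inlining" means
replacing a "call X" string with the body of X. It assumes that no
runtime function is recursive, otherwise the rescan would never stop.

```cpp
#include <list>
#include <map>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM): an instruction is a string and
// "call X" is a call to function X.
typedef std::list<std::string> Block;
// Runtime "module": function name -> body (list of instructions).
typedef std::map<std::string, std::vector<std::string> > Runtime;

// Replaces every call whose callee has a body in `runtime` by that body.
// After each replacement the scan restarts at the beginning of the block,
// so calls brought in by the inlined body are visited too.
// Assumption: no runtime function calls itself, directly or indirectly.
inline bool inlineAll(Block &block, const Runtime &runtime) {
  bool changed = false;
  for (Block::iterator it = block.begin(); it != block.end();) {
    Block::iterator insn = it++;            // advance before modifying
    if (insn->compare(0, 5, "call ") != 0)
      continue;                             // not a call
    Runtime::const_iterator callee = runtime.find(insn->substr(5));
    if (callee == runtime.end())
      continue;                             // no body known: stays a call
    block.insert(insn, callee->second.begin(), callee->second.end());
    block.erase(insn);
    changed = true;
    it = block.begin();                     // re-explore the whole block
  }
  return changed;
}
```

Calls that have no body in the toy runtime are left alone, which is the
situation where the address has to be resolved when the module is loaded.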

- Now, you have to solve the last problem, finding symbols from the
runtimeModule when they are not inlined (global values or functions).
In my case, I have defined my own SectionMemoryManager:

  class CompilationUnit  : public llvm::SectionMemoryManager {
  public:
    /* called by MCJIT for each unresolved external symbol */
    uint64_t getSymbolAddress(const std::string &Name) {
      return (uint64_t)dlsym(SELF_HANDLE, Name.c_str() + 1);
        /* + 1 with MacOS, + 0 with Linux */
    }
  };

which is called by MCJIT to resolve external symbols when the jited
module is loaded in memory (you have to use
EngineBuilder.setMCJITMemoryManager).
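
The "+ 1 with MacOS, + 0 with Linux" detail can be isolated in a small
helper. This is only a sketch: toDlsymName is a hypothetical name, not
an LLVM or libc function, and it assumes that the only difference
between the two platforms is the single leading '_' that Mach-O puts in
front of C symbol names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LLVM): turns the symbol name received
// in getSymbolAddress into the name to hand to dlsym.
// Assumption: Mach-O (MacOS) prefixes C symbols with one '_', ELF (Linux)
// adds no prefix.
inline std::string toDlsymName(const std::string &name, bool machO) {
  assert(!name.empty() && "symbol names are expected to be non-empty");
  if (machO && name[0] == '_')
    return name.substr(1);  // same effect as Name.c_str() + 1
  return name;              // same effect as Name.c_str() + 0
}
```

The machO flag could be set once at build time, for example from the
__APPLE__ preprocessor macro, so that the same memory manager source
works on both systems.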

If, like me, you also want to inline functions from jited modules,
it's a little bit more tricky because the lookup llvm::Function*
original = runtimeModule->getFunction(callee->getName()); is not
enough. I have defined my own symbol table (a hash map) that associates
function identifiers with a structure that contains both the original
LLVM function of the callee and its address in memory (also used in the
SectionMemoryManager).
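
Such a symbol table could look roughly like the sketch below. The names
SymbolEntry and SymbolTable and their members are invented for this
illustration (they are not the vmkit code), and keying the map by the
function name is an assumption. llvm::Function is only forward
declared, so the sketch compiles without the LLVM headers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

namespace llvm { class Function; }  // forward declaration only

// Hypothetical record holding what each consumer needs.
struct SymbolEntry {
  llvm::Function *ir;  // LLVM IR of the callee, for the inlining pass
  uint64_t address;    // address of the compiled code, 0 while unknown
};

// Hypothetical symbol table: function identifier -> SymbolEntry.
class SymbolTable {
public:
  void define(const std::string &name, llvm::Function *ir,
              uint64_t address) {
    assert(!name.empty() && "a symbol needs a name");
    table_[name] = SymbolEntry{ir, address};
  }

  // Returns nullptr when the symbol is unknown.
  const SymbolEntry *lookup(const std::string &name) const {
    auto it = table_.find(name);
    return it == table_.end() ? nullptr : &it->second;
  }

  // Convenience for a getSymbolAddress override: 0 means "not found".
  uint64_t addressOf(const std::string &name) const {
    const SymbolEntry *entry = lookup(name);
    return entry ? entry->address : 0;
  }

private:
  std::unordered_map<std::string, SymbolEntry> table_;
};
```

With something like this, the pass would use lookup(name)->ir where the
runtimeModule->getFunction lookup is not enough, and the memory manager
could try addressOf(name) before falling back to dlsym.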

Good luck :)
Gaël




2014/1/21 Larry Gritz <lg at larrygritz.com>:
> Thanks for the pointers.
>
> Am I correct in assuming that putting the precompiled bitcode into a second module and linking (or using the object caches) would result in ordinary function calls, but would not be able to inline the functions?
>
>         -- lg
>
>
> On Jan 21, 2014, at 11:55 AM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:
>
>> I would say that the incompatibility is by design.  Not that anyone specifically wanted the incompatibility, but rather it's a known artifact of the MCJIT design.
>>
>> You can find an example of MCJIT's object caching here: http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html
>>
>> The two blog entries before that may also be of use to you: http://blog.llvm.org/2013_07_01_archive.html
>>
>> I don't where you can find an example of the Module linking I described, but I think llvm::Linker is the class to look at.
>>
>> -Andy
>>
>
> --
> Larry Gritz
> lg at larrygritz.com
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



-- 
-------------------------------------------------------------------
Gaël Thomas, Associate Professor, UPMC
http://pagesperso-systeme.lip6.fr/Gael.Thomas/
-------------------------------------------------------------------
