<div dir="ltr">Oh, that's a good point: making changes in bitcode is a lot easier than once it's gone down to ELF.<div><br></div><div>Taking a brief look at InlineCost.cpp, it doesn't seem like InlineCostAnalysis actually uses anything from the call graph.  The only thing it needs is a TargetTransformInfo, which it gets in runOnSCC(). As a hack, putting it into a separate PassManager and running it on an empty module seems to work OK for me; that initializes the pass's local state appropriately.</div>


<div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 26, 2014 at 5:29 AM, Gaël Thomas <span dir="ltr"><<a href="mailto:gael.thomas@lip6.fr" target="_blank">gael.thomas@lip6.fr</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Kevin,<br>

<br>

I haven't tested ObjectCache yet, but I faced exactly the same issue<br>

with hidden symbols :) As a solution, I run a small module pass on<br>

each runtime module (aka, .bc file), which modifies the linkages. I<br>

run the pass before compiling bc files into .o. I have thus these<br>

rules in my compilation process:<br>

<br>

file.cc --> file-raw.bc --> file.bc --> file.o<br>

<br>

file-raw.bc: file.cc => clang++ -emit-llvm<br>

file.bc: file-raw.bc => opt with my pass<br>

file.o: file.bc => llc<br>

<br>

For hidden functions, it's easy: I replace linkonce_odr functions with<br>

weak_odr functions. The semantics are the same, except that weak_odr<br>

definitions are never discarded, so the symbol is emitted and visible<br>

with dlsym in the resulting binary. For strings,<br>

it's a little bit more complicated because you can have collisions<br>

between names in different modules. So, I rename the strings in my<br>

pass in order to ensure that the name is unique, and I replace the<br>

InternalLinkage with an ExternalLinkage. It's far from perfect because<br>

it slows down dlsym (the time to find a symbol is proportional to the<br>

number of external symbols).<br>
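A minimal sketch of the renaming idea described above, assuming each module has a distinct identifier (for example its file name). The helper name uniqueGlobalName and the naming scheme are hypothetical and are not taken from the vmkit pass; in an LLVM pass the result would be passed to setName() on the global before switching it to ExternalLinkage.<br>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a symbol name that cannot collide across
// modules, provided every module has a distinct, non-empty identifier.
// The module identifier is length-prefixed, so the pair
// (moduleId, localName) can always be recovered from the result, which
// means two different pairs never map to the same name.
std::string uniqueGlobalName(const std::string& moduleId,
                             const std::string& localName) {
  assert(!moduleId.empty() && "each module needs a distinct identifier");
  return "m" + std::to_string(moduleId.size()) + "_" + moduleId + "_" +
         localName;
}
```

The length prefix is there because a plain "moduleId + localName" concatenation can collide (module "a" with string "b.c" versus module "a.b" with string "c").<br>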

<br>

If you need the code of the pass, you can find it in my branch of vmkit:<br>

<a href="http://llvm.org/svn/llvm-project/vmkit/branches/mcjit" target="_blank">http://llvm.org/svn/llvm-project/vmkit/branches/mcjit</a><br>

in lib/vmkit-prepare-code/adapt-linkage.cc<br>

<br>

Also, I made a mistake in my previous mail: we cannot use<br>

llvm::InlineCostAnalysis as is (and thus we cannot reuse the<br>

heuristics that compute the cost of inlining). The inline cost<br>

analyzer has to explore the whole call graph, and that's not so easy<br>

when functions are defined in multiple modules (and I don't want to<br>

explore the whole graph for each JITted function!). So, for the<br>

moment, I only inline functions marked as AlwaysInline. I don't know<br>

what I will do for this problem...<br>

<br>

Gaël<br>

<br>

<br>

2014-01-26 Kevin Modzelewski <<a href="mailto:kmod@dropbox.com" target="_blank">kmod@dropbox.com</a>>:<br>

<div><div>> Hi Gael, I tried converting to your approach but I had some issues making<br>

> sure that all symbols accessed by the jit modules have entries in the<br>

> dynamic symbol table.<br>

><br>

> To be specific, my current approach is to use MCJIT (using an objectcache)<br>

> to JIT the runtime module and then let MCJIT handle linking any references<br>

> from the jit'd modules. I just experimented with what I think you're doing:<br>

> compiling my runtime, directly linking it with the rest of the<br>

> compiler, and then tying together references in the jit modules to entities<br>

> in the compiler.<br>

><br>

> I got it working for the case of "standard" functions and globals, but had<br>

> some trouble with other types of symbols.  I don't know the right<br>

> terminology for these things, but I couldn't get methods defined in headers<br>

> (ex: a no-op virtual destructor) to work properly.  I guess that's not too<br>

> hard to work around by either putting it into a cpp file or maybe with some<br>

> objcopy magic, but then I ran into the issue of string constants.  Again, my<br>

> knowledge of the terminology isn't great, but it looks like those don't get<br>

> symbols in the object file but they get their own sections, and since I have<br>

> multiple source files that I llvm-link together, the constants get renamed<br>

> in the LLVM IR and have no relation to the section names.  Maybe there's a<br>

> workaround by compiling all my runtime sources as a single file so no<br>

> renaming happens, and then some hackery to get the section names exported,<br>

> but I guess I'm feeling a little doubtful about it.<br>

><br>

> Have you tried using an ObjectCache and pre-jitting [I still have a hard<br>

> time using that term with a straight face] the runtime module?  My runtime<br>

> isn't that large (about 4kloc), but the numbers I'm getting are that it<br>

> takes about 2ms for the getLazyBitcodeModule call, and about 4ms to load the<br>

> stdlib through the ObjectCache.  I'm not sure how these numbers scale with<br>

> the size of the runtime, but it feels like if the ObjectCache loading is too<br>

> expensive then loading the bitcode might be as well?  Another idea is that<br>

> you could load+jit the bitcode the first time that you want to inline<br>

> something, since the inlining+subsequent optimizations you probably want to<br>

> do are themselves expensive and could mask the jit'ing time.<br>

><br>

> Anyway, my current plan is to stick with jit'ing the runtime module but cut<br>

> down the amount of stuff included in it, since I'm finding that most of my<br>

> runtime methods end up dispatching on type, and patchpoint-ing at runtime<br>

> seems to be more effective than inlining aot.<br>

><br>

> Kevin<br>

><br>

><br>

> On Tue, Jan 21, 2014 at 1:17 PM, Gaël Thomas <<a href="mailto:gael.thomas@lip6.fr" target="_blank">gael.thomas@lip6.fr</a>> wrote:<br>

>><br>

>> Hi Larry,<br>

>><br>

>> Inlining from remote modules with MCJIT is not so easy, but possible<br>

>> (at least it works for me). I've been working on this problem for two<br>

>> days (from an end-user perspective; I'm not an llvm developer :)). Since<br>

>> it may help you (and other people), I'll explain what I have done (my mail<br>

>> may be too long for the mailing list, sorry!).<br>

>><br>

>> So, basically, inlining from other modules (runtime module included)<br>

>> is possible in MCJIT. The solution is maybe a little bit ugly... Just<br>

>> to explain what I do and my problems, I'm involved in the development<br>

>> of vmkit (a library to build virtual machines). I have to inline<br>

>> runtime functions defined in c++ to achieve good performance (for<br>

>> example the type checker for j3, the Java virtual machine developed<br>

>> with vmkit). I think that your problem is not so far from mine (I also<br>

>> reload my own bitcode when I start vmkit).<br>

>><br>

>> So, I give you the picture (I can also send you my llvm pass or other<br>

>> relevant code if you need them). It can help as a starting point. I<br>

>> wrote the inlining pass today, so it may still be buggy :).<br>

>><br>

>> Basically, I have two kinds of modules: a module that contains the<br>

>> runtime functions (defined in c++) and the other modules that contain<br>

>> functions that I have to jit compile. To simplify, let's say that I have<br>

>> only one module to jit. In the jit-module, I want to call functions<br>

>> defined in the runtime-module. I have thus three problems to solve:<br>

>> * The verifier does not like when you call a function defined in the<br>

>> runtime module directly from the jit module (it prevents external<br>

>> references to other modules). So, I have to avoid this as much as<br>

>> possible.<br>

>> * The jited module has to find the llvm code of the runtime functions<br>

>> for inlining<br>

>> * When a function is not inlined, you have to provide the address of<br>

>> the function to MCJIT (I use dlsym for that purpose).<br>

>><br>

>> What I do:<br>

>> - MCJIT only manages the jit-module (the runtime-module is not<br>

>> associated to MCJIT through addModule)<br>

>> - When I have to call a runtime function from the jit-module, I define<br>

>> an external reference to the function in the jit-module. Something<br>

>> like:<br>

>><br>

>> llvm::Function* orig = runtimeModule->getFunction("my-function");<br>

>> llvm::Function* copy =<br>

>> (llvm::Function*)jitModule->getOrInsertFunction(orig->getName(),<br>

>> orig->getFunctionType());<br>

>><br>

>> This step is not mandatory, as you will see below (but I have not<br>

>> tested a direct use of remote references).<br>

>><br>

>> - Then I use an llvm pass (a FunctionPass). For each function, I<br>

>> visit each call site. If the call site targets a function that<br>

>> does not have a definition (i.e., a runtime function), I find the<br>

>> original llvm::Function*. I use something like this:<br>

>><br>

>>   bool FunctionInliner::runOnFunction(llvm::Function& function) {<br>

>>     bool Changed = false;<br>

>><br>

>>     for (llvm::Function::iterator bit=function.begin();<br>

>> bit!=function.end(); bit++) {<br>

>>       llvm::BasicBlock* bb = bit;<br>

>><br>

>>       for(llvm::BasicBlock::iterator it=bb->begin(); it!=bb->end();) {<br>

>>         llvm::Instruction *insn = it++;<br>

>><br>

>>         if (insn->getOpcode() != llvm::Instruction::Call &&<br>

>>             insn->getOpcode() != llvm::Instruction::Invoke) {<br>

>>           continue;<br>

>>         }<br>

>><br>

>>         llvm::CallSite  call(insn);<br>

>>         llvm::Function* callee = call.getCalledFunction();<br>

>><br>

>>         if(!callee)<br>

>>           continue;<br>

>><br>

>>         if(callee->isDeclaration()) { /* maybe a foreign function? */<br>

>>           llvm::Function* original =<br>

>> runtimeModule->getFunction(callee->getName());<br>

>>           if(original) {<br>

>>             /* if you use lazybitcode..., don't forget to materialize<br>

>> the original here with */<br>

>>             original->Materialize();<br>

>><br>

>> At this step, you can directly inline your code if you want to<br>

>> systematically inline code:<br>

>>            llvm::InlineFunctionInfo ifi(0);<br>

>>            bool isInlined = llvm::InlineFunction(call, ifi, false);<br>

>>            Changed |= isInlined;<br>

>><br>

>> Or, if you don't want to always inline the code, you can guard the<br>

>> inlining after having used the inline analysis pass:<br>

>>    llvm::InlineCostAnalysis  costAnalysis;<br>

>>    llvm::InlineCost cost = costAnalysis.getInlineCost(call, 42); /* 42<br>

>> is the threshold */<br>

>>    if(cost.isAlways() || (!cost.isNever() && cost)) {<br>

>>      /* inlining goes here */<br>

>>    }<br>

>><br>

>> After this step, you have a problem. The inlined function can itself<br>

>> contain calls to the runtime functions. So, at this step, it's ugly<br>

>> because I have a function that potentially contains external<br>

>> references... What I do is simply re-explore the code with<br>

>>     if(isInlined) {<br>

>>        it = bb->begin();<br>

>>        continue;<br>

>>     }<br>

>><br>

>> and for each function, if its defining module is not the jitModule, I<br>

>> replace the call with a local call. Something like this:<br>

>><br>

>>         if(callee->getParent() != function.getParent()) {<br>

>>           llvm::Function* local =<br>

>><br>

>> (llvm::Function*)function.getParent()->getOrInsertFunction(callee->getName(),<br>

>> callee->getFunctionType());<br>

>>           callee->replaceAllUsesWith(local);<br>

>>           Changed = true;<br>

>>         }<br>

>><br>

>> After this step, you will have a module that only contains local<br>

>> references and that contains your preferred runtime code inlined.<br>

>><br>

>> - Now, you have to solve the last problem, finding symbols from the<br>

>> runtimeModule when they are not inlined (global values or functions).<br>

>> In my case, I have defined my own SectionMemoryManager:<br>

>><br>

>>   class CompilationUnit : public llvm::SectionMemoryManager {<br>

>>   public:<br>

>>     /* called by MCJIT for symbols it cannot resolve itself */<br>

>>     uint64_t getSymbolAddress(const std::string &Name) {<br>

>>       return (uint64_t)dlsym(SELF_HANDLE, Name.c_str() + 1);<br>

>>         /* + 1 with MacOS, + 0 with Linux */<br>

>>     }<br>

>>   };<br>

>><br>

>> which is called by MCJIT to resolve external symbols when the jited<br>

>> module is loaded in memory (you have to use<br>

>> EngineBuilder.setMCJITMemoryManager).<br>
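The "+ 1 with MacOS, + 0 with Linux" adjustment above can be sketched as a small helper. This is only an illustration, assuming the difference comes from Mach-O prefixing C symbol names with an underscore while ELF does not; toDlsymName, kSymbolsHaveUnderscore and the underscorePrefixed parameter are hypothetical names, not part of LLVM or vmkit.<br>

```cpp
#include <string>

// Assumption: Mach-O targets prepend '_' to C symbol names, ELF targets
// do not. dlsym() expects the name without that prefix.
#if defined(__APPLE__)
constexpr bool kSymbolsHaveUnderscore = true;
#else
constexpr bool kSymbolsHaveUnderscore = false;
#endif

// Hypothetical helper: converts the name handed to getSymbolAddress()
// into the name that dlsym() should be asked for. Unlike a bare
// "c_str() + 1", it does not step past the end of an empty string and
// leaves names without a leading underscore untouched.
std::string toDlsymName(const std::string& linkerName,
                        bool underscorePrefixed = kSymbolsHaveUnderscore) {
  if (underscorePrefixed && !linkerName.empty() && linkerName[0] == '_')
    return linkerName.substr(1);
  return linkerName;
}
```

Inside getSymbolAddress, the lookup would then use toDlsymName(Name).c_str() in place of the pointer arithmetic.<br>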

>><br>

>> If, like me, you want to also inline functions from jited modules,<br>

>> it's a little bit more tricky because the llvm::Function* original =<br>

>> runtimeModule->getFunction(callee->getName()); is not enough. I have<br>

>> defined my own symbol table (a hash map) that associates function<br>

>> identifiers with a structure that contains both the original llvm<br>

>> function of the callee and its address in memory (also used in the<br>

>> SectionMemoryManager).<br>
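A minimal sketch of the kind of symbol table described above, written without LLVM headers so that it stands alone. SymbolEntry, SymbolTable and their members are hypothetical names; the opaque irFunction pointer stands in for the llvm::Function* of the callee, and this is not the actual vmkit data structure.<br>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical entry: everything known about one function identifier.
struct SymbolEntry {
  const void* irFunction = nullptr;  // stand-in for llvm::Function*, used for inlining
  std::uint64_t address = 0;         // address in memory, 0 until the code is loaded
};

class SymbolTable {
public:
  // Record (or update) the IR definition of a function.
  void defineIR(const std::string& name, const void* ir) {
    assert(!name.empty() && "function identifiers must be non-empty");
    entries_[name].irFunction = ir;
  }

  // Record the address once the function has been emitted or found.
  void defineAddress(const std::string& name, std::uint64_t addr) {
    assert(!name.empty() && "function identifiers must be non-empty");
    entries_[name].address = addr;
  }

  // Returns the entry for a name, or nullptr if the name is unknown.
  const SymbolEntry* lookup(const std::string& name) const {
    auto it = entries_.find(name);
    return it == entries_.end() ? nullptr : &it->second;
  }

private:
  std::unordered_map<std::string, SymbolEntry> entries_;
};
```

The inlining pass would consult irFunction when it meets a declaration, and the memory manager would return address from getSymbolAddress for calls that were not inlined.<br>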

>><br>

>> Good luck :)<br>

>> Gaël<br>

>><br>

>><br>

>><br>

>><br>

>> 2014/1/21 Larry Gritz <<a href="mailto:lg@larrygritz.com" target="_blank">lg@larrygritz.com</a>>:<br>

>> > Thanks for the pointers.<br>

>> ><br>

>> > Am I correct in assuming that putting the precompiled bitcode into a<br>

>> > second module and linking (or using the object caches) would result in<br>

>> > ordinary function calls, but would not be able to inline the functions?<br>

>> ><br>

>> >         -- lg<br>

>> ><br>

>> ><br>

>> > On Jan 21, 2014, at 11:55 AM, Kaylor, Andrew <<a href="mailto:andrew.kaylor@intel.com" target="_blank">andrew.kaylor@intel.com</a>><br>

>> > wrote:<br>

>> ><br>

>> >> I would say that the incompatibility is by design.  Not that anyone<br>

>> >> specifically wanted the incompatibility, but rather it's a known artifact of<br>

>> >> the MCJIT design.<br>

>> >><br>

>> >> You can find an example of MCJIT's object caching here:<br>

>> >> <a href="http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html" target="_blank">http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html</a><br>

>> >><br>

>> >> The two blog entries before that may also be of use to you:<br>

>> >> <a href="http://blog.llvm.org/2013_07_01_archive.html" target="_blank">http://blog.llvm.org/2013_07_01_archive.html</a><br>

>> >><br>

>> >> I don't know where you can find an example of the Module linking I<br>

>> >> described, but I think llvm::Linker is the class to look at.<br>

>> >><br>

>> >> -Andy<br>

>> >><br>

>> ><br>

>> > --<br>

>> > Larry Gritz<br>

>> > <a href="mailto:lg@larrygritz.com" target="_blank">lg@larrygritz.com</a><br>

>> ><br>

>> ><br>

>> ><br>

>> ><br>

>> > _______________________________________________<br>

>> > LLVM Developers mailing list<br>

>> > <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

>> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

>><br>

>><br>

>><br>

>> --<br>

>> -------------------------------------------------------------------<br>

>> Gaël Thomas, Associate Professor, UPMC<br>

>> <a href="http://pagesperso-systeme.lip6.fr/Gael.Thomas/" target="_blank">http://pagesperso-systeme.lip6.fr/Gael.Thomas/</a><br>

>> -------------------------------------------------------------------<br>

>><br>


><br>

><br>

<br>

<br>

<br>

--<br>

-------------------------------------------------------------------<br>

Gaël Thomas, Associate Professor, UPMC<br>

<a href="http://pagesperso-systeme.lip6.fr/Gael.Thomas/" target="_blank">http://pagesperso-systeme.lip6.fr/Gael.Thomas/</a><br>

-------------------------------------------------------------------<br>

</div></div></blockquote></div><br></div></div>