[llvm-dev] Need help on JIT compilation speed

Lang Hames via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 19 11:57:37 PDT 2020


Hi Terry,

As Praveen mentioned, OrcV2 supports concurrent compilation and lazy
compilation, both of which may help reduce time-to-execution. There are a
number of things to keep in mind as you consider your options though:

(1) Neither OrcV2 nor MCJIT has any special tricks to speed up compilation
of LLVM IR: the IR compilation process is opaque to them, and for the same
input IR they will both take the same time*.
(2) LLVM does not currently support concurrent optimization within a
module: different modules can be compiled concurrently (provided they are
attached to different LLVMContexts), but two functions in the same module
cannot be.
(3) Laziness can help reduce time-to-execution if some of your IR is
either unlikely to be used at all (in which case you can avoid compiling it
altogether) or unlikely to be used until later in the program's execution
(in which case you can defer compilation). See the first sketch after this
list.
(4) Concurrency can help reduce the wall-clock time required for
compilation if you can break your modules up in a suitable way (see the
second sketch after this list). If you're relying on whole-module
optimizations then there are some trade-offs to consider: breaking up a
module to enable concurrent compilation may eliminate inlining
opportunities. Cloning available_externally function definitions into your
modules to re-enable inlining can address this, but adds overhead of its
own. Since it appears that you are just doing function-at-a-time
optimization (without inlining) you may not have to worry about this.
(5) Some false dependencies still exist in OrcV2's concurrent compilation
system -- these may artificially limit the amount of parallel work you can
do at the moment. I have fixes in mind, but I also have some other features
and bugs to address first, so fixes may not be available for a while.
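
A first sketch, for point (3): a minimal lazy setup using the C++
LLLazyJIT API. This is only an illustration against roughly the LLVM 10
API; the module and the "entry" symbol below are placeholders, not
anything from your code. Functions added with addLazyIRModule are only
compiled the first time they are called:

    #include "llvm/ExecutionEngine/Orc/LLJIT.h"
    #include "llvm/ExecutionEngine/Orc/ThreadSafeModule.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/TargetSelect.h"

    using namespace llvm;
    using namespace llvm::orc;

    void runLazily(std::unique_ptr<Module> M,
                   std::unique_ptr<LLVMContext> Ctx) {
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      // LLLazyJIT compiles each function on first call, via lazy
      // call-through stubs.
      auto J = cantFail(LLLazyJITBuilder().create());

      // addLazyIRModule registers the module but does not compile it yet.
      cantFail(J->addLazyIRModule(
          ThreadSafeModule(std::move(M), std::move(Ctx))));

      // Looking up "entry" only materializes a stub; the real compile
      // happens when the function is first called.
      auto EntryAddr = cantFail(J->lookup("entry")).getAddress();
      auto *Entry = (int (*)())(uintptr_t)EntryAddr;
      Entry();
    }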
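And a second sketch, for point (4): concurrent compilation with LLJIT
(again illustrative only; the thread count, module list and "main_entry"
symbol are placeholders). Each module must own its own LLVMContext, and
setNumCompileThreads enables the compile thread pool:

    #include "llvm/ExecutionEngine/Orc/LLJIT.h"
    #include "llvm/ExecutionEngine/Orc/ThreadSafeModule.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/TargetSelect.h"

    #include <memory>
    #include <utility>
    #include <vector>

    using namespace llvm;
    using namespace llvm::orc;

    using ModuleAndContext =
        std::pair<std::unique_ptr<Module>, std::unique_ptr<LLVMContext>>;

    void runConcurrently(std::vector<ModuleAndContext> Mods) {
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      // Four compile threads: separate modules (with separate contexts)
      // can now be compiled in parallel.
      auto J = cantFail(LLJITBuilder().setNumCompileThreads(4).create());

      for (auto &MC : Mods)
        cantFail(J->addIRModule(
            ThreadSafeModule(std::move(MC.first), std::move(MC.second))));

      // Lookups trigger compilation of whatever is needed; unrelated
      // modules can be compiled on the other threads.
      auto Entry = cantFail(J->lookup("main_entry"));
      (void)Entry;
    }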

Finally: I'm not sure whether you're just measuring IR optimization time or
including CodeGen time too, but the best way to reduce the amount of
compilation work to be done is to play around with your optimization
settings and look for optimizations that you can discard without having too
much impact on generated code quality. You'll want to look at both the IR
optimizations and codegen optimization levels for this.
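
As a rough illustration (not a drop-in replacement for your C-API code; it
assumes the legacy PassManagerBuilder and ORC's JITTargetMachineBuilder
C++ APIs), dialing both the IR pipeline and CodeGen down to roughly -O1
could look like this:

    #include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
    #include "llvm/ExecutionEngine/Orc/LLJIT.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/CodeGen.h"
    #include "llvm/Transforms/IPO/PassManagerBuilder.h"

    using namespace llvm;
    using namespace llvm::orc;

    // Run an -O1-style function pass pipeline over every function in M.
    void optimizeCheaply(Module &M) {
      legacy::FunctionPassManager FPM(&M);
      PassManagerBuilder PMB;
      PMB.OptLevel = 1;   // lighter IR optimization than -O2/-O3
      PMB.SizeLevel = 0;
      PMB.populateFunctionPassManager(FPM);

      FPM.doInitialization();
      for (Function &F : M)
        FPM.run(F);
      FPM.doFinalization();
    }

    // Build an LLJIT whose backend also uses a cheaper CodeGen level.
    Expected<std::unique_ptr<LLJIT>> createCheapJIT() {
      auto JTMB = JITTargetMachineBuilder::detectHost();
      if (!JTMB)
        return JTMB.takeError();
      JTMB->setCodeGenOptLevel(CodeGenOpt::Less); // roughly -O1 codegen
      return LLJITBuilder()
          .setJITTargetMachineBuilder(std::move(*JTMB))
          .create();
    }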

Regards,
Lang.

* Note: OrcV2 and MCJIT will take the same time to compile the same IR once
they reach the compilation stage; however, Orc's lazy compilation utilities
will automatically break up modules before they reach the compiler, so you
can't do an apples-to-apples comparison of compile times there.

On Wed, Jun 17, 2020 at 6:47 PM Terry Guo <flameroc at gmail.com> wrote:

> Hi Praveen,
>
> Thanks for your help. I will follow your suggestions and get back if I can
> make some progress.
>
> BR,
> Terry
>
>
> On Wed, Jun 17, 2020 at 12:12 AM Praveen Velliengiri <
> praveenvelliengiri at gmail.com> wrote:
>
>> Hi Terry,
>> CC'ed Lang Hames; he is the best person to answer.
>>
>> In general, ORCv2 is the new, stable JIT infrastructure. To get fast
>> compile times you can use the lazy compilation option in ORCv2; this
>> results in fast startup by interleaving compilation with execution. You
>> can also use the concurrent compilation option in ORCv2 to speed things
>> up. Additionally, we added a new feature called "speculative
>> compilation" to ORCv2 which yields good results for a set of benchmarks.
>> If you are interested, please try it out. We would like to have some
>> benchmarks for your case :)
>> To try things out you can check out the ExecutionEngine examples in
>> LLVM's examples directory.
>> I hope this helps.
>>
>> On Tue, 16 Jun 2020 at 21:10, Terry Guo via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi there,
>>>
>>> I am trying to JIT a rather big wasm bytecode program to x86 native code
>>> and am running into a JIT compilation time issue. In the first stage, I
>>> use MCJIT to translate the wasm bytecode into a single LLVM IR Module
>>> that ends up with 927 functions. It then takes a pretty long time to
>>> apply several optimization passes to this big IR module and finally
>>> generate x86 code. What should I do to shorten the compilation time? Is
>>> it possible to compile this single big IR Module with MCJIT in parallel?
>>> Is the OrcV2 JIT faster than MCJIT? Can the 'concurrent compilation'
>>> feature mentioned on the OrcV2 webpage help with this? Thanks in advance
>>> for any advice.
>>>
>>> This is how I organized the optimization passes:
>>>
>>>     LLVMAddBasicAliasAnalysisPass(comp_ctx->pass_mgr);
>>>     LLVMAddPromoteMemoryToRegisterPass(comp_ctx->pass_mgr);
>>>     LLVMAddInstructionCombiningPass(comp_ctx->pass_mgr);
>>>     LLVMAddJumpThreadingPass(comp_ctx->pass_mgr);
>>>     LLVMAddConstantPropagationPass(comp_ctx->pass_mgr);
>>>     LLVMAddReassociatePass(comp_ctx->pass_mgr);
>>>     LLVMAddGVNPass(comp_ctx->pass_mgr);
>>>     LLVMAddCFGSimplificationPass(comp_ctx->pass_mgr);
>>>
>>> This is how I apply the passes to my single IR module (which actually
>>> includes 927 functions):
>>>
>>> if (comp_ctx->optimize) {
>>>     LLVMInitializeFunctionPassManager(comp_ctx->pass_mgr);
>>>     for (i = 0; i < comp_ctx->func_ctx_count; i++)
>>>         LLVMRunFunctionPassManager(comp_ctx->pass_mgr,
>>>                                    comp_ctx->func_ctxes[i]->func);
>>> }
>>>
>>> BR,
>>> Terry
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>