[LLVMdev] LLVM 3.3 JIT code speed

Thu Jul 18 12:05:39 PDT 2013

I understand you to mean that you have isolated the actual execution time as your point of comparison, as opposed to including runtime loading and so on.  Is this correct?

One thing that changed between 3.1 and 3.3 is that MCJIT no longer compiles the module during the engine creation process but instead waits until either a function pointer is requested or finalizeObject is called.  I would guess that you have taken that into account in your measurement technique, but it seemed worth mentioning.
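
To illustrate the measurement point above, here is a minimal sketch of keeping code generation out of the timed region. It uses only the standard library; `JitTiming`, `timeCompileAndRun`, `forceCompile` and `runOnce` are hypothetical names of mine, not LLVM APIs. The assumption is that `forceCompile` wraps whatever triggers MCJIT code generation in your setup (requesting the function pointer or calling finalizeObject) and that `runOnce` calls the already-compiled function.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Result of one measurement, in seconds.
struct JitTiming {
    double compileSeconds;  // time spent forcing code generation
    double runSeconds;      // time spent executing already-compiled code
};

// forceCompile: hypothetical callback that triggers code generation
//               (an assumption; supply your own wrapper around the engine).
// runOnce:      hypothetical callback that calls the compiled function once.
inline JitTiming timeCompileAndRun(const std::function<void()>& forceCompile,
                                   const std::function<void()>& runOnce,
                                   int iterations) {
    assert(iterations > 0);
    typedef std::chrono::steady_clock Clock;

    Clock::time_point t0 = Clock::now();
    forceCompile();  // code generation is paid here, before the run loop
    Clock::time_point t1 = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        runOnce();   // only execution is measured in this interval
    }
    Clock::time_point t2 = Clock::now();

    JitTiming result;
    result.compileSeconds = std::chrono::duration<double>(t1 - t0).count();
    result.runSeconds = std::chrono::duration<double>(t2 - t1).count();
    return result;
}
```

If the 3.1 and 3.3 numbers are taken this way, only `runSeconds` should be compared; a large `compileSeconds` under 3.3 would just reflect the deferred compilation.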

What architecture/OS are you testing?

With LLVM 3.3 you can register a JIT event listener (using ExecutionEngine::RegisterJITEventListener) that MCJIT will call with a copy of the actual object image that gets generated.  You could then write that image to a file as a basis for comparing the generated code.  You can find a reference implementation of the interface in lib/ExecutionEngine/IntelJITEvents/IntelJITEventListener.cpp.
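
As a rough sketch of the "write that image to a file" step, the helper below dumps a raw byte buffer to disk using only the standard library. `dumpObjectImage` is a hypothetical name, not part of LLVM, and how you obtain the bytes and their size from the object image inside your listener is left as an assumption; check the reference implementation mentioned above for the actual callback and accessors.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Writes `size` bytes starting at `data` to `path` in binary mode,
// replacing any existing file. Returns true on success.
// Hypothetical helper: the caller is assumed to pass the bytes of the
// object image it received from the JIT event listener.
inline bool dumpObjectImage(const char* data, std::size_t size,
                            const std::string& path) {
    assert(data != nullptr || size == 0);
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // could not open the output file
    }
    if (size > 0) {
        out.write(data, static_cast<std::streamsize>(size));
    }
    out.close();
    return !out.fail();
}
```

The resulting file can then be disassembled with a tool such as objdump and compared against the code produced by the static clang/gcc path.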

-Andy

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Stéphane Letz
Sent: Thursday, July 18, 2013 11:20 AM
To: Eli Friedman
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] LLVM 3.3 JIT code speed

On 18 Jul 2013, at 19:07, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote:
>> Hi,
>> 
>> The LLVM IR emitted by our DSL (optimized with -O3-style IR ==> IR passes) runs slower when executed with the LLVM 3.3 JIT than it did with LLVM 3.1. What could be the reason?
>> 
>> I tried adjusting TargetOptions, without any success.
>> 
>> Here is the kind of code we use to allocate the JIT:
>> 
>>    EngineBuilder builder(fResult->fModule);
>>    builder.setOptLevel(CodeGenOpt::Aggressive);
>>    builder.setEngineKind(EngineKind::JIT);
>>    builder.setUseMCJIT(true);
>>    builder.setCodeModel(CodeModel::JITDefault);
>>    builder.setMCPU(llvm::sys::getHostCPUName());
>> 
>>    TargetOptions targetOptions;
>>    targetOptions.NoFramePointerElim = true;
>>    targetOptions.LessPreciseFPMADOption = true;
>>    targetOptions.UnsafeFPMath = true;
>>    targetOptions.NoInfsFPMath = true;
>>    targetOptions.NoNaNsFPMath = true;
>>    targetOptions.GuaranteedTailCallOpt = true;
>> 
>>    builder.setTargetOptions(targetOptions);
>> 
>>    TargetMachine* tm = builder.selectTarget();
>> 
>>    fJIT = builder.create(tm);
>>    if (!fJIT) {
>>        return false;
>>    }
>>    ..
>> 
>> Any idea?
> 
> It's hard to say much without seeing the specific IR and the code 
> generated from that IR.
> 
> -Eli

Our language can do either:

1) DSL  ==> C/C++  ===> clang/gcc ===> exec  code

or

2) DSL  ==> LLVM IR  ===> (optimisation passes) ==>  LLVM  IR  ==> LLVM JIT ==> exec code

1) and 2) were running at the same speed with LLVM 3.1, but 2) is now slower with LLVM 3.3.

I compared the LLVM IR generated by chain 2) *after* the optimization passes with the IR generated by chain 1) using clang -emit-llvm -O3 on the pure C input. The two are the same. So my conclusion was that the way we are activating the JIT is no longer correct in 3.3, or that we are missing new steps that now have to be done for the JIT?

Stéphane Letz

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev