[LLVMdev] Slow jitter.

Wed Aug 26 06:57:42 PDT 2009

Eli Friedman <eli.friedman at gmail.com> writes:

> On Tue, Aug 25, 2009 at 4:58 PM, Óscar Fuentes<ofv at wanadoo.es> wrote:
>> Eli Friedman <eli.friedman at gmail.com> writes:
>>
>>> On Wed, Aug 26, 2009 at 1:10 AM, Óscar Fuentes<ofv at wanadoo.es> wrote:
>>>> While compiling some sources, translating from my compiler's IR to LLVM
>>>> using the C++ API requires 2.5 seconds. If the resulting LLVM module is
>>>> dumped as LLVM assembler, the file is 240,000 lines long. Generating
>>>> LLVM code is fast.
>>>>
>>>> However, generating the native code is quite slow: 33 seconds. I force
>>>> native code generation calling ExecutionEngine::getPointerToFunction for
>>>> each function on the module.
>>>>
>>>> This is on x86/Windows/MinGW. The only pass is TargetData, so no fancy
>>>> optimizations.
>>>>
>>>> I don't think that a static compiler (llvm-gcc, for instance) needs so
>>>> much time for generating unoptimized native code for a similarly sized
>>>> module. Is there something special about the JIT that makes it so slow?
>>>
>>> For comparison, how long does it take to write the whole thing out as
>>> native assembler?
>>
>> What kind of metric is this? How are string manipulation and I/O a
>> better indication than the number of llvm assembly lines generated or
>> the ratio (llvm IR generation time / native code generation time)?
>
> I wanted the comparison to check whether the issue is just "codegen is
> slow", or more specifically that JIT codegen is slow.  You seem to be
> under the impression that it will be significantly slower, but I don't
> think it's self-evident.  (The output of "time llc dumpedmodule.bc"
> would be sufficient.)

Sorry, Eli. I misread your message as suggesting that I measure the time
required to dump the module as LLVM assembler.

llc needs 45 seconds. This is far worse than the 33 seconds used by the
JIT. Maybe llc is using optimizations. My JIT has no optimizations
enabled.

Yup, llc -O0 takes 37.5 seconds.

llc -pre-RA-sched=fast -regalloc=local takes 26 seconds. Much better but
still slow IMO. The question is whether this avoids the non-linear
algorithms and whether the generated code is fast enough to justify
LLVM. I'll do some experimentation.
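
The llc runs above can be repeated with a small timing script. This is
only a sketch: the flag sets and the file name dumpedmodule.bc come from
this thread, while the helper names time_command and time_llc are made up
for illustration. It assumes an llc binary on PATH that accepts these
options, which may not hold for other LLVM versions.

```python
import subprocess
import time

# Flag sets quoted in this thread; other LLVM versions may reject some of them.
CONFIGS = {
    "default": [],
    "-O0": ["-O0"],
    "fast sched + local regalloc": ["-pre-RA-sched=fast", "-regalloc=local"],
}


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def time_llc(module="dumpedmodule.bc", llc="llc"):
    """Time llc on one module under each flag set.

    Assumption: llc is on PATH and the module file exists.
    """
    return {
        name: time_command([llc, *flags, module])
        for name, flags in CONFIGS.items()
    }
```

Calling time_llc() returns a dictionary that maps each configuration name
to its wall-clock time in seconds. A single run per configuration is
noisy, so repeating the measurement a few times is advisable.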

The generated assembly file is 290K lines for unadorned llc and 616K
lines for -pre-RA-sched=fast -regalloc=local. This does not inspire much
hope :-)
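
To put the figures in this thread side by side, here is a short
calculation. It is a sketch that assumes every timing refers to the same
240,000-line module; the function names are illustrative only.

```python
# Figures reported in this thread.
IR_LINES = 240_000       # size of the module dumped as LLVM assembler
IR_GEN_SECONDS = 2.5     # time to generate the LLVM IR
CODEGEN_SECONDS = {
    "JIT": 33.0,
    "llc": 45.0,
    "llc -O0": 37.5,
    "llc -pre-RA-sched=fast -regalloc=local": 26.0,
}
ASM_LINES = {"llc": 290_000, "llc fast/local": 616_000}


def throughput(seconds):
    """IR lines turned into native code per second."""
    return IR_LINES / seconds


def slowdown_vs_ir_generation(seconds):
    """How many times longer code generation takes than IR generation."""
    return seconds / IR_GEN_SECONDS


def asm_growth():
    """Ratio of assembly lines: fast/local output versus plain llc output."""
    return ASM_LINES["llc fast/local"] / ASM_LINES["llc"]


if __name__ == "__main__":
    for name, seconds in CODEGEN_SECONDS.items():
        print(f"{name}: {throughput(seconds):.0f} IR lines/s, "
              f"{slowdown_vs_ir_generation(seconds):.1f}x IR generation time")
    print(f"assembly growth with fast/local: {asm_growth():.2f}x")
```

By this measure the JIT handles roughly 7,300 IR lines per second and
takes about 13 times as long as IR generation, while the fast/local
configuration emits a little over twice as many assembly lines as plain
llc.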

-- 
Óscar