[LLVMdev] Very slow performance of lli on x86

Sun Nov 15 22:44:50 PST 2009

Hi all,

I have attached the complete test suite. It has separate directories for
gcc, llvm-gcc, clang and lli-clang. The source code, makefile and run script
(which contains the number of times the program should execute) for each case
are available inside each directory.

*FOLLOWING ARE THE STATISTICS WHILE USING LLI FOR SINGLE ITERATION*

===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===

   58 dagcombine       - Number of dag nodes combined
16384 jit              - Number of bytes of global vars initialized
  357 jit              - Number of bytes of machine code compiled
    2 jit              - Number of global vars initialized
   27 jit              - Number of relocations applied
    3 jit              - Number of slabs of memory allocated by the JIT
  105 liveintervals    - Number of original intervals
   21 loop-reduce      - Number of IV uses strength reduced
    4 loop-reduce      - Number of PHIs inserted
    2 loop-reduce      - Number of loop terminating conds optimized
    1 machine-licm     - Number of machine instructions hoisted out of loops
    4 phielim          - Number of atomic phis lowered
    2 regalloc         - Number of copies coalesced
   27 regalloc         - Number of iterations performed
    3 regcoalescing    - Number of cross class joins performed
   44 regcoalescing    - Number of identity moves eliminated after coalescing
    1 regcoalescing    - Number of instructions re-materialized
   40 regcoalescing    - Number of interval joins performed
    2 scalar-evolution - Number of loops with predictable loop counts
    4 twoaddrinstr     - Number of instructions aggressively commuted
    6 twoaddrinstr     - Number of instructions commuted to coalesce
    3 twoaddrinstr     - Number of instructions re-materialized
   23 twoaddrinstr     - Number of two-address instructions
    2 virtregrewriter  - Number of copies elided
    1 x86-codegen      - Number of floating point instructions
   84 x86-emitter      - Number of machine instructions emitted

real    0m0.043s
user    0m0.027s
sys    0m0.010s

*FOLLOWING ARE THE STATISTICS WHILE FORCING LLI TO USE INTERPRETER FOR
SINGLE ITERATION*

===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===

147495 interpreter - Number of dynamic instructions executed
 17735 jit         - Number of bytes of global vars initialized
    49 jit         - Number of global vars initialized

real    0m0.083s
user    0m0.078s
sys    0m0.003s
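
To compare the two runs above, the `time` output can be converted to seconds and divided. This is only a minimal sketch and not part of the attached test suite; the helper name `to_seconds` is hypothetical, and the values are copied from the two listings above.

```python
import re

def to_seconds(stamp):
    """Convert a shell `time` stamp such as '0m0.043s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Values copied from the JIT and interpreter listings above.
jit = {"real": "0m0.043s", "user": "0m0.027s"}
interp = {"real": "0m0.083s", "user": "0m0.078s"}

for key in ("real", "user"):
    ratio = to_seconds(interp[key]) / to_seconds(jit[key])
    print(f"{key}: interpreter/JIT = {ratio:.2f}")
```

On these numbers the interpreter run takes roughly twice the wall-clock time of the JIT run for a single iteration.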

Even for a single iteration the time taken for execution is pretty high when
compared to gcc, llvm-gcc and clang.
What is the expected behavior when using lli? As per my understanding, since
lli does runtime optimizations it should be faster than clang and llvm-gcc.
Am I right?
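
One way to look into this is to separate the one-time cost of starting the tool from the cost of each iteration, since the run script already controls the iteration count. The following is a minimal sketch, assuming total time grows linearly with the number of iterations; `split_costs` is a hypothetical helper, and the sample numbers are placeholders, not measurements from this test.

```python
def split_costs(n1, t1, n2, t2):
    """Fit t = fixed + per_iter * n through two (iterations, seconds) points.

    Returns (fixed, per_iter): the estimated one-time cost and the
    estimated cost of a single iteration, both in seconds.
    """
    if n1 == n2:
        raise ValueError("need two different iteration counts")
    per_iter = (t2 - t1) / (n2 - n1)
    fixed = t1 - per_iter * n1
    return fixed, per_iter

# Placeholder numbers for illustration only (assumed, not measured).
fixed, per_iter = split_costs(1, 0.05, 101, 0.25)
print(f"fixed cost: {fixed:.3f}s, per iteration: {per_iter:.4f}s")
```

Running each tool at two different iteration counts and comparing the per-iteration figures would show whether the difference comes from startup or from the generated code itself.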

*My machine details are*
Linux localhost.localdomain 2.6.25-14.fc9.i686 #1 SMP Thu May 1 06:28:41 EDT 2008 i686 i686 i386 GNU/Linux
Memory: 1GB DDR2
CPU: Intel Pentium Dual-core @ 2.00 GHz

Please let me know how I can proceed with this test.

Thanks and Regards,
Prasanth J

On Mon, Nov 16, 2009 at 1:06 AM, Eric Christopher <echristo at apple.com> wrote:

>
> On Nov 14, 2009, at 11:52 PM, Prasanth J wrote:
>
> > step 4:
> > running monolith.bc for 10000 iterations using lli tool and measured the time.
>
> How are you doing this?
>
> -eric
-------------- next part --------------
A non-text attachment was scrubbed...
Name: generic_asm.tgz
Type: application/x-gzip
Size: 62726 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20091116/918a9562/attachment.bin>