[LLVMdev] On LLD performance

Rafael Espíndola rafael.espindola at gmail.com
Fri Mar 13 11:59:04 PDT 2015


> I will do a run with --merge-strings. This should probably be the
> default to match other ELF linkers.

Trying --merge-strings with today's trunk, I got:

* comment got 77 797 bytes smaller.
* rodata got 9 394 257 bytes smaller.

Comparing with gold, comment now has the same size and rodata is
55 021 bytes bigger.
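
As a rough illustration of why these sections shrink: ELF linkers can merge identical strings in sections flagged SHF_MERGE | SHF_STRINGS. The sketch below is not lld's or gold's code. It assumes the simplest form of merging, deduplicating whole NUL-terminated strings, and the function name `merge_strings` and the sample data are hypothetical.

```python
def merge_strings(section):
    """Deduplicate NUL-terminated strings in a bytes object.

    Returns the merged bytes and a map from each string's old offset
    to its offset in the merged output. Assumes the input ends with
    a NUL byte (bytes.index raises ValueError otherwise).
    """
    merged = bytearray()
    seen = {}        # string bytes (with terminator) -> offset in merged
    offset_map = {}  # old offset -> new offset
    pos = 0
    while pos < len(section):
        end = section.index(b"\0", pos) + 1  # include the terminator
        s = section[pos:end]
        if s not in seen:
            seen[s] = len(merged)
            merged += s
        offset_map[pos] = seen[s]
        pos = end
    return bytes(merged), offset_map


# Hypothetical input: "hello" appears twice and is stored once after merging.
data = b"hello\0world\0hello\0"
merged, offsets = merge_strings(data)
saved = len(data) - len(merged)
```

The offset map stands in for the bookkeeping a linker needs so that references to a duplicate string can be redirected to the single surviving copy.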

Amusingly, merging strings seems to make lld a bit faster. With
today's files I got:

lld:
---------------------------------------------------------------------------

       1985.256427      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.07% )
             1,152      context-switches          #    0.580 K/sec
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
           199,309      page-faults               #    0.100 M/sec
     5,970,383,833      cycles                    #    3.007 GHz                      ( +-  0.07% )
     3,413,740,580      stalled-cycles-frontend   #   57.18% frontend cycles idle     ( +-  0.12% )
   <not supported>      stalled-cycles-backend
     6,240,156,987      instructions              #    1.05  insns per cycle
                                                  #    0.55  stalled cycles per insn  ( +-  0.01% )
     1,293,186,347      branches                  #  651.395 M/sec                    ( +-  0.01% )
        26,687,288      branch-misses             #    2.06% of all branches          ( +-  0.00% )

       1.987125976 seconds time elapsed                                               ( +-  0.07% )
-----------------------------------------------------------------------------------
lld --merge-strings:

------------------------------------------------------------------------------
       1912.735291      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.10% )
             1,152      context-switches          #    0.602 K/sec
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
           187,916      page-faults               #    0.098 M/sec                    ( +-  0.00% )
     5,749,920,058      cycles                    #    3.006 GHz                      ( +-  0.04% )
     3,250,485,516      stalled-cycles-frontend   #   56.53% frontend cycles idle     ( +-  0.07% )
   <not supported>      stalled-cycles-backend
     5,987,870,976      instructions              #    1.04  insns per cycle
                                                  #    0.54  stalled cycles per insn  ( +-  0.00% )
     1,250,773,036      branches                  #  653.919 M/sec                    ( +-  0.00% )
        27,922,489      branch-misses             #    2.23% of all branches          ( +-  0.00% )

       1.914565005 seconds time elapsed                                               ( +-  0.10% )
----------------------------------------------------------------------------


gold

-------------------------------------------------------------------------------
       1000.132594      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.01% )
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
            77,836      page-faults               #    0.078 M/sec
     3,002,431,314      cycles                    #    3.002 GHz                      ( +-  0.01% )
     1,404,393,569      stalled-cycles-frontend   #   46.78% frontend cycles idle     ( +-  0.02% )
   <not supported>      stalled-cycles-backend
     4,110,576,101      instructions              #    1.37  insns per cycle
                                                  #    0.34  stalled cycles per insn  ( +-  0.00% )
       869,160,761      branches                  #  869.046 M/sec                    ( +-  0.00% )
        15,691,670      branch-misses             #    1.81% of all branches          ( +-  0.00% )

       1.001044905 seconds time elapsed                                               ( +-  0.01% )
-------------------------------------------------------------------------------
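
To put the three runs side by side, the elapsed times reported above can be compared with a little arithmetic. This is only a sketch: the numbers are copied from the perf output in this message, and the helper names `percent_faster` and `slowdown` are made up for illustration.

```python
# Elapsed wall-clock seconds, copied from the perf stat output above.
elapsed = {
    "lld": 1.987125976,
    "lld --merge-strings": 1.914565005,
    "gold": 1.001044905,
}


def percent_faster(before, after):
    """Percentage reduction in elapsed time going from `before` to `after`."""
    return (before - after) / before * 100.0


def slowdown(t, baseline):
    """How many times longer `t` is than `baseline`."""
    return t / baseline


merge_gain = percent_faster(elapsed["lld"], elapsed["lld --merge-strings"])
lld_vs_gold = slowdown(elapsed["lld"], elapsed["gold"])
lld_merge_vs_gold = slowdown(elapsed["lld --merge-strings"], elapsed["gold"])

print(f"--merge-strings makes lld {merge_gain:.2f}% faster")
print(f"lld takes {lld_vs_gold:.2f}x as long as gold")
print(f"lld --merge-strings takes {lld_merge_vs_gold:.2f}x as long as gold")
```

With these inputs, --merge-strings saves a bit under 4% of lld's link time, and lld still takes a little under twice as long as gold with or without it.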

I have attached the run.sh script I used to collect the numbers.

Cheers,
Rafael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.sh
Type: application/x-sh
Size: 5653 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150313/db3d7301/attachment.sh>

