[LLVMdev] On LLD performance

Eric Christopher echristo at gmail.com
Fri Mar 13 12:38:31 PDT 2015


On Fri, Mar 13, 2015 at 12:14 PM Rafael Espíndola <
rafael.espindola at gmail.com> wrote:

> > I will do a run with --merge-strings. This should probably be the
> > default to match other ELF linkers.
>
> Trying --merge-strings with today's trunk I got
>
> * comment got 77 797 bytes smaller.
> * rodata got 9 394 257 bytes smaller.
>
> Comparing with gold, comment now has the same size and rodata is
> 55 021 bytes bigger.
>
> Amusingly, merging strings seems to make lld a bit faster. With
>

As a side note this is a really good thing. The idea is that the linker
should largely be I/O bound and not processor bound.
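
A rough way to see where the runs quoted in this thread stand on that front is to divide task-clock by elapsed wall-clock time, which is what perf reports as "CPUs utilized". The sketch below only re-derives that ratio from the figures in the quoted perf stat output; the dictionary layout and the function name are illustrative assumptions, not part of perf or lld.

```python
# Figures copied from the perf stat runs quoted in this thread.
# The structure and names here are illustrative only.
runs = {
    "lld": {"task_clock_ms": 1985.256427, "elapsed_s": 1.987125976},
    "lld --merge-strings": {"task_clock_ms": 1912.735291, "elapsed_s": 1.914565005},
    "gold": {"task_clock_ms": 1000.132594, "elapsed_s": 1.001044905},
}


def cpu_utilization(task_clock_ms, elapsed_s):
    """Fraction of wall-clock time spent on-CPU (perf's 'CPUs utilized')."""
    return (task_clock_ms / 1000.0) / elapsed_s


for name, run in runs.items():
    util = cpu_utilization(run["task_clock_ms"], run["elapsed_s"])
    print(f"{name:22s} {util:.3f} CPUs utilized")
```

A ratio close to 1.0 on a single thread means the process was almost never waiting on anything, so in these particular runs all three links were processor bound rather than I/O bound.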

-eric


> today's files I got:
>
> lld:
> ---------------------------------------------------------------------------
>
>        1985.256427      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.07% )
>              1,152      context-switches          #    0.580 K/sec
>                  0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
>            199,309      page-faults               #    0.100 M/sec
>      5,970,383,833      cycles                    #    3.007 GHz                      ( +-  0.07% )
>      3,413,740,580      stalled-cycles-frontend   #   57.18% frontend cycles idle     ( +-  0.12% )
>    <not supported>      stalled-cycles-backend
>      6,240,156,987      instructions              #    1.05  insns per cycle
>                                                   #    0.55  stalled cycles per insn  ( +-  0.01% )
>      1,293,186,347      branches                  #  651.395 M/sec                    ( +-  0.01% )
>         26,687,288      branch-misses             #    2.06% of all branches          ( +-  0.00% )
>
>        1.987125976 seconds time elapsed                                               ( +-  0.07% )
> ---------------------------------------------------------------------------
> lld --merge-strings:
>
> ---------------------------------------------------------------------------
>        1912.735291      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.10% )
>              1,152      context-switches          #    0.602 K/sec
>                  0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
>            187,916      page-faults               #    0.098 M/sec                    ( +-  0.00% )
>      5,749,920,058      cycles                    #    3.006 GHz                      ( +-  0.04% )
>      3,250,485,516      stalled-cycles-frontend   #   56.53% frontend cycles idle     ( +-  0.07% )
>    <not supported>      stalled-cycles-backend
>      5,987,870,976      instructions              #    1.04  insns per cycle
>                                                   #    0.54  stalled cycles per insn  ( +-  0.00% )
>      1,250,773,036      branches                  #  653.919 M/sec                    ( +-  0.00% )
>         27,922,489      branch-misses             #    2.23% of all branches          ( +-  0.00% )
>
>        1.914565005 seconds time elapsed                                               ( +-  0.10% )
> ---------------------------------------------------------------------------
>
>
> gold
>
> ---------------------------------------------------------------------------
>        1000.132594      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.01% )
>                  0      context-switches          #    0.000 K/sec
>                  0      cpu-migrations            #    0.000 K/sec
>             77,836      page-faults               #    0.078 M/sec
>      3,002,431,314      cycles                    #    3.002 GHz                      ( +-  0.01% )
>      1,404,393,569      stalled-cycles-frontend   #   46.78% frontend cycles idle     ( +-  0.02% )
>    <not supported>      stalled-cycles-backend
>      4,110,576,101      instructions              #    1.37  insns per cycle
>                                                   #    0.34  stalled cycles per insn  ( +-  0.00% )
>        869,160,761      branches                  #  869.046 M/sec                    ( +-  0.00% )
>         15,691,670      branch-misses             #    1.81% of all branches          ( +-  0.00% )
>
>        1.001044905 seconds time elapsed                                               ( +-  0.01% )
> ---------------------------------------------------------------------------
>
> I have attached the run.sh script I used to collect the numbers.
>
> Cheers,
> Rafael