[LLVMdev] On LLD performance
Rafael Espíndola
rafael.espindola at gmail.com
Fri Mar 13 11:59:04 PDT 2015
> I will do a run with --merge-strings. This should probably be the
> default to match other ELF linkers.
Trying --merge-strings with today's trunk I got
* comment got 77 797 bytes smaller.
* rodata got 9 394 257 bytes smaller.
Comparing with gold, comment now has the same size and rodata is
55 021 bytes bigger.
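For readers who have not looked at string merging before, here is a
minimal Python sketch of the general idea: identical NUL-terminated
strings from different input sections are stored once in the output,
and every old offset is remapped to the shared copy. This is only an
illustration, not lld's or gold's implementation; the function name
merge_strings and the data layout are hypothetical.

```python
def merge_strings(sections):
    """Deduplicate NUL-terminated strings across input sections.

    `sections` is a list of bytes objects, each a sequence of
    NUL-terminated strings (hypothetical stand-in for mergeable
    string sections). Returns (merged, offset_map), where offset_map
    maps (section_index, old_offset) to the offset in `merged`.
    """
    merged = bytearray()
    seen = {}        # string bytes (including NUL) -> offset in merged
    offset_map = {}
    for index, data in enumerate(sections):
        pos = 0
        while pos < len(data):
            # index() raises ValueError if a string is not terminated.
            end = data.index(b"\0", pos)
            piece = data[pos:end + 1]
            if piece not in seen:
                seen[piece] = len(merged)
                merged += piece
            offset_map[(index, pos)] = seen[piece]
            pos = end + 1
    return bytes(merged), offset_map


if __name__ == "__main__":
    inputs = [b"foo\0bar\0", b"bar\0baz\0"]
    merged, offsets = merge_strings(inputs)
    print(len(b"".join(inputs)), "bytes in,", len(merged), "bytes out")
```

The size reduction reported above comes from exactly this kind of
sharing: each duplicate string costs nothing in the output beyond the
first copy.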
Amusingly, merging strings seems to make lld a bit faster. With
today's files I got:
lld:
---------------------------------------------------------------------------
       1985.256427      task-clock (msec)         #    0.999 CPUs utilized            ( +- 0.07% )
             1,152      context-switches          #    0.580 K/sec
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
           199,309      page-faults               #    0.100 M/sec
     5,970,383,833      cycles                    #    3.007 GHz                      ( +- 0.07% )
     3,413,740,580      stalled-cycles-frontend   #   57.18% frontend cycles idle     ( +- 0.12% )
   <not supported>      stalled-cycles-backend
     6,240,156,987      instructions              #    1.05  insns per cycle
                                                  #    0.55  stalled cycles per insn  ( +- 0.01% )
     1,293,186,347      branches                  #  651.395 M/sec                    ( +- 0.01% )
        26,687,288      branch-misses             #    2.06% of all branches          ( +- 0.00% )
       1.987125976 seconds time elapsed                                               ( +- 0.07% )
-----------------------------------------------------------------------------------
lld --merge-strings:
------------------------------------------------------------------------------
       1912.735291      task-clock (msec)         #    0.999 CPUs utilized            ( +- 0.10% )
             1,152      context-switches          #    0.602 K/sec
                 0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
           187,916      page-faults               #    0.098 M/sec                    ( +- 0.00% )
     5,749,920,058      cycles                    #    3.006 GHz                      ( +- 0.04% )
     3,250,485,516      stalled-cycles-frontend   #   56.53% frontend cycles idle     ( +- 0.07% )
   <not supported>      stalled-cycles-backend
     5,987,870,976      instructions              #    1.04  insns per cycle
                                                  #    0.54  stalled cycles per insn  ( +- 0.00% )
     1,250,773,036      branches                  #  653.919 M/sec                    ( +- 0.00% )
        27,922,489      branch-misses             #    2.23% of all branches          ( +- 0.00% )
       1.914565005 seconds time elapsed                                               ( +- 0.10% )
----------------------------------------------------------------------------
gold:
-------------------------------------------------------------------------------
       1000.132594      task-clock (msec)         #    0.999 CPUs utilized            ( +- 0.01% )
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
            77,836      page-faults               #    0.078 M/sec
     3,002,431,314      cycles                    #    3.002 GHz                      ( +- 0.01% )
     1,404,393,569      stalled-cycles-frontend   #   46.78% frontend cycles idle     ( +- 0.02% )
   <not supported>      stalled-cycles-backend
     4,110,576,101      instructions              #    1.37  insns per cycle
                                                  #    0.34  stalled cycles per insn  ( +- 0.00% )
       869,160,761      branches                  #  869.046 M/sec                    ( +- 0.00% )
        15,691,670      branch-misses             #    1.81% of all branches          ( +- 0.00% )
       1.001044905 seconds time elapsed                                               ( +- 0.01% )
-------------------------------------------------------------------------------
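To put the three runs side by side, the elapsed times and instruction
counts reported above can be compared with a few lines of arithmetic.
The Python below is only a convenience sketch: the numbers are copied
from the perf output in this message, while the dictionary layout and
helper names are my own and hypothetical.

```python
# Figures copied from the perf stat output above.
runs = {
    "lld": {"seconds": 1.987125976, "instructions": 6_240_156_987},
    "lld --merge-strings": {"seconds": 1.914565005, "instructions": 5_987_870_976},
    "gold": {"seconds": 1.001044905, "instructions": 4_110_576_101},
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def ratio(candidate, baseline):
    """How many times larger `candidate` is than `baseline`."""
    return candidate / baseline


if __name__ == "__main__":
    plain = runs["lld"]
    merged = runs["lld --merge-strings"]
    gold = runs["gold"]
    for key in ("seconds", "instructions"):
        print(f"{key}: merge-strings vs plain lld: "
              f"{percent_change(plain[key], merged[key]):+.2f}%")
        print(f"{key}: lld --merge-strings / gold: "
              f"{ratio(merged[key], gold[key]):.2f}x")
```

With these inputs, --merge-strings cuts lld's wall time by between 3%
and 4%, and lld still takes a little over 1.9 times as long as gold.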
I have attached the run.sh script I used to collect the numbers.
Cheers,
Rafael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.sh
Type: application/x-sh
Size: 5653 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150313/db3d7301/attachment.sh>