[LLVMdev] On LLD performance
Rafael Espíndola
rafael.espindola at gmail.com
Thu Mar 19 10:03:03 PDT 2015
> The biggest difference that shows up is that lld has 1,152 context
> switches, but the cpu utilization is still < 1. Maybe there is just a
> threading bug somewhere?
It was a bug, but in my script: I was pinning the process to a single
core, a leftover from benchmarking single-threaded code. Sorry about
that.
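For anyone who wants to guard a benchmark against the same mistake, here is a
minimal sketch (not part of the attached script) that checks whether the
current process is restricted to fewer CPUs than the machine has online. The
helper name `is_pinned` is made up for illustration, and
`os.sched_getaffinity` is only available on Linux.

```python
import os


def is_pinned(allowed_cpus, online_cpus):
    """Return True when the process may run on fewer CPUs than are online."""
    return len(allowed_cpus) < online_cpus


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # 0 means "the calling process"; the result is the set of CPU ids
    # the scheduler is allowed to place this process on.
    allowed = os.sched_getaffinity(0)
    online = os.cpu_count() or len(allowed)
    if is_pinned(allowed, online):
        print(f"warning: restricted to {len(allowed)} of {online} CPUs")
    else:
        print(f"ok: all {online} CPUs are available")
```

Running this from inside the benchmark wrapper (so that it inherits the same
affinity mask as the linker) should flag a leftover single-core restriction
before any numbers are collected.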
With that fixed (new script attached), what I got when linking clang
with today's build of clang, lld and gold (so not directly
comparable to the old numbers) was:
lld:
      2544.422504  task-clock (msec)       # 1.837 CPUs utilized           ( +- 0.13% )
           69,815  context-switches        # 0.027 M/sec                   ( +- 0.10% )
              515  cpu-migrations          # 0.203 K/sec                   ( +- 2.38% )
          188,589  page-faults             # 0.074 M/sec                   ( +- 0.00% )
    7,518,975,892  cycles                  # 2.955 GHz                     ( +- 0.12% )
    4,848,133,388  stalled-cycles-frontend # 64.48% frontend cycles idle   ( +- 0.18% )
  <not supported>  stalled-cycles-backend
    6,383,065,952  instructions            # 0.85 insns per cycle
                                           # 0.76 stalled cycles per insn  ( +- 0.06% )
    1,331,202,027  branches                # 523.184 M/sec                 ( +- 0.07% )
       30,053,442  branch-misses           # 2.26% of all branches         ( +- 0.04% )
      1.385020712  seconds time elapsed                                    ( +- 0.15% )
gold:
       918.859273  task-clock (msec)       # 0.999 CPUs utilized           ( +- 0.01% )
                0  context-switches        # 0.000 K/sec                   ( +- 41.02% )
                0  cpu-migrations          # 0.000 K/sec
           76,393  page-faults             # 0.083 M/sec
    2,758,439,113  cycles                  # 3.002 GHz                     ( +- 0.01% )
    1,357,937,367  stalled-cycles-frontend # 49.23% frontend cycles idle   ( +- 0.02% )
  <not supported>  stalled-cycles-backend
    3,881,565,068  instructions            # 1.41 insns per cycle
                                           # 0.35 stalled cycles per insn  ( +- 0.00% )
      784,078,474  branches                # 853.317 M/sec                 ( +- 0.00% )
       13,984,077  branch-misses           # 1.78% of all branches         ( +- 0.01% )
      0.919717669  seconds time elapsed                                    ( +- 0.01% )
gold --threads:
      1300.210314  task-clock (msec)       # 1.523 CPUs utilized           ( +- 0.15% )
           26,913  context-switches        # 0.021 M/sec                   ( +- 0.34% )
            7,910  cpu-migrations          # 0.006 M/sec                   ( +- 0.63% )
           83,507  page-faults             # 0.064 M/sec                   ( +- 0.02% )
    3,842,183,459  cycles                  # 2.955 GHz                     ( +- 0.14% )
    2,273,634,375  stalled-cycles-frontend # 59.18% frontend cycles idle   ( +- 0.20% )
  <not supported>  stalled-cycles-backend
    4,058,907,107  instructions            # 1.06 insns per cycle
                                           # 0.56 stalled cycles per insn  ( +- 0.07% )
      838,791,061  branches                # 645.120 M/sec                 ( +- 0.06% )
       14,220,563  branch-misses           # 1.70% of all branches         ( +- 0.02% )
      0.853835099  seconds time elapsed                                    ( +- 0.25% )
So much better!
The machine has 12 cores, each with 2 hardware threads.
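As a sanity check on the derived columns, the "CPUs utilized" and "insns per
cycle" figures can be recomputed from the raw counters: the first is
task-clock divided by elapsed wall time, the second is instructions divided by
cycles. The sketch below is only illustrative; the dictionary layout and
function names are my own, and the input values are copied from the perf
output in this mail.

```python
# Raw counters copied from the perf stat output for each linker.
RUNS = {
    "lld": {
        "task_clock_ms": 2544.422504,
        "elapsed_s": 1.385020712,
        "cycles": 7_518_975_892,
        "instructions": 6_383_065_952,
    },
    "gold": {
        "task_clock_ms": 918.859273,
        "elapsed_s": 0.919717669,
        "cycles": 2_758_439_113,
        "instructions": 3_881_565_068,
    },
    "gold --threads": {
        "task_clock_ms": 1300.210314,
        "elapsed_s": 0.853835099,
        "cycles": 3_842_183_459,
        "instructions": 4_058_907_107,
    },
}


def cpus_utilized(task_clock_ms, elapsed_s):
    """CPU time consumed per unit of wall time (both converted to ms)."""
    return task_clock_ms / (elapsed_s * 1000.0)


def insns_per_cycle(instructions, cycles):
    """Average number of instructions retired per cycle."""
    return instructions / cycles


if __name__ == "__main__":
    for name, run in RUNS.items():
        util = cpus_utilized(run["task_clock_ms"], run["elapsed_s"])
        ipc = insns_per_cycle(run["instructions"], run["cycles"])
        print(f"{name:15s} CPUs utilized: {util:.3f}  IPC: {ipc:.2f}  "
              f"elapsed: {run['elapsed_s']:.3f}s")
```

Laid out this way, it is easier to see that lld uses the most CPU time of the
three (about 2.5 s of task-clock) while finishing last in wall time, and that
gold --threads has the shortest elapsed time.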
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.sh
Type: application/x-sh
Size: 5352 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150319/5f2c7cae/attachment.sh>