[llvm-dev] gnu X sysv hash performance
Rui Ueyama via llvm-dev
llvm-dev at lists.llvm.org
Fri Dec 1 14:11:52 PST 2017
On Fri, Dec 1, 2017 at 2:07 PM, Brian Cain <brian.cain at gmail.com> wrote:
>
> On Fri, Dec 1, 2017 at 3:55 PM, Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Fri, Dec 1, 2017 at 1:26 PM, Rafael Avila de Espindola <
>> rafael.espindola at gmail.com> wrote:
>>
>>>
>>> I got curious how the lld produced gnu hash tables compared to gold. To
>>> test that I timed "perf record ninja check-llvm" (just the lit run) in a
>>> BUILD_SHARED_LIBS build.
>>>
>>> The performance was almost identical, so I decided to try sysv versus
>>> gnu (both produced by lld). The results are interesting:
>>>
>>> % grep -v '^#' perf-gnu/perf.report-by-dso-sym | head
>>> 38.77% ld-2.24.so [.] do_lookup_x
>>> 8.08% ld-2.24.so [.] strcmp
>>> 2.66% ld-2.24.so [.] _dl_relocate_object
>>> 2.58% ld-2.24.so [.] _dl_lookup_symbol_x
>>> 1.85% ld-2.24.so [.] _dl_name_match_p
>>> 1.46% [kernel.kallsyms] [k] copy_page
>>> 1.38% ld-2.24.so [.] _dl_map_object
>>> 1.30% [kernel.kallsyms] [k] unmap_page_range
>>> 1.28% [kernel.kallsyms] [k] filemap_map_pages
>>> 1.26% libLLVMSupport.so.6.0.0svn [.] sstep
>>> % grep -v '^#' perf-sysv/perf.report-by-dso-sym | head
>>> 42.18% ld-2.24.so [.] do_lookup_x
>>> 17.73% ld-2.24.so [.] check_match
>>> 14.41% ld-2.24.so [.] strcmp
>>> 1.22% ld-2.24.so [.] _dl_relocate_object
>>> 1.13% ld-2.24.so [.] _dl_lookup_symbol_x
>>> 0.91% ld-2.24.so [.] _dl_name_match_p
>>> 0.67% ld-2.24.so [.] _dl_map_object
>>> 0.65% [kernel.kallsyms] [k] unmap_page_range
>>> 0.63% [kernel.kallsyms] [k] copy_page
>>> 0.59% libLLVMSupport.so.6.0.0svn [.] sstep
>>>
>>> So the gnu hash table helps a lot, but BUILD_SHARED_LIBS is still crazy
>>> inefficient.
>>
>>
>> What is "100%" in these numbers? If 100% means all execution time,
>> ld-2.24.so takes more than 70% of execution time. Is this real?
>>
>>
>>
>
> perf usually measures cycles ("CPU_CLK_UNHALTED" for core/xeon, e.g.).
> So it's not time but cycles. This is a critical distinction when the
> thing being measured has delays/synchronization/disk/network I/O.
>
> Also it looks like this report might be decomposed by some other attribute
> (DSO-at-a-time?) that would affect what "100%" means.
>
> Doing perf on "ninja check-llvm" seems like it would measure cycles
> contributed by lots of non-lld things. In fact, it's worth ruling out
> whether it's dominated by non-lld things. Doesn't the testing itself
> perhaps spend more cycles than the linking being done here?
>
Rafael is measuring the performance of the dynamic linker/loader (ld.so),
not of lld itself, to see whether the dynamic symbol tables that lld
generates, and their corresponding .hash or .gnu.hash tables, are
efficient. So that is a valid way of testing it.
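
For anyone following along who has not looked at the two formats, here is a
minimal Python sketch of the two symbol hash functions, plus a deliberately
simplified model of why a .gnu.hash lookup tends to call strcmp less often.
The hash functions are the standard ones (the SysV ELF hash and the
"h * 33 + c" hash starting at 5381 used by .gnu.hash). Everything else is an
assumption made for illustration: count_strcmp and the toy symbol list are
hypothetical, and the model ignores the Bloom filter and the on-disk layout
that a real loader uses.

```python
def sysv_hash(name: str) -> int:
    """Classic SysV ELF hash, as used for .hash sections."""
    h = 0
    for c in name.encode():
        h = ((h << 4) + c) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xFFFFFFFF
    return h


def gnu_hash(name: str) -> int:
    """Hash used for .gnu.hash sections: h = h * 33 + c, starting at 5381."""
    h = 5381
    for c in name.encode():
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


def count_strcmp(symbols, wanted, nbuckets, hash_fn, compare_hash_first):
    """Count string comparisons for one lookup in a toy chained hash table.

    Hypothetical helper, not a model of glibc internals. With
    compare_hash_first=False every symbol in the bucket is string-compared
    (roughly what a .hash chain walk has to do). With
    compare_hash_first=True the stored hash is checked first, as the
    .gnu.hash chain allows, so most mismatches never reach strcmp.
    """
    target = hash_fn(wanted)
    compares = 0
    for name in symbols:
        h = hash_fn(name)
        if h % nbuckets != target % nbuckets:
            continue  # lives in a different bucket, never visited
        if compare_hash_first and h != target:
            continue  # stored hash differs, skip the string compare
        compares += 1
        if name == wanted:
            break
    return compares


# Toy example: force every symbol into one bucket to show the worst case.
symbols = ["a", "b", "c", "exit"]
sysv_style = count_strcmp(symbols, "exit", 1, sysv_hash, False)
gnu_style = count_strcmp(symbols, "exit", 1, gnu_hash, True)
print(hex(sysv_hash("exit")), hex(gnu_hash("exit")), sysv_style, gnu_style)
```

In this toy case the .hash-style walk compares four strings and the
.gnu.hash-style walk compares one, which is at least consistent with
check_match and strcmp being much less prominent in the gnu profile above.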