[LLVMdev] question about -coverage

Kostya Serebryany kcc at google.com
Fri Oct 4 01:40:23 PDT 2013


Another question is about the performance of coverage's at-exit actions
(dumping the coverage data to disk).
I've built chromium's base_unittests with -fprofile-arcs -ftest-coverage
and the coverage's at-exit hook takes 22 seconds,
which is 44x more than I am willing to pay.
Most of the time is spent here:
#0  0x00007ffff3b034cd in msync () at ../sysdeps/unix/syscall-template.S:82
#1  0x0000000003a8c818 in llvm_gcda_end_file ()
#2  0x0000000003a8c914 in llvm_writeout_files ()
#3  0x00007ffff2f5e901 in __run_exit_handlers
The test depends on ~700 source files, so the profiling library calls
msync ~700 times.
Full chromium depends on ~12000 source files, so we'll be dumping the
coverage data for 5 minutes this way.
I understand that we have to support the lcov/gcov format (broken in many
ways), and that may be the reason it is slow.
But I really need something much faster (and maybe simpler).
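
To illustrate the cost difference, here is a rough sketch (illustrative
only; dump_per_file/dump_single_stream and the flat counter layout are
made up, not the actual profiling runtime):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

// What the backtrace suggests: each translation unit's .gcda file is
// mmap'ed and flushed with its own msync() -- ~700 syscall round-trips.
static void dump_per_file(const char *path, const unsigned long long *ctrs,
                          size_t n) {
  size_t bytes = n * sizeof(ctrs[0]);
  int fd = open(path, O_RDWR | O_CREAT, 0644);
  if (fd < 0) return;
  if (ftruncate(fd, bytes) == 0) {
    void *p = mmap(0, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED) {
      memcpy(p, ctrs, bytes);
      msync(p, bytes, MS_SYNC);   // the expensive part, paid once per file
      munmap(p, bytes);
    }
  }
  close(fd);
}

// The "much faster (and maybe simpler)" direction: append every module's
// counters to one buffered stream and flush once at exit.
static void dump_single_stream(FILE *out, const unsigned long long *ctrs,
                               size_t n) {
  fwrite(ctrs, sizeof(ctrs[0]), n, out);  // buffered, no per-file sync
}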

Is anyone planning any work on coverage in the coming months?
If not, we'll probably cook up something simple and gcov-independent.
Thoughts?

--kcc


On Thu, Oct 3, 2013 at 6:47 PM, Kostya Serebryany <kcc at google.com> wrote:

> Hello,
>
> I have a few questions about coverage.
>
> Is there any user-facing documentation for clang's "-coverage" flag?
> The coverage instrumentation seems to happen before asan, so if asan is
> also enabled, asan will instrument accesses to @__llvm_gcov_ctr.
> This is undesirable, so we'd like to skip these accesses.
> It looks like GEPs around @__llvm_gcov_ctr have special metadata attached:
>   %2 = getelementptr inbounds [4 x i64]* @__llvm_gcov_ctr, i64 0, i64 %1
>   %3 = load i64* %2, align 8
>   %4 = add i64 %3, 1
>   store i64 %4, i64* %2, align 8
>   ...
> !1 = metadata !{...; [ DW_TAG_compile_unit ] ... /home/kcc/tmp/cond.cc]
> [DW_LANG_C_plus_plus]
>
> Can we rely on having this metadata attached to @__llvm_gcov_ctr?
> Should we attach some metadata to the actual accesses as well, or simply
> find the corresponding GEP?
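>
> (What I have in mind is roughly this -- a sketch against the C++ API,
> assuming the counter global keeps the __llvm_gcov_ctr name prefix;
> isGcovCounterAccess is a made-up helper, not existing asan code:)
>
> #include "llvm/IR/GlobalVariable.h"
> #include "llvm/IR/Operator.h"
> #include "llvm/Support/Casting.h"
>
> // Return true if Addr is (a GEP into) a gcov counter array, so that an
> // instrumentation pass could skip the access.
> static bool isGcovCounterAccess(llvm::Value *Addr) {
>   if (llvm::GEPOperator *GEP = llvm::dyn_cast<llvm::GEPOperator>(Addr))
>     Addr = GEP->getPointerOperand();
>   if (llvm::GlobalVariable *GV = llvm::dyn_cast<llvm::GlobalVariable>(Addr))
>     return GV->getName().startswith("__llvm_gcov_ctr");
>   return false;
> }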
>
> Finally, does anyone have performance numbers for coverage?
> As of today it seems completely thread-hostile since __llvm_gcov_ctr is
> not thread-local.
> A simple stress test shows that coverage slows the program down by ~50x!
> % cat ~/tmp/coverage_mt.cc
> #include <pthread.h>
> __thread int x;
> __attribute__((noinline))
> void foo() {
>   x++;
> }
>
> void *Thread(void *) {
>   for (int i = 0; i < 100000000; i++)
>     foo();
>   return 0;
> }
>
> int main() {
>   static const int kNumThreads = 16;
>   pthread_t t[kNumThreads];
>   for (int i = 0; i < kNumThreads; i++)
>     pthread_create(&t[i], 0, Thread, 0);
>   for (int i = 0; i < kNumThreads; i++)
>     pthread_join(t[i], 0);
>   return 0;
> }
>
> % clang -O2 ~/tmp/coverage_mt.cc -lpthread  ; time ./a.out
> TIME: real: 0.284; user: 3.560; system: 0.000
> % clang -O2 ~/tmp/coverage_mt.cc -lpthread -coverage  ; time ./a.out
> TIME: real: 13.327; user: 174.510; system: 0.000
>
> Are there any objections in principle to making __llvm_gcov_ctr
> thread-local, perhaps under a flag?
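>
> (Concretely, something along these lines -- a hand-written sketch of the
> idea, not actual GCOVProfiling output; the per-thread copies would still
> have to be merged into the global array at thread/process exit, which I
> omit here. __llvm_gcov_ctr_tls and count_arc are made-up names:)
>
> // Today: one process-wide counter array, so every arc increment from
> // every thread bounces the same cache lines.
> unsigned long long __llvm_gcov_ctr[4];
>
> // Under a flag: per-thread copies, merged at exit.
> __thread unsigned long long __llvm_gcov_ctr_tls[4];
>
> static void count_arc(unsigned i) { __llvm_gcov_ctr_tls[i]++; }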
>
> If anyone is curious, my intent is to enable running coverage and asan in
> one process.
>
> Thanks,
> --kcc
>