[llvm-dev] My experience using -DLLVM_BUILD_INSTRUMENTED_COVERAGE to generate coverage
Xinliang David Li via llvm-dev
llvm-dev at lists.llvm.org
Mon Jun 19 21:08:10 PDT 2017
On Mon, Jun 19, 2017 at 7:36 PM, Vedant Kumar <vsk at apple.com> wrote:
>
> On Jun 19, 2017, at 7:29 PM, Vedant Kumar <vsk at apple.com> wrote:
>
>
> On Jun 19, 2017, at 4:32 PM, Friedman, Eli <efriedma at codeaurora.org>
> wrote:
>
> On 6/18/2017 3:51 PM, Vedant Kumar wrote:
>
> My experience:
>
> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
> didn't try). If you link with binutils ld, the program will generate
> broken profile information. Apparently, the linked binary is missing the
> __llvm_prf_names section. This took me half a day to figure out. This
> issue isn't documented anywhere, and the only error message I got was
> "Assertion `!Key.empty()' failed." from llvm-cov.
>
>
> I expect llvm-cov to print out "Failed to load coverage: <reason>" in this
> situation. There was some work done to tighten up error reporting in
> ProfileData and its clients in r270020. If your host toolchain does have
> these changes, please file a bug, and I'll have it fixed.
>
>
> Host toolchain is trunk clang... but using system binutils (which is 2.24
> on my Ubuntu 14.04 system... and apparently that's too old per David Li's
> response). Anyway, filed https://bugs.llvm.org/show_bug.cgi?id=33517 .
>
>
> I've updated the clang docs re: 'Source based code coverage' to reflect
> this issue. I've also tightened up our error reporting a bit so we fail
> earlier with something better than an assertion message (r305765,
> r305767).
>
> 2. The generated binaries are big and slow. Comparing to a build without
> coverage, llc becomes 8x larger overall (text section becomes roughly 2x
> larger). And check-llvm-codegen-arm goes from 3 seconds to 250 seconds.
>
>
> The binary size increase comes from coverage mapping data, counter
> increment instrumentation, and profiling metadata.
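>
> (To see how that overhead is distributed, one can dump the section sizes
> of the instrumented binary; illustrative only:
>
>   readelf -SW llc | grep __llvm
>
> which lists __llvm_covmap alongside the __llvm_prf_* sections.)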
>
> The coverage mapping section is highly compressible, but exploiting the
> compressibility has proven to be tricky. I filed: llvm.org/PR33499.
>
>
> If I'm cross-compiling for a target where the space matters, can I get rid of
> the data for the copy on the device using "strip -R __llvm_covmap" or
> something like that, then use llvm-cov on the original?
>
>
> I haven't tried this but I expect it to work. Instrumented programs don't
> reference the __llvm_covmap section.
>
>
Right. The user can also use objcopy --only-section=__llvm_covmap <in> <out>
to copy the covmap section into a smaller file, and feed that later to the
coverage tool.
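
Concretely, something like the following should work (untested sketch; the
file names are just examples):

  # smaller copy to ship to the device
  strip -R __llvm_covmap llc -o llc-device
  # or: keep only the coverage mapping for the host-side tooling
  objcopy --only-section=__llvm_covmap llc llc-covmap
  # later, on the host, with the profiles collected on the device:
  llvm-profdata merge default*.profraw -o llc.profdata
  llvm-cov report ./llc -instr-profile=llc.profdata   # or point it at llc-covmap
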
David
> Coverage makes use of frontend-based instrumentation, which is much less
> efficient than the IR-based kind. If we can find a way to map counters
> inserted by IR PGO to AST nodes, we could improve the situation. I filed:
> llvm.org/PR33500.
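>
> (For contrast, the two instrumentation modes are selected with different
> driver flags; illustrative only:
>
>   clang -fprofile-instr-generate -fcoverage-mapping foo.c -o foo   # frontend, used by coverage
>   clang -fprofile-generate foo.c -o foo                            # IR-based PGO
>
> and the counters placed by the second form currently have no mapping back
> to source regions.)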
>
>
> This would be nice... but I assume it's hard. :)
>
>
> It seems like it is. At a high level, you'd need some way to associate the
> counters placed by IR PGO instrumentation to the counters that clang
> expects to see while walking an AST. I don't have a concrete design for
> this in mind.
>
> We can reduce testing time by *not* instrumenting basic tools like count,
> not, FileCheck etc. I filed: llvm.org/PR33501.
>
> 3. The generated profile information takes up a lot of space: llc
> generates a 90MB profraw file.
>
>
> I don't have any ideas about how to fix this. You can decrease the space
> overhead for raw profiles by altering LLVM_PROFILE_MERGE_POOL_SIZE from 4
> to a lower value.
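>
> Assuming it is passed through CMake like the other build options, that
> would look something like (a sketch):
>
>   cmake ... -DLLVM_PROFILE_MERGE_POOL_SIZE=1 ...
>
> which, if I read the build scripts right, shrinks the %Nm merge pool in
> the default LLVM_PROFILE_FILE pattern, so fewer raw profile files are
> written per binary.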
>
>
> Disk space is cheap, but the I/O takes a long time. I guess it's
> specifically bad for LLVM's "make check", maybe not so bad for other cases.
>
>
> You can speed up "make check" a bit by using non-instrumented versions of
> count, not, FileCheck, etc.
>
>
> Ah, sorry for mentioning this twice.
>
> On another note, I'm looking into the "N mismatched functions" warnings
> issue, and suspect that it happens when there are conflicting definitions
> of the same function in different binaries. The issue doesn't seem to occur
> when using profiles from just one binary to generate a report for that
> binary. I'll dig into this a bit more and update PR33502.
>
> vedant
>
>
> vedant
>
> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for the
> profiles generated by "make check", it takes 50GB of memory to process
> about 1.5GB of profiles. Is it supposed to use that much?
>
>
> By default, llvm-profdata uses hardware_concurrency() to determine the
> number of threads to use to merge profiles. You can change the default by
> passing -j/--num-threads to llvm-profdata. I'm open to changing the 'prep'
> script to use -j4 or something like that.
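>
> In the meantime you can cap the parallelism by merging manually, e.g.
> (paths are placeholders):
>
>   llvm-profdata merge -j 4 -o merged.profdata profiles/*.profraw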
>
>
> Oh, so it's using a couple gigabytes per thread multiplied by 24 cores?
> Okay, now I'm not so worried. :)
>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project