<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 6/18/2017 3:51 PM, Vedant Kumar
wrote:<br class="">
</div>
<blockquote type="cite"
cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="">My experience:<br class="">
<br class="">
1. You have to specify -DLLVM_USE_LINKER=gold (or maybe
lld works; I didn't try). If you link with binutils ld,
the program will generate broken profile information.
Apparently, the linked binary is missing the
__llvm_prf_names section. This took me half a day to
figure out. This issue isn't documented anywhere, and
the only error message I got was "Assertion
`!Key.empty()' failed." from llvm-cov.</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>I expect llvm-cov to print out "Failed to load coverage:
&lt;reason&gt;" in this situation. There was some work done
to tighten up error reporting in ProfileData and its clients
in r270020. If your host toolchain does have these changes,
please file a bug, and I'll have it fixed.</div>
</div>
</div>
</blockquote>
<br>
Host toolchain is trunk clang... but using system binutils (which is
2.24 on my Ubuntu 14.04 system... and apparently that's too old per
David Li's response). Anyway, filed
<a class="moz-txt-link-freetext" href="https://bugs.llvm.org/show_bug.cgi?id=33517">https://bugs.llvm.org/show_bug.cgi?id=33517</a> .<br>
<br>
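For anyone else who hits this, here is a rough sketch of a check for the missing sections before running llvm-cov. It only parses a section listing passed in as text; the helper name and the list of expected sections are my own assumptions (I only know about the two sections named in this thread).<br>
<pre>
```python
import re

# Section names taken from this thread; other toolchains may emit more
# (anything beyond these two would be an assumption).
EXPECTED_SECTIONS = ("__llvm_prf_names", "__llvm_covmap")

# Matches the "[Nr] name" prefix of each row in a `readelf -S -W` listing.
_ROW = re.compile(r"^\s*\[\s*\d+\]\s+(\S+)", re.MULTILINE)


def missing_sections(section_listing, expected=EXPECTED_SECTIONS):
    """Return the expected section names that are absent from the listing."""
    present = set(_ROW.findall(section_listing))
    return [name for name in expected if name not in present]
```
</pre>
The listing can come from something like "readelf -S -W llc"; a non-empty result would suggest the linker dropped coverage data.<br>
<br>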
<blockquote type="cite"
cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">
<div class="">
<div>
<div><br class="">
</div>
<blockquote type="cite" class="">
<div class="">
<div class="">2. The generated binaries are big and slow.
Compared to a build without coverage, llc becomes 8x
larger overall (text section becomes roughly 2x larger).
And check-llvm-codegen-arm goes from 3 seconds to 250
seconds.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>The binary size increase comes from coverage mapping
data, counter increment instrumentation, and profiling
metadata.</div>
<div><br class="">
</div>
<div>The coverage mapping section is highly compressible, but
exploiting the compressibility has proven to be tricky. I
filed: <a href="http://llvm.org/PR33499" class=""
moz-do-not-send="true">llvm.org/PR33499</a>.</div>
</div>
</div>
</blockquote>
<br>
If I'm cross-compiling for a target where the space matters, can I
get rid of the data for the copy on the device using "strip -R
__llvm_covmap" or something like that, then use llvm-cov on the
original?<br>
<br>
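To make the question concrete, this is roughly the step I have in mind, as an untested sketch. The helper name is hypothetical, and whether profiles from the stripped copy still line up with the original binary is exactly what I'm asking.<br>
<pre>
```python
def strip_covmap_command(src, dst, section="__llvm_covmap", strip="strip"):
    """Build the argv for writing a copy of src to dst without `section`."""
    # -R removes the named section; -o writes the result to dst
    # instead of modifying src in place.
    return [strip, "-R", section, "-o", dst, src]
```
</pre>
For a cross build, the "strip" argument would be the target-prefixed tool. As far as I know, plain strip also drops symbols by default, which is probably fine for the device copy. llvm-cov would then be pointed at the unstripped original.<br>
<br>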
<blockquote type="cite"
cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">
<div class="">
<div>
<div>Coverage makes use of frontend-based instrumentation,
which is much less efficient than the IR-based kind. If we
can find a way to map counters inserted by IR PGO to AST
nodes, we could improve the situation. I filed: <a
href="http://llvm.org/PR33500" class=""
moz-do-not-send="true">llvm.org/PR33500</a>.</div>
</div>
</div>
</blockquote>
<br>
This would be nice... but I assume it's hard. :)<br>
<br>
<blockquote type="cite"
cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">
<div class="">
<div>
<div><br class="">
</div>
<div>We can reduce testing time by *not* instrumenting basic
tools like count, not, FileCheck, etc. I filed: <a
href="http://llvm.org/PR33501" class=""
moz-do-not-send="true">llvm.org/PR33501</a>.</div>
<div><br class="">
</div>
<blockquote type="cite" class="">
<div class="">
<div class="">3. The generated profile information takes
up a lot of space: llc generates a 90MB profraw file.<br
class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>I don't have any ideas about how to fix this. You can
decrease the space overhead for raw profiles by altering
LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.</div>
</div>
</div>
</blockquote>
<br>
Disk space is cheap, but the I/O takes a long time. I guess it's
specifically bad for LLVM's "make check", maybe not so bad for other
cases.<br>
<br>
<blockquote type="cite"
cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="">4. When prepare-code-coverage-artifact.py
invokes llvm-profdata for the profiles generated by
"make check", it takes 50GB of memory to process about
1.5GB of profiles. Is it supposed to use that much?<br
class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>By default, llvm-profdata uses hardware_concurrency()
to determine the number of threads to use to merge
profiles. You can change the default by passing
-j/--num-threads to llvm-profdata. I'm open to changing
the 'prep' script to use -j4 or something like that.</div>
<br class="">
</div>
</div>
</blockquote>
<br>
Oh, so it's using a couple gigabytes per thread multiplied by 24
cores? Okay, now I'm not so worried. :)<br>
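<br>
A back-of-the-envelope version of that, plus a sketch of a capped merge invocation. The helper names are made up, and the per-thread figure assumes memory scales linearly with the thread count, which I haven't verified.<br>
<pre>
```python
def merge_command(profraws, output, num_threads=4, tool="llvm-profdata"):
    """Build an llvm-profdata merge argv with an explicit thread cap."""
    return [tool, "merge", "--num-threads={}".format(num_threads),
            "-o", output, *profraws]


def per_thread_gb(total_gb, threads):
    """Rough per-thread memory, assuming usage scales linearly with threads."""
    return total_gb / threads
```
</pre>
With my numbers, per_thread_gb(50, 24) comes out a little over 2 GB, so capping the merge at 4 threads should land somewhere under 10 GB if the assumption holds.<br>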
<p>-Eli<br>
</p>
<pre class="moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>