<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=windows-1252">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 6/18/2017 3:51 PM, Vedant Kumar

      wrote:<br class="">

    </div>

    <blockquote type="cite"

      cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">

      <div class="">

        <div>

          <blockquote type="cite" class="">

            <div class="">

              <div class="">My experience:<br class="">

                <br class="">

                1. You have to specify -DLLVM_USE_LINKER=gold (or maybe

                lld works; I didn't try).  If you link with binutils ld,

                the program will generate broken profile information.

                 Apparently, the linked binary is missing the

                __llvm_prf_names section.  This took me half a day to

                figure out.  This issue isn't documented anywhere, and

                the only error message I got was "Assertion

                `!Key.empty()' failed." from llvm-cov.</div>

            </div>

          </blockquote>

          <div><br class="">

          </div>

          <div>I expect llvm-cov to print out "Failed to load coverage:

            &lt;reason&gt;" in this situation. There was some work done

            to tighten up error reporting in ProfileData and its clients

            in r270020. If your host toolchain does have these changes,

            please file a bug, and I'll have it fixed.</div>

        </div>

      </div>

    </blockquote>

    <br>

    Host toolchain is trunk clang... but using system binutils (which is

    2.24 on my Ubuntu 14.04 system... and apparently that's too old per

    David Li's response).  Anyway, filed

    <a class="moz-txt-link-freetext" href="https://bugs.llvm.org/show_bug.cgi?id=33517">https://bugs.llvm.org/show_bug.cgi?id=33517</a> .<br>

    <br>

    <blockquote type="cite"

      cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">

      <div class="">

        <div>

          <div><br class="">

          </div>

          <blockquote type="cite" class="">

            <div class="">

              <div class="">2. The generated binaries are big and slow.

                 Compared to a build without coverage, llc becomes 8x

                larger overall (text section becomes roughly 2x larger).

                 And check-llvm-codegen-arm goes from 3 seconds to 250

                seconds.<br class="">

              </div>

            </div>

          </blockquote>

          <div><br class="">

          </div>

          <div>The binary size increase comes from coverage mapping

            data, counter increment instrumentation, and profiling

            metadata.</div>

          <div><br class="">

          </div>

          <div>The coverage mapping section is highly compressible, but

            exploiting the compressibility has proven to be tricky. I

            filed: <a href="http://llvm.org/PR33499" class=""

              moz-do-not-send="true">llvm.org/PR33499</a>.</div>

        </div>

      </div>

    </blockquote>

    <br>

    If I'm cross-compiling for a target where the space matters, can I

    get rid of the data for the copy on the device using "strip -R

    __llvm_covmap" or something like that, then use llvm-cov on the

    original?<br>
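    <br>
    To make the question concrete, the workflow would look roughly like
    the Python sketch below. It is untested and assumes that dropping the
    __llvm_covmap section leaves the instrumented code and counters usable,
    which is exactly what is being asked. The helper names strip_command
    and make_device_copy are hypothetical, and strip_tool would need to be
    whatever strip the cross toolchain provides.<br>
    <pre>
```python
import shutil
import subprocess

# Section name taken from the question above; whether removing it is safe
# for the instrumented binary is an assumption, not a verified fact.
COVMAP_SECTION = "__llvm_covmap"


def strip_command(binary, section=COVMAP_SECTION, strip_tool="strip"):
    """Build a strip command line that removes one section from a binary."""
    return [strip_tool, "-R", section, binary]


def make_device_copy(original, device_copy, strip_tool="strip"):
    """Copy the instrumented binary and strip the coverage mapping from the copy.

    The original file is left untouched so it can still be handed to llvm-cov.
    """
    shutil.copy2(original, device_copy)
    subprocess.check_call(strip_command(device_copy, strip_tool=strip_tool))
    return device_copy
```
    </pre>
    The stripped copy would run on the device, and the profile it writes
    would then be merged and passed to llvm-cov together with the unstripped
    original.<br>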

    <br>

    <blockquote type="cite"

      cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">

      <div class="">

        <div>

          <div>Coverage makes use of frontend-based instrumentation,

            which is much less efficient than the IR-based kind. If we

            can find a way to map counters inserted by IR PGO to AST

            nodes, we could improve the situation. I filed: <a

              href="http://llvm.org/PR33500" class=""

              moz-do-not-send="true">llvm.org/PR33500</a>.</div>

        </div>

      </div>

    </blockquote>

    <br>

    This would be nice... but I assume it's hard. :)<br>

    <br>

    <blockquote type="cite"

      cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">

      <div class="">

        <div>

          <div><br class="">

          </div>

          <div>We can reduce testing time by *not* instrumenting basic

            tools like count, not, FileCheck etc. I filed: <a

              href="http://llvm.org/PR33501" class=""

              moz-do-not-send="true">llvm.org/PR33501</a>.</div>

          <div><br class="">

          </div>

          <blockquote type="cite" class="">

            <div class="">

              <div class="">3. The generated profile information takes

                up a lot of space: llc generates a 90MB profraw file.<br

                  class="">

              </div>

            </div>

          </blockquote>

          <div><br class="">

          </div>

          <div>I don't have any ideas about how to fix this. You can

            decrease the space overhead for raw profiles by altering <span

              class="">LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.</span></div>

        </div>

      </div>

    </blockquote>

    <br>

    Disk space is cheap, but the I/O takes a long time.  I guess it's

    specifically bad for LLVM's "make check", maybe not so bad for other

    cases.<br>

    <br>

    <blockquote type="cite"

      cite="mid:622B38E5-4A1F-449E-AFF4-BEC1DE0CF2CA@apple.com">

      <div class="">

        <div>

          <blockquote type="cite" class="">

            <div class="">

              <div class="">4. When prepare-code-coverage-artifact.py

                invokes llvm-profdata for the profiles generated by

                "make check", it takes 50GB of memory to process about

                1.5GB of profiles.  Is it supposed to use that much?<br

                  class="">

              </div>

            </div>

          </blockquote>

          <div><br class="">

          </div>

          <div>By default, llvm-profdata uses <span class="">hardware_concurrency()

              to determine the number of threads to use to merge

              profiles. You can change the default by passing

              -j/--num-threads to llvm-profdata. I'm open to changing

              the 'prep' script to use -j4 or something like that.</span></div>

          <br class="">

        </div>

      </div>

    </blockquote>

    <br>

    Oh, so it's using a couple gigabytes per thread multiplied by 24

    cores?  Okay, now I'm not so worried. :)<br>
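    <br>
    As a rough sanity check on that estimate, here is a minimal Python
    sketch. It assumes peak memory scales linearly with the number of merge
    threads, which is a guess rather than a measurement. The helper names
    estimate_per_thread_gb and merge_command are made up for illustration;
    the only real pieces are "llvm-profdata merge", "--num-threads" and
    "-o".<br>
    <pre>
```python
# Sketch only: assumes memory use is split evenly across merge threads.


def estimate_per_thread_gb(total_gb, threads):
    """Average memory per merge thread, given the observed total."""
    if not threads >= 1:
        raise ValueError("threads must be at least 1")
    return total_gb / threads


def merge_command(profraws, output, threads):
    """Build an llvm-profdata merge command line with a capped thread count."""
    return [
        "llvm-profdata",
        "merge",
        "--num-threads={}".format(threads),
        "-o",
        output,
    ] + list(profraws)
```
    </pre>
    With the figures from this thread, estimate_per_thread_gb(50, 24) comes
    out at a little over 2GB per thread, so capping the merge at 4 threads
    should need something on the order of 8GB if the linear assumption
    holds.<br>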

    <p>-Eli<br>

    </p>

    <pre class="moz-signature" cols="72">-- 

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>

  </body>

</html>