<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hello David, thanks for the detailed response!</p>
    <p>Do you have any tests that you use to measure PGO
      effectiveness? I tested clang version 6.0 with the same
      sample that Jie Chen used in 2016, and both frontend-based
      PGO and IR-based PGO actually make the code run slower. Average times:</p>
    <p>clang++ -O3: 3.15 sec </p>
    <p>clang++ -O3 and -fprofile-instr-use: 3.160 sec<br>
    </p>
    <p>clang++ -O3 and -fprofile-use: 3.180 sec<br>
    </p>
    <p>g++ (7.3.0) -O3: 3.640 sec<br>
    </p>
    <p>g++ (7.3.0) -O3 and -fprofile-use: 2.92 sec</p>
    <p>Do you have any idea what could be wrong? Are there any
      recommendations on when one should use PGO with clang and
      when it is better not to?</p>
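    <p>For comparisons like the ones above, a minimal timing sketch may
      help keep the measurement itself consistent across builds. Both
      <code>run_workload</code> and <code>average_seconds</code> are
      hypothetical names introduced here for illustration; the workload is
      only a stand-in with one strongly biased branch, not the sample from
      the 2016 thread.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in workload (not the sample from the 2016 thread).
// The branch below is taken once every 16 iterations, so it is strongly
// biased, which is the kind of information a profile can expose.
std::uint64_t run_workload(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        if ((i & 15u) == 0) {
            acc += i * 3;  // rarely executed path
        } else {
            acc += i;      // frequently executed path
        }
    }
    return acc;
}

// Calls `work` exactly `runs` times and returns the mean wall-clock
// time per call in seconds, measured with a monotonic clock.
template <typename F>
double average_seconds(F&& work, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int r = 0; r < runs; ++r) {
        const auto start = clock::now();
        work();
        total += std::chrono::duration<double>(clock::now() - start).count();
    }
    return total / runs;
}
```

    <p>When timing a real workload, store its result in a
      <code>volatile</code> variable inside the callable passed to
      <code>average_seconds</code>, so the optimizer cannot discard the
      computation, and use the same input and run count for every build
      being compared.</p>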
    <p>Thanks!<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 02/05/2018 09:38 AM, Xinliang David
      Li wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CALRgJCPgrzj7Z=HDj2k-A+TU63Y3hKnvCfk96RxZ895LKSKyRg@mail.gmail.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Sun, Feb 4, 2018 at 9:59 PM,
            Victor Leschuk <span dir="ltr">&lt;<a
                href="mailto:vleschuk@accesssoftek.com" target="_blank"
                moz-do-not-send="true">vleschuk@accesssoftek.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello
              David!<br>
              <br>
              I have recently started getting acquainted with PGO in
              LLVM/clang and found<br>
              your e-mail thread:<br>
              <a
                href="http://lists.llvm.org/pipermail/llvm-dev/2016-May/099395.html"
                rel="noreferrer" target="_blank" moz-do-not-send="true">http://lists.llvm.org/<wbr>pipermail/llvm-dev/2016-May/<wbr>099395.html</a>
              . Here you<br>
              posted a nice list of optimizations that use profiling and
              of those<br>
              which could use it but don't. However, that thread is
              about 2 years<br>
              old. Could you please let me know if there have been any
              significant<br>
              changes in this area since then?<br>
            </blockquote>
            <div><br>
            </div>
            <div><br>
            </div>
            <div>Yes, there have been quite a few changes since then. Here are
              some of the new features:</div>
            <div><br>
            </div>
            <div>* LLVM IR based PGO -- this is designed to maximize
              program performance. The option to turn it on is
              -fprofile-generate/-fprofile-use</div>
            <div>* value profiling support in PGO -- currently supports
              indirect call target profiling and memcpy/memset size
              profiling and optimizations</div>
            <div>* Profile data is made available for the inliner to use
              (enabled only for the new pass manager:
              -fexperimental-new-pass-manager)</div>
            <div>* Profile aware LICM is available -- implemented via a
              profile driven code sinking pass </div>
            <div>* Partial inlining is made profile aware;  Graham Yu
              also added support for multiple region function outlining
              (with PGO)</div>
            <div>* BB layout heuristics are tuned with PGO</div>
            <div>* hotness driven function layout optimization </div>
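            <div>To illustrate the indirect call target profiling item in
              the list above: a minimal sketch of a call site with one
              dominant target, which is the pattern a value profile can
              record. All names here (<code>Op</code>, <code>add_one</code>,
              <code>times_two</code>, <code>make_ops</code>,
              <code>apply_all</code>) are hypothetical, and whether a given
              compiler version actually promotes the call is an assumption
              to verify, not a guarantee.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: function-pointer type used at the indirect call site.
using Op = std::uint64_t (*)(std::uint64_t);

std::uint64_t add_one(std::uint64_t x) { return x + 1; }    // dominant target
std::uint64_t times_two(std::uint64_t x) { return x * 2; }  // rare target

// Builds a dispatch table in which every 64th slot holds the rare target
// and all other slots hold the dominant one.
std::vector<Op> make_ops(std::size_t n) {
    std::vector<Op> ops(n, add_one);
    for (std::size_t i = 63; i < n; i += 64) {
        ops[i] = times_two;
    }
    return ops;
}

// The call through `op` is the indirect call site. With an instrumented
// build, the targets seen here are what a value profile would count.
std::uint64_t apply_all(const std::vector<Op>& ops, std::uint64_t seed) {
    std::uint64_t v = seed;
    for (Op op : ops) {
        assert(op != nullptr);
        v = op(v) & 0xFFFFFFFFu;  // keep the running value bounded
    }
    return v;
}
```

            <div>One way to try it, assuming a <code>main</code> that calls
              <code>apply_all(make_ops(n), 0)</code> for a large
              <code>n</code>: build with <code>-fprofile-generate</code>,
              run the binary, merge the raw profile with
              <code>llvm-profdata merge -output=demo.profdata</code>, and
              rebuild with <code>-fprofile-use=demo.profdata</code>. The
              file name <code>demo.profdata</code> is only a
              placeholder.</div>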
            <div><br>
            </div>
            <div>There is pending work in the following areas:</div>
            <div>* profile aware loop vectorization, etc</div>
            <div>* control height reduction optimization (Hiroshi is
              working on this)</div>
            <div><br>
            </div>
            <div>ThinLTO also works well with PGO.</div>
            <div><br>
            </div>
            <div>Hope this helps.</div>
            <div><br>
            </div>
            <div>David</div>
            <div><br>
            </div>
            <div>
              <pre style="white-space:pre-wrap;color:rgb(0,0,0);font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">><i> What I can tell you is that there are many missing ones (that can benefit
</i>from profile): such as profile aware LICM (patch pending), speculative PRE,
loop unrolling, loop peeling, auto vectorization, inlining, function
splitting, function layout, function outlining, profile driven size
optimization, induction variable optimization/strength reduction, stringOp
specialization/optimization/inlining, switch peeling/lowering etc. The
biggest profile users today include ralloc, BB layout, ifcvt, shrinkwrapping
etc, but there should be room for improvement there too.</pre>
              <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <br>
              Thanks in advance!<br>
              <span class="HOEnZb"><font color="#888888"><br>
                  --<br>
                  Best Regards,<br>
                  <br>
                  Victor Leschuk | Software Engineer | Access Softek<br>
                  <br>
                </font></span></blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Best Regards,

Victor Leschuk | Software Engineer | Access Softek</pre>
  </body>
</html>