<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"><br>
      I have started looking at the state of PGO (Profile Guided
      Optimization) in LLVM.</span><b>  </b><span
      style="font-weight:normal;">I want to discuss my high-level plan
      and make sure I'm not leaving anything interesting out.  I
      appreciate any feedback on this, pointers to existing work,
      patches and anything related to PGO in LLVM.<br>
      <br>
      I will be tracking changes to this plan in this web document:<br>
      <br>
    </span><b style="font-weight:normal;"
      id="docs-internal-guid-5ace4200-3a37-a750-9d7a-eef7650d706d"><a
href="https://docs.google.com/document/d/1b2XFuOkR2K-Oao4u5fR3a9Ok83IB_W4EJWVmNak4GRE/pub">https://docs.google.com/document/d/1b2XFuOkR2K-Oao4u5fR3a9Ok83IB_W4EJWVmNak4GRE/pub</a><br>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">At
          a high level, I would like the PGO harness to contain the
          following modules:</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:bold;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Profile
          generators</span></p>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">These
          modules represent sources of profile data.  Most of them work
          by instrumenting the user program so that it emits profile
          information when run.  However, other sources of profile
          information (e.g., samples, hardware counters, static
          predictors) would also be supported.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:bold;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Profile
          Analysis Oracles</span><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span></p>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Profile
          information is loaded into the compiler and translated into
          analysis data which the optimizers can use.  These oracles
          become the one and only source of profile information used by
          transformations.  Direct access to the raw profile data
          generated externally is not allowed.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Translation
          from profile information into analysis can be done by adding
          IR metadata or altering compiler internal data structures
          directly.  I prefer IR metadata because it simplifies
          debugging, unit testing and bug reproduction.</span></p>
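      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">To make the metadata approach concrete, here is a minimal toy sketch in Python.  It is not LLVM code: the Branch class, the annotate and branch_probability functions, and the "branch_weights" key are all hypothetical names chosen for illustration.  The point is only that raw counts are translated once into annotations on the IR, and the analysis reads nothing but those annotations.</span></p>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """Toy stand-in for a conditional branch in the IR (hypothetical)."""
    name: str
    metadata: dict = field(default_factory=dict)


def annotate(branches, raw_counts):
    """Translate raw (taken, not_taken) counts into per-branch metadata."""
    for br in branches:
        if br.name in raw_counts:
            br.metadata["branch_weights"] = tuple(raw_counts[br.name])


def branch_probability(br):
    """Oracle: reads only the metadata, never the raw profile."""
    # Branches without profile data fall back to an even split.
    taken, not_taken = br.metadata.get("branch_weights", (1, 1))
    total = taken + not_taken
    return taken / total if total else 0.5


loop_exit = Branch("loop_exit")
cold = Branch("error_check")
annotate([loop_exit, cold], {"loop_exit": (90, 10)})

print(branch_probability(loop_exit))  # 0.9
print(branch_probability(cold))       # 0.5 (no profile data for this branch)
```
</pre>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Because the annotated IR is self-contained in this model, a test case can be written by hand-setting the metadata, with no profile file involved.</span></p>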
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Analyses
          should be narrow in the specific type of information they
          provide (e.g., branch probability) and there should not be two
          different analyses that provide overlapping information.  We
          could later provide broader analysis types by aggregating the
          existing ones.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:bold;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Transformations</span><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span></p>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;margin-left:
        36pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Transformations
          should naturally take advantage of profile information by
          consulting the analyses.  The better information they get from
          the analysis oracles, the better their decisions.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">My
          plan is to start by making sure that the infrastructure exists
          and provides the basic analyses.</span></p>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">I
          have two primary goals in this first phase:</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <ol style="margin-top:0pt;margin-bottom:0pt;">
        <li dir="ltr"
style="list-style-type:decimal;font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;">
          <p dir="ltr"
            style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Augment
              the PGO infrastructure where required.</span></p>
        </li>
        <li dir="ltr"
style="list-style-type:decimal;font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;">
          <p dir="ltr"
            style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Fix
              existing transformations that are not taking advantage of
              profile data.</span></p>
        </li>
      </ol>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span><br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">In
          evaluating and triaging the existing infrastructure, I will
          use test cases taken from GCC’s own testsuite, a collection of
          Google’s internal applications and any other code base folks
          consider useful.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">In
          using GCC’s testsuite, my goal is not to mimic how GCC does
          its work, but to make sure that the two compilers implement
          functionally equivalent transformations.  That is, make sure
          that LLVM is not leaving optimization opportunities behind.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">This
          may require implementing missing profile functionality.  From a
          brief inspection of the code, most of the major profile kinds
          (edge, path, block) seem to be there, but I don’t know what
          state they are in.</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Some
          of the properties I would like to maintain or add to the
          current framework:</span></p>
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <ul style="margin-top:0pt;margin-bottom:0pt;">
        <li dir="ltr"
style="list-style-type:disc;font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;">
          <p dir="ltr"
            style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Profile
              data is never accessed directly by analyses and
              transformations.  Rather, it is translated into IR
              metadata</span><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:italic;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">.</span></p>
        </li>
        <li dir="ltr"
style="list-style-type:disc;font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;">
          <p dir="ltr"
            style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Graceful
              degradation in the presence of stale profiles.  Old
              profile data should only result in degraded optimization
              opportunities.  It should neither confuse the compiler nor
              cause erroneous code generation.</span></p>
        </li>
      </ul>
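      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">One possible way to get graceful degradation, sketched below in Python, is to store a fingerprint of each function alongside its counts and to ignore the counts when the fingerprint no longer matches.  This is only an assumption about how it could be done, not a description of existing LLVM behavior; cfg_hash, usable_counts and the profile entry layout are hypothetical.</span></p>
<pre>
```python
import hashlib


def cfg_hash(block_names):
    """Fingerprint of a function's shape; here just its ordered block names."""
    return hashlib.sha256("|".join(block_names).encode()).hexdigest()


def usable_counts(function_blocks, profile_entry):
    """Return block counts if the profile matches the current code, else None."""
    if profile_entry is None:
        return None
    if profile_entry["hash"] != cfg_hash(function_blocks):
        # Stale profile: drop it so callers fall back to static heuristics.
        return None
    return profile_entry["counts"]


old_blocks = ["entry", "loop", "exit"]
new_blocks = ["entry", "loop", "cleanup", "exit"]
entry = {
    "hash": cfg_hash(old_blocks),
    "counts": {"entry": 1, "loop": 100, "exit": 1},
}

print(usable_counts(old_blocks, entry))  # {'entry': 1, 'loop': 100, 'exit': 1}
print(usable_counts(new_blocks, entry))  # None
```
</pre>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">With a check like this, a stale entry costs optimization opportunities for that one function but never feeds mismatched counts into the analyses.</span></p>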
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">After
          the basic profile-based transformations are working, I would
          like to add new sources of profile data.  Mainly, I am thinking of
          implementing </span><a href="http://gcc.gnu.org/wiki/AutoFDO"
          style="text-decoration:none;"><span
style="font-size:15px;font-family:Arial;color:#1155cc;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;">Auto
            FDO</span></a><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">.
          FDO stands for Feedback Directed Optimization (PGO and FDO
          tend to be used interchangeably in the GCC community).  In
          this scheme, the compiler does not instrument the code.
           Rather, it uses an external sample collection tool (e.g., </span><a
          href="https://perf.wiki.kernel.org/index.php/Main_Page"
          style="text-decoration:none;"><span
style="font-size:15px;font-family:Arial;color:#1155cc;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;">perf</span></a><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">)
          to collect samples from the program’s execution.  These
          samples are then converted to the format that the instrumented
          program would’ve emitted.</span></p>
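      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">As a rough illustration of the conversion step, the Python sketch below aggregates samples into per-line counts, which is the kind of data an instrumented run would have produced directly.  It assumes each sample has already been mapped back to a (function, source line) pair, which a real tool would have to do separately (for example from debug information).  The function name and data layout are hypothetical.</span></p>
<pre>
```python
from collections import Counter


def samples_to_line_counts(samples):
    """Aggregate (function, line) samples into per-function line counts."""
    counts = {}
    for func, line in samples:
        counts.setdefault(func, Counter())[line] += 1
    return counts


# Each tuple stands for one sample taken while that line was executing.
samples = [("foo", 10), ("foo", 10), ("foo", 12), ("bar", 3)]
counts = samples_to_line_counts(samples)

print(counts["foo"][10])  # 2
print(counts["bar"][3])   # 1
```
</pre>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Unlike instrumentation counters, these counts are statistical: a line that was executed but never sampled simply shows up with a count of zero.</span></p>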
      <br>
      <span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></span>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">In
          terms of optimizations, our (Google) experience is that
          inlining is the key beneficiary of profile information,
          particularly in big C++ applications.  I expect to focus most
          of my attention on the inliner.<br>
        </span></p>
      <p dir="ltr"
        style="line-height:1.15;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:15px;font-family:Arial;color:#000000;background-color:transparent;font-weight:normal;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"><br>
          <br>
        </span></p>
    </b>Thanks.  Diego.<br>
  </body>
</html>