<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 02/24/2015 03:31 PM, Diego Novillo

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr"><br>

        <div>We (Google) have started to look more closely at the

          profiling infrastructure in LLVM. Internally, we have a large

          dependency on PGO to get peak performance in generated code.</div>

        <div><br>

        </div>

        <div>Some of the dependencies we have on profiling are still not

          present in LLVM (e.g., the inliner) but we will still need to

          incorporate changes to support our work on these

          optimizations. Some of the changes may be addressed as

          individual bug fixes on the existing profiling infrastructure.

          Other changes  may be better implemented as either new

          extensions or as replacements of existing code.</div>

        <div><br>

        </div>

        <div>I think we will try to minimize infrastructure replacement

          at least in the short/medium term. After all, it doesn't make

          too much sense to replace infrastructure that is broken for

          code that doesn't exist yet.</div>

        <div><br>

        </div>

        <div>David Li and I are preparing a document where we describe

          the major issues that we'd like to address. The document is a

          bit on the lengthy side, so it may be easier to start with an

          email discussion. </div>

      </div>

    </blockquote>

    I would personally be interested in seeing a copy of that document,
    but it might be more appropriate for a blog post than for a discussion
    on llvm-dev.  I worry that we'd end up with a very unfocused
    discussion.  It might be better to frame this as your plan of attack
    and reserve discussion on llvm-dev for things that are being
    proposed in the semi-near term.  Just my 2 cents.<br>

    <br>

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>This is a summary of the main changes we are looking at:</div>

        <div>

          <ol>

            <li>Need to faithfully represent the execution count taken

              from dynamic profiles. Currently, <font face="monospace,

                monospace">MD_prof</font> does not really represent an

              execution count. This makes things like comparing hotness

              across functions hard or impossible. We need a concept of

              global hotness.<br>

            </li>

          </ol>

        </div>

      </div>

    </blockquote>

    What does MD_prof actually represent when used from Clang?  I know

    I've been using it for execution counters in my frontend.  Am I

    approaching that wrong?<br>

    <br>

    As a side comment: I'm a bit leery of a consistent notion of
    hotness based on counters across functions.  These counters are
    almost always approximate in practice, and counting problems run
    rampant.  I'd almost rather see a consistent count inferred from
    data that's assumed to be questionable than make the frontend try
    to generate consistent profiling metadata.  I think either approach
    could be made to work; we just need to think about it carefully.  <br>
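    <br>
    If the weights are only meaningful relative to each other (an
    assumption on my part, not a statement about what Clang emits), the
    cross-function problem would look something like the sketch below.
    The helper names are made up for illustration.<br>
    <pre wrap="">
```python
def branch_probability(weights, idx):
    """Probability of taking successor `idx`, given relative branch weights."""
    total = sum(weights)
    if total == 0:
        # No profile data at all: fall back to a uniform guess.
        return 1.0 / len(weights)
    return weights[idx] / total


def estimated_count(branch_executions, weights, idx):
    """Estimated executions of successor `idx`.

    This needs an absolute execution count for the branch itself,
    which the relative weights alone do not provide.
    """
    return branch_executions * branch_probability(weights, idx)
```
</pre>
    Scaling every weight by the same factor leaves the probability
    unchanged, so weights of 3:1 and 3000:1000 are indistinguishable.
    Comparing hotness across functions would need the absolute count
    passed in as <font face="monospace, monospace">branch_executions</font>.<br>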

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <ol start="2">

            <li>When the CFG or callgraph changes, there needs to exist an

              API for incrementally updating/scaling counts. For

              instance, when a function is inlined or partially inlined,

              when the CFG is modified, etc. These counts need to be

              updated incrementally (or perhaps re-computed as a first

              step into that direction).</li>

          </ol>

        </div>

      </div>

    </blockquote>

    Agreed.  Do you have a sense of how much of an issue this is in
    practice?  I haven't seen it kick in much, but it's also not
    something I've been looking for.  <br>
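    <br>
    For the inlining case, the kind of update I have in mind is roughly
    the following. This is a sketch under simple assumptions (per-block
    counts, proportional scaling by the call site's share of the
    callee's entry count), not an existing LLVM API; the function name
    is hypothetical.<br>
    <pre wrap="">
```python
def scale_for_inlining(callee_block_counts, callee_entry_count, call_site_count):
    """Split a callee's block counts between an inlined copy and the original.

    Returns (inlined, remaining): counts for the copy placed in the caller
    and counts left on the out-of-line callee.
    """
    # Profiles are approximate, so clamp rather than trust the inputs.
    call_site_count = min(call_site_count, callee_entry_count)
    inlined, remaining = [], []
    for count in callee_block_counts:
        if callee_entry_count == 0:
            moved = 0
        else:
            # Integer scaling: the call site's proportional share of this block.
            moved = count * call_site_count // callee_entry_count
        inlined.append(moved)
        remaining.append(count - moved)
    return inlined, remaining
```
</pre>
    For a callee entered 100 times with block counts 100, 60 and 40,
    inlining a call site executed 25 times moves 25, 15 and 10 into the
    caller and leaves 75, 45 and 30 behind.<br>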

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <ol start="3">

            <li>The inliner (and other optimizations) needs to use

              profile information and update it accordingly. This is

              predicated on Chandler's work on the pass manager, of

              course.<br>

            </li>

          </ol>

        </div>

      </div>

    </blockquote>

    It's worth noting that the inliner work can be done independently of
    the pass manager work.  We can always explicitly recompute the
    relevant analyses in the inliner if needed.  This will cost compile
    time, so we might need to make this an off-by-default option.
    (Maybe -O3 only?)  Being able to work on the inliner independently
    of the pass management structure is valuable enough that we should
    probably consider doing this.<br>

    <br>

    PGO inlining is an area I'm very interested in.  I'd really

    encourage you to work incrementally in tree.  I'm likely to start

    putting non-trivial amounts of time into this topic in the next few

    weeks.  I just need to clear a few things off my plate first.  <br>

    <br>

    Other than the inliner, can you list the passes you think are
    profitable to teach about profiling data?  My list so far is: PRE
    (particularly of loads!), the vectorizer (i.e. duplicate work down
    both a hot and a cold path when it can be vectorized on the hot
    path), LoopUnswitch, IRCE, and LoopUnroll (avoiding code size
    explosion in cold code).  I'm much more interested in sources of
    improved performance than I am in simple code size reduction.
    (Reducing code size can improve performance, of course.)<br>

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <ol start="4">

            <li>Need to represent global profile summary data. For

              example, for global hotness determination, it is useful to

              compute additional global summary info, such as a

              histogram of counts that can be used to determine hotness

              and working set size estimates for a large percentage of

              the profiled execution.</li>

          </ol>

        </div>

      </div>

    </blockquote>

    Er, not clear what you're trying to say here?<br>
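    <br>
    If the idea is to pick a count threshold such that the blocks at or
    above it account for most of the profiled execution, I'd picture
    something like the sketch below. This is only my guess at what is
    meant; the function name and the (threshold, block count) result
    are hypothetical.<br>
    <pre wrap="">
```python
def summarize_hotness(counts, coverage):
    """Return (min_hot_count, hot_blocks) for a target coverage fraction.

    min_hot_count is the smallest count still needed to reach the target;
    hot_blocks is a rough working-set size, measured in blocks.
    """
    if not (coverage > 0.0 and 1.0 >= coverage):
        raise ValueError("coverage must be in (0, 1]")
    total = sum(counts)
    if total == 0:
        return 0, 0
    running = 0
    hot_blocks = 0
    min_hot_count = 0
    # Walk blocks from hottest to coldest until the target share is covered.
    for count in sorted(counts, reverse=True):
        running += count
        hot_blocks += 1
        min_hot_count = count
        if running >= coverage * total:
            break
    return min_hot_count, hot_blocks
```
</pre>
    With counts of 900, 50, 30, 15 and 5, covering 99% of execution
    takes the four hottest blocks, so anything with a count of at least
    15 would be treated as hot.<br>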

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <div>There are other changes that we will need to incorporate.

            David, Teresa, Chandler, please add anything large that I

            missed.</div>

          <div><br>

          </div>

          <div>My main question at the moment is what would be the best

            way of addressing them. Some seem to require new concepts to

            be implemented (e.g., execution counts). Others could be

            addressed as simple bugs to be fixed in the current

            framework.</div>

        </div>

        <div><br>

        </div>

        <div>Would it make sense to present everything in a unified

          document and discuss that? I've got some reservations about

          that approach because we will end up discussing everything at

          once and it may not lead to concrete progress. Another

          approach would be to present each issue individually either as

          patches or RFCs or bugs.</div>

      </div>

    </blockquote>

    See above.  <br>

    <blockquote

cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>I will be taking on the implementation of several of these

          issues. Some of them involve the SamplePGO harness that I

          added last year. I would also like to know what other bugs or

          problems people have in mind that I could also roll into this

          work.</div>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>Thanks. Diego.<br>

        </div>

      </div>


    </blockquote>

    <br>

  </body>

</html>