<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <div class="moz-cite-prefix">On 04/13/2015 11:20 AM, Xinliang David
      Li wrote:<br>
    </div>
    <blockquote
cite="mid:CALRgJCPdy0vLKD2XS2DpTFQVRhKJsBk_QBW0KZpSrf0GA1qO5Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">Indirect call promotion + inlining does have very
        large performance benefit. IIRC, it improves eon, h264ref,
        povray, gap, ......etc. Many requires cross module inlining too.
        For instance povray improves >20% (if my memory does not
        betray me).  Our internal programs all show large improvement
        with it.</div>
    </blockquote>
    I am not questioning the benefit of profile guided devirtualization
    and inlining.  Getting these optimizations implemented is a clear
    win if done right.  <br>
    <br>
    I am seeking clarification on the approach being taken to implement
    them and the design choices being proposed.  That's a different
    matter.  <br>
    <blockquote
cite="mid:CALRgJCPdy0vLKD2XS2DpTFQVRhKJsBk_QBW0KZpSrf0GA1qO5Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>David</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Mon, Apr 13, 2015 at 11:02 AM,
          Philip Reames <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div class="HOEnZb">
              <div class="h5"><br>
                On 04/13/2015 10:57 AM, <a moz-do-not-send="true"
                  href="mailto:betulb@codeaurora.org" target="_blank">betulb@codeaurora.org</a>
                wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      On Apr 10, 2015, at 9:25 AM, <a
                        moz-do-not-send="true"
                        href="mailto:betulb@codeaurora.org"
                        target="_blank">betulb@codeaurora.org</a> wrote:<br>
                      <br>
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">
                        On 04/09/2015 11:06 AM, Betul Buyukkurt wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          In <a moz-do-not-send="true"
                            href="http://reviews.llvm.org/D8908#153838"
                            target="_blank">http://reviews.llvm.org/D8908#153838</a>,
                          @reames wrote:<br>
                          <br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            Have the IR level construct patches made it
                            up for review?  If so,<br>
                            can<br>
                          </blockquote>
                          So far I've posted two patches. These two
                          patches should apply cleanly<br>
                          to the tip, working with the present profile
                          infrastructure. The next<br>
                          set of patches will be the enabler ones: i.e.
                          three more patches one<br>
                          for<br>
                          each of clang, llvm and compiler-rt. Clang
                          patch will be up for review<br>
                          later today.<br>
                          <br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            you send me a link?  I managed to miss them.<br>
                          </blockquote>
                          So far there is this patch and the instrinsic
                          instruction definitions:<br>
                          <a moz-do-not-send="true"
                            href="http://reviews.llvm.org/D8877"
                            target="_blank">http://reviews.llvm.org/D8877</a>.
                          All patches are necessary for getting<br>
                          the<br>
                          IC targets and having them displayed by the
                          llvm-profdata.<br>
                        </blockquote>
                        Ok, I'm really not convinced that the
                        instrumentation code needs to be<br>
                        or should be an intrinsic.  This seems like
                        something which should be<br>
                        emitted by the frontend and optimized like any
                        other code.  To say this<br>
                        a different way, my instrumentation is going to
                        be entirely different<br>
                        than your instrumentation.<br>
                        <br>
                        Having said that, I really don't care about this
                        part of the proposed<br>
                        changes since they aren't going to impact me at
                        all.  I'm am<br>
                        specifically not objecting to the changes, just
                        commenting.  :)<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            I'm assuming this will be some type of per
                            call site metadata?<br>
                          </blockquote>
                          We do assign metadata at the indirect call
                          sites. Format looks like as<br>
                          follows:<br>
                          <br>
                          !33 = metadata !{metadata
                          !"indirect_call_targets", i64<br>
                          <total_exec_count>, metadata
                          !"target_fn1†, i64<br>
                          <target_fn1_count>,<br>
                          metadata !"target_fn2†, i64
                          <target_fn2_count>, Ã¢Â€Â¦.}<br>
                          <br>
                          Currently, we're recording only the top most
                          called five function<br>
                          names<br>
                          at each indirect call site. Following the
                          string literal<br>
                          Ã¢Â€Âœindirect_call_targets† are the
                          fields  <total_exec_count><br>
                          i.e. a 64<br>
                          bit value for the total number of times the
                          indirect call is executed<br>
                          followed by the function names and execution
                          counts of each target.<br>
                        </blockquote>
                        This was the part I was trying to ask about.  I
                        really want to see<br>
                        where<br>
                        you're going with this optimization wise.  My
                        naive guess is that this<br>
                        is going to be slightly off for what you
                        actually want.<br>
                        <br>
                        Assuming you're going for profile guided
                        devirtualization (and thus<br>
                        inlining), being able to check the type of the
                        receiver (as opposed to<br>
                        the result of the virtual lookup) might be
                        advantageous.  (Or, to say<br>
                        it<br>
                        differently, that's what I'm used to seeing. 
                        Your approach might be<br>
                        completely reasonable, it's just not what I'm
                        used to seeing.)  Have<br>
                        you<br>
                        thought about the tradeoffs here?<br>
                      </blockquote>
                      Not sure if I understood the problem here,
                      however, we're recording both<br>
                      the target address and the addresses/names of the
                      instrumented functions<br>
                      during the execution of the instrumented binary.
                      During profile reading<br>
                      these addresses are used to match the target
                      addresses to corresponding<br>
                      functions. During optimization, target function
                      name should be adequate<br>
                      to<br>
                      retrieve the function type information which then
                      can be used to compare<br>
                      the types of the arguments as well as the return
                      type w/ the call site<br>
                      to<br>
                      ensure the match.<br>
                      <br>
                      To clarify, our interest at this time is to
                      upstream all the<br>
                      infrastructure related changes. The optimizations
                      using the profile data<br>
                      will not be part of our initial set of patches. On
                      the other hand,<br>
                      agreeing on the metadata format at the call sites
                      would be advantageous<br>
                      in<br>
                      terms of supporting multiple profile methods.<br>
                      <br>
                      Thanks,<br>
                      -Betul<br>
                    </blockquote>
                    I have the same questions and concerns as Philip. If
                    the goal of this is<br>
                    to support devirtualization, this is not how I would
                    expect it to be done.<br>
                    If you really want something more general, can you
                    provide data to show<br>
                    that it is worth the cost? Recording all that
                    information when profiling<br>
                    will not be cheap, and it adds significant
                    complexity as well.<br>
                  </blockquote>
                  There was an earlier email of mine that went to the
                  llvm-dev mailing list<br>
                  discussing about the implementation dependencies and
                  gains/costs.<br>
                  <a moz-do-not-send="true"
                    href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html"
                    target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html</a><br>
                </blockquote>
              </div>
            </div>
            On that thread, you claim an 8% performance boost on spec. 
            What optimizations did you implement to get that?  Can you
            outline the structure?
            <div class="HOEnZb">
              <div class="h5"><br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <br>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    I also don’t think we should consider taking all
                    of your infrastructure<br>
                    changes without first seeing at least preliminary
                    versions of the<br>
                    optimization patches that demonstrate the benefits.<br>
                    <br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          Thanks.<br>
                          -Betul<br>
                          <br>
                          <br>
                          <a moz-do-not-send="true"
                            href="http://reviews.llvm.org/D8908"
                            target="_blank">http://reviews.llvm.org/D8908</a>
                          <<a moz-do-not-send="true"
                            href="http://reviews.llvm.org/D8908"
                            target="_blank">http://reviews.llvm.org/D8908</a>><br>
                          <br>
                          EMAIL PREFERENCES<br>
                             <a moz-do-not-send="true"
                            href="http://reviews.llvm.org/settings/panel/emailpreferences/"
                            target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
                          <<a moz-do-not-send="true"
                            href="http://reviews.llvm.org/settings/panel/emailpreferences/"
                            target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a>><br>
                        </blockquote>
                      </blockquote>
                    </blockquote>
                  </blockquote>
                  <br>
                </blockquote>
                <br>
                <br>
              </div>
            </div>
            <br>
            _______________________________________________<br>
            llvm-commits mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
            <a moz-do-not-send="true"
              href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits"
              target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>