<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 13, 2015 at 1:51 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000"><span class="">
    <br>
    <div>On 04/13/2015 11:20 AM, Xinliang David
      Li wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">Indirect call promotion + inlining does have very
        large performance benefit. IIRC, it improves eon, h264ref,
        povray, gap, ......etc. Many requires cross module inlining too.
        For instance povray improves >20% (if my memory does not
        betray me).  Our internal programs all show large improvement
        with it.</div>
    </blockquote></span>
    I am not questioning the benefit of profile guided devirtualization
    and inlining.  Getting these optimizations implemented is a clear
    win if done right.  <br>
    <br>
    I am seeking clarification on the approach being taken to implement
    them and the design choices being proposed.  That's a different
    matter.  <br></div></blockquote><div><br></div><div>Sure. However in order to demostrate the full benefit of the value profiling transformations, many others may need to be in place, for instance making inlining aware of profile data, which is currently missing.</div><div><br></div><div>For value profiling, I think it is more important to have metrics on the instrumentation side -- the low memory overhead, low runtime overhead will be obvious ones.</div><div><br></div><div>David</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>David</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Mon, Apr 13, 2015 at 11:02 AM,
          Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div>
              <div><br>
                On 04/13/2015 10:57 AM, <a href="mailto:betulb@codeaurora.org" target="_blank">betulb@codeaurora.org</a>
                wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      On Apr 10, 2015, at 9:25 AM, <a href="mailto:betulb@codeaurora.org" target="_blank">betulb@codeaurora.org</a> wrote:<br>
                      <br>
                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                        On 04/09/2015 11:06 AM, Betul Buyukkurt wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          In <a href="http://reviews.llvm.org/D8908#153838" target="_blank">http://reviews.llvm.org/D8908#153838</a>,
                          @reames wrote:<br>
                          <br>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            Have the IR level construct patches made it
                            up for review?  If so,<br>
                            can<br>
                          </blockquote>
                          So far I've posted two patches. These two
                          patches should apply cleanly<br>
                          to the tip, working with the present profile
                          infrastructure. The next<br>
                          set of patches will be the enabler ones: i.e.
                          three more patches one<br>
                          for<br>
                          each of clang, llvm and compiler-rt. Clang
                          patch will be up for review<br>
                          later today.<br>
                          <br>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            you send me a link?  I managed to miss them.<br>
                          </blockquote>
                          So far there is this patch and the instrinsic
                          instruction definitions:<br>
                          <a href="http://reviews.llvm.org/D8877" target="_blank">http://reviews.llvm.org/D8877</a>.
                          All patches are necessary for getting<br>
                          the<br>
                          IC targets and having them displayed by the
                          llvm-profdata.<br>
                        </blockquote>
                        Ok, I'm really not convinced that the
                        instrumentation code needs to be<br>
                        or should be an intrinsic.  This seems like
                        something which should be<br>
                        emitted by the frontend and optimized like any
                        other code.  To say this<br>
                        a different way, my instrumentation is going to
                        be entirely different<br>
                        than your instrumentation.<br>
                        <br>
                        Having said that, I really don't care about this
                        part of the proposed<br>
                        changes since they aren't going to impact me at
                        all.  I'm am<br>
                        specifically not objecting to the changes, just
                        commenting.  :)<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            I'm assuming this will be some type of per
                            call site metadata?<br>
                          </blockquote>
                          We do assign metadata at the indirect call
                          sites. Format looks like as<br>
                          follows:<br>
                          <br>
                          !33 = metadata !{metadata
                          !"indirect_call_targets", i64<br>
                          <total_exec_count>, metadata
                          !"target_fn1†, i64<br>
                          <target_fn1_count>,<br>
                          metadata !"target_fn2†, i64
                          <target_fn2_count>, Ã¢Â€Â¦.}<br>
                          <br>
                          Currently, we're recording only the top most
                          called five function<br>
                          names<br>
                          at each indirect call site. Following the
                          string literal<br>
                          Ã¢Â€Âœindirect_call_targets† are the
                          fields  <total_exec_count><br>
                          i.e. a 64<br>
                          bit value for the total number of times the
                          indirect call is executed<br>
                          followed by the function names and execution
                          counts of each target.<br>
                        </blockquote>
                        This was the part I was trying to ask about.  I
                        really want to see<br>
                        where<br>
                        you're going with this optimization wise.  My
                        naive guess is that this<br>
                        is going to be slightly off for what you
                        actually want.<br>
                        <br>
                        Assuming you're going for profile guided
                        devirtualization (and thus<br>
                        inlining), being able to check the type of the
                        receiver (as opposed to<br>
                        the result of the virtual lookup) might be
                        advantageous.  (Or, to say<br>
                        it<br>
                        differently, that's what I'm used to seeing. 
                        Your approach might be<br>
                        completely reasonable, it's just not what I'm
                        used to seeing.)  Have<br>
                        you<br>
                        thought about the tradeoffs here?<br>
                      </blockquote>
                      Not sure if I understood the problem here,
                      however, we're recording both<br>
                      the target address and the addresses/names of the
                      instrumented functions<br>
                      during the execution of the instrumented binary.
                      During profile reading<br>
                      these addresses are used to match the target
                      addresses to corresponding<br>
                      functions. During optimization, target function
                      name should be adequate<br>
                      to<br>
                      retrieve the function type information which then
                      can be used to compare<br>
                      the types of the arguments as well as the return
                      type w/ the call site<br>
                      to<br>
                      ensure the match.<br>
                      <br>
                      To clarify, our interest at this time is to
                      upstream all the<br>
                      infrastructure related changes. The optimizations
                      using the profile data<br>
                      will not be part of our initial set of patches. On
                      the other hand,<br>
                      agreeing on the metadata format at the call sites
                      would be advantageous<br>
                      in<br>
                      terms of supporting multiple profile methods.<br>
                      <br>
                      Thanks,<br>
                      -Betul<br>
                    </blockquote>
                    I have the same questions and concerns as Philip. If
                    the goal of this is<br>
                    to support devirtualization, this is not how I would
                    expect it to be done.<br>
                    If you really want something more general, can you
                    provide data to show<br>
                    that it is worth the cost? Recording all that
                    information when profiling<br>
                    will not be cheap, and it adds significant
                    complexity as well.<br>
                  </blockquote>
                  There was an earlier email of mine that went to the
                  llvm-dev mailing list<br>
                  discussing about the implementation dependencies and
                  gains/costs.<br>
                  <a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html</a><br>
                </blockquote>
              </div>
            </div>
            On that thread, you claim an 8% performance boost on spec. 
            What optimizations did you implement to get that?  Can you
            outline the structure?
            <div>
              <div><br>
                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <br>
                  <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    I also don’t think we should consider taking all
                    of your infrastructure<br>
                    changes without first seeing at least preliminary
                    versions of the<br>
                    optimization patches that demonstrate the benefits.<br>
                    <br>
                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          Thanks.<br>
                          -Betul<br>
                          <br>
                          <br>
                          <a href="http://reviews.llvm.org/D8908" target="_blank">http://reviews.llvm.org/D8908</a>
                          <<a href="http://reviews.llvm.org/D8908" target="_blank">http://reviews.llvm.org/D8908</a>><br>
                          <br>
                          EMAIL PREFERENCES<br>
                             <a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
                          <<a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a>><br>
                        </blockquote>
                      </blockquote>
                    </blockquote>
                  </blockquote>
                  <br>
                </blockquote>
                <br>
                <br>
              </div>
            </div>
            <br>
            _______________________________________________<br>
            llvm-commits mailing list<br>
            <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
            <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>