<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <div class="moz-cite-prefix">On 01/12/2015 07:27 PM, Xinliang David
      Li wrote:<br>
    </div>
    <blockquote
cite="mid:CALRgJCO8R2c5iZp6M45HMX_jgwP_MZY2FHQGbL8kkR+SJUxk_A@mail.gmail.com"
      type="cite">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Wed, Jan 7, 2015 at 5:19 PM,
            Philip Reames <span dir="ltr">&lt;<a moz-do-not-send="true"
                href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>
                  <div>The basic idea of this patch is to use profiling
                    information about the frequency of backedges to
                    separate a loop with multiple latches into a loop
                    nest with the rare backedge being the outer loop. We
                    already use a set of static heuristics based on
                    non-varying arguments to PHI nodes to do the same
                    type of thing.
                    <p>The motivation is that doing this pulls rarely
                      executed code out of the innermost loop and tends
                      to greatly simplify analysis and optimization of
                      that loop.</p>
                  </div>
                </div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>This is a good thing to do from the code/block placement
              point of view -- to improve icache utilization. Code
              layout is one of the most important passes that can
              benefit from profile data. <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    We already try to do that using profile information at a later
    point.  I don't know how effective that is and how much room we
    might have to improve, but that really wasn't a motivation for this
    approach.  <br>
    <blockquote
cite="mid:CALRgJCO8R2c5iZp6M45HMX_jgwP_MZY2FHQGbL8kkR+SJUxk_A@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>
                  <div>
                    <p> In particular, it can enable substantial LICM
                      gains when there are clobbering calls down rare
                      paths. </p>
                  </div>
                </div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>How can this enable more LICM? <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
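    Once the rare backedge, along with the clobbering call on its path,
    has been moved to the outer loop, the inner loop is simpler and no
    longer contains the clobber, so we can sometimes perform LICM on
    that inner loop.<br>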
    <br>
    Note that this is equivalent to running a smarter PRE on the
    original loop structure.<br>
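    <br>
    As a rough illustration, here is a hand-written C sketch of the
    source-level effect. It is not output from the patch, and the names
    rare_clobber, sum_before and sum_after are made up for this example.
    The first function is a single loop with two backedges, one of them
    rare; the second is the equivalent loop nest, where the load of
    *scale can be kept in a local across the inner loop because the
    clobbering call now sits only in the outer loop.<br>
    <pre>
```c
/* Hypothetical clobbering call on the rare path: it may write *scale. */
static void rare_clobber(int *scale) { *scale += 1; }

/* Before: one loop with two latches. The "continue" is the rare backedge.
   Because rare_clobber() may write *scale inside the loop, the load of
   *scale cannot be hoisted and is repeated on every hot iteration. */
int sum_before(const int *data, int n, int *scale) {
    int sum = 0;
    for (int i = 0; i != n; ++i) {
        if (data[i] == -1) {        /* rare path */
            rare_clobber(scale);
            continue;               /* rare backedge */
        }
        sum += data[i] * *scale;    /* hot path, common backedge */
    }
    return sum;
}

/* After: a loop nest. The rare backedge becomes the outer loop, so the
   inner loop contains no call and *scale is invariant within it. */
int sum_after(const int *data, int n, int *scale) {
    int sum = 0;
    int i = 0;
    while (i != n) {                /* outer loop: re-entered only via the rare path */
        int s = *scale;             /* loaded once per entry to the inner loop */
        while (i != n) {            /* inner loop: hot path only */
            if (data[i] == -1)
                break;
            sum += data[i] * s;
            ++i;
        }
        if (i != n) {               /* left the inner loop through the rare path */
            rare_clobber(scale);
            ++i;
        }
    }
    return sum;
}
```
    </pre>
    In the second form the hot inner loop has no call and no reload of
    *scale, which is the kind of simplification I have in mind.<br>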
    <blockquote
cite="mid:CALRgJCO8R2c5iZp6M45HMX_jgwP_MZY2FHQGbL8kkR+SJUxk_A@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>  <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>
                  <div>
                    <ul>
                      <li>I chose to implement this without relying on
                        the existing block frequency analysis. My
                        reasoning was that a) this is a rarely taken
                        case and adding an expensive analysis dependency
                        probably wasn't worthwhile and b) that block
                        frequency analysis was more expensive/precise
                        than I really needed. Is this reasonable?</li>
                    </ul>
                  </div>
                </div>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>IMO no. Remember your pass should also work with real
              profile data (instrumentation or sample based) -- you
              should  rely on existing profile infrastructure to provide
              what you need. (Ideally you should treat it as something
              that is always available for query).</div>
          </div>
        </div>
      </div>
    </blockquote>
    There's a distinction here between 'source of information' and
    'source of heuristics'.  By accessing the metadata directly, I'm
    still using whatever profile information the user provided.  <br>
    <br>
    I'm currently exploring other approaches to this problem, but if I
    do end up exploring this further, I'll likely rewrite it to use one of the
    existing analyses.  <br>
    <br>
    Philip<br>
  </body>
</html>