<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 4/24/13 10:56 AM, Chris Lattner

      wrote:<br>

    </div>

    <blockquote

      cite="mid:AE19DACE-53C4-40CD-9FFD-D6433536D080@apple.com"

      type="cite">


      <div>Sorry for chiming in late on this thread.  </div>

      <div><br>

      </div>

      <div>MHO is that this is still better to do on IR than in codegen.

         Of course, I'm also of the opinion that we should take

        Chandler's improvement as well (properly done; it's been long

        enough that I forget the specific objections) because making

        the compiler better shouldn't wait for some theoretical

        improvement that never happens.  More below.</div>

      <div><br>

      </div>

      On Apr 21, 2013, at 1:21 AM, Andrew Trick &lt;<a

        moz-do-not-send="true" href="mailto:atrick@apple.com">atrick@apple.com</a>&gt;

      wrote:

      <div>

        <blockquote type="cite">

          <div style="letter-spacing: normal; orphans: auto; text-align:

            start; text-indent: 0px; text-transform: none; white-space:

            normal; widows: auto; word-spacing: 0px;

            -webkit-text-stroke-width: 0px;">This is different than

            Chandler's case, because we know the MI "early" if-converter

            doesn't currently handle Arnold's optimization. Also, Arnold

            has not yet proposed any target level heuristics that

            attempt to predict cpu behavior, which was the main

            objection.<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>This specific transformation (if converting stores) has two

          goals: 1) improve micro-architectural performance

          characteristics (less branch prediction, etc.), and 2) enable

          mid-level optimizations to remove loads and stores.</div>

        <div><br>

        </div>

        <div>In my mental cost model, eliminating loads and stores is

          always goodness: it is generally good for performance,

          and it unblocks other secondary optimizations.  It is a

          great canonicalization.</div>

        <div><br>

        </div>

        <div>Because this is a canonicalization of this sort, it seems

          clearly good to do on IR, and early.  Doing something like

          this at the codegen level specifically for micro-architectural

          reasons could also make sense, but I don't see that

          eliminating the usefulness of doing it early as well.</div>

      </div>

    </blockquote>

    Introducing a "select" at the IR level does not necessarily mean that

    CodeGen will lower the "select" to a predicated instruction like cmov.<br>

    cmov is not necessarily inexpensive; for example, on Pentium 4, the

    latency of cmov is about 10+ cycles. <br>

    If the compiler blindly converts a well-predicted branch to cmov on

    such a platform, it will only see degradation.  <br>

    <br>

    That said, I think it makes some sense to force if-conversion at the

    IR level if the algorithm relies on straight-line code.  <br>

    <br>
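    To illustrate the distinction at the source level, here is a minimal C
    sketch (the function names are hypothetical, chosen only for this
    example). Both functions compute the same value; the second is roughly
    what an IR-level "select" expresses, and whether it ends up as a cmov
    or as a compare-and-branch is still up to the backend.<br>
    <pre>
```c
/* Branch form: the result is chosen by control flow. */
int pick_branch(int cond, int a, int b) {
    int r = b;
    if (cond)
        r = a;
    return r;
}

/* Select form: the result is chosen by a data dependence on cond.
   This only says "pick one of two values"; it does not promise any
   particular machine instruction. */
int pick_select(int cond, int a, int b) {
    return cond ? a : b;
}
```
    </pre>
    <br>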

    <blockquote

      cite="mid:AE19DACE-53C4-40CD-9FFD-D6433536D080@apple.com"

      type="cite">

      <div>

        <div><br>

        </div>

        <blockquote type="cite">

          <div style="letter-spacing: normal; orphans: auto; text-align:

            start; text-indent: 0px; text-transform: none; white-space:

            normal; widows: auto; word-spacing: 0px;

            -webkit-text-stroke-width: 0px;">That said, we should have a

            reason to if-convert before lowering other than optimizing

            for a machine's cpu pipeline.<br>

            <br>

            Are we all convinced that if-converting a single store is

            the proper canonical form?<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        I am, at least in this specific benchmark's case.  You *can't*

        legally do the if-conversion if you are introducing a memory

        access that otherwise would not have happened.  Doing this can have

        lots of semantic effects.  In order for this to be *legal* at

        all (ignoring profitability) you have to prove that a subsequent

        store is happening to the memory location.</div>

      <div><br>

      </div>

      <div>In this case, the *profitability* comes down to being able to

        obviously, locally, eliminate a load from the address.  It's

        true that GVN/PRE can eliminate part of the load in principle,

        but in practice this doesn't happen.<br>

      </div>

    </blockquote>

    GVN gets rid of all the loads in the cases this if-conversion is

    trying to catch. <br>
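    <br>
    As a rough C-level picture of the kind of pattern under discussion
    (this is a sketch with made-up function names, not code from the patch
    or the benchmark): the location is stored to on every path, so turning
    the conditional store into a select plus one store adds no new memory
    access, and the reload can be replaced by the selected value.<br>
    <pre>
```c
/* Branch form: a conditional store followed by a reload. */
int reload_branch(int *p, int cond, int v) {
    *p = 0;        /* earlier store: *p is written on every path */
    if (cond)
        *p = v;    /* conditional store */
    return *p;     /* reload of the same location */
}

/* If-converted form: select the value, store it once, and reuse it
   instead of reloading. The memory effects and the result are the same
   as in reload_branch. */
int reload_select(int *p, int cond, int v) {
    int t = cond ? v : 0;
    *p = t;
    return t;
}
```
    </pre>
    Either way the value returned is known without touching memory again;
    the question in this thread is which pass should be the one to
    discover that.<br>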

  </body>

</html>