<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Hi, Sean:<br>

      <br>

        I'm sorry I lied. I didn't mean to. I did try to avoid
      making a *BIG* change<br>
      to the IPO pass ordering for now. However, when I made a minor
      change to <br>
      populateLTOPassManager() by separating module passes from
      non-module passes, I<br>
      saw quite a few performance differences, most of them
      degradations. Attacking<br>
      these degradations one by one in a piecemeal manner is a waste of
      time. We might as<br>
      well define the pass ordering for the Pre-IPO, IPO and Post-IPO phases
      at this time,<br>
      and hopefully once and for all.<br>

         <br>

       To shed the image of being a liar, I am posting some
      preliminary results on this cozy<br>
      Saturday afternoon, which I normally devote to daydreaming :-) <br>

      <br>

       So far I have only measured the MultiSource benchmarks on my
      iMac (late<br>
      2012 model), and the command to run the benchmarks is <br>
       "make TEST=simple report OPTFLAGS='-O3 -flto'".<br>

      <br>

       In terms of execution time, some benchmarks degrade but more improve,
      and a few of the changes <br>
      are quite substantial. User time is used for comparison. I measured
      the <br>
      results twice, and they are very stable. As far as I can tell
      from the results,<br>
      the proposed pass ordering is a change in the right direction. <br>

      <br>

       Interestingly enough, if I combine populatePreIPOPassMgr() as
      the pre-IPO phase <br>
      (see the patch) with the original populateLTOPassManager() for both
      IPO and post-IPO, <br>
      I see a significant improvement in
      "Benchmarks/Trimaran/netbench-crc/netbench-crc" <br>
      (about 94%, 0.5665s (was) vs 0.0295s). As I write this mail, I
      have not yet had a chance<br>
      to figure out why this combination improves this benchmark so
      much.<br>
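      <br>
       For reference, the improvement figure can be recomputed from the two
      user times quoted above. The following is only a small Python sketch, <br>
      not part of the patch or the test scripts, and the helper name
      improvement() is made up for illustration:<br>
      <pre>
```python
def improvement(before_s, after_s):
    """Return (fractional reduction in run time, speedup factor)."""
    reduction = (before_s - after_s) / before_s
    speedup = before_s / after_s
    return reduction, speedup

# User times for netbench-crc as quoted in this mail.
reduction, speedup = improvement(0.5665, 0.0295)
print(f"{reduction:.1%} less user time, {speedup:.1f}x faster")
# prints "94.8% less user time, 19.2x faster"
```
</pre>
      A run time this close to zero may also mean that most of the work
      was optimized away, which would be worth checking before trusting the
      number.<br>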

      <br>

       In terms of compile time, the report says my change improves
      compile<br>
      time by about 2x, which is nonsense. I guess the test script doesn't
      count <br>
      link time.<br>

      <br>

        The new pass orderings for Pre-IPO, IPO, and Post-IPO are defined by 
      <br>
      populate{PreIPO|IPO|PostIPO}PassMgr().<br>

      <br>

        I will talk with Andy next Monday to stay consistent
      with the <br>
      pass-ordering design he is envisioning, measure more
      benchmarks, and then <br>
      post the patch and results to the community for discussion and
      approval.<br>

      <br>

      Thanks<br>

      Shuxin<br>

      <br>

      <br>

      On 7/17/13 7:09 PM, Shuxin Yang wrote:<br>

    </div>

    <blockquote cite="mid:51E74E72.5000902@gmail.com" type="cite">


      <div class="moz-cite-prefix">Andy and I briefly discussed this 
        the other day; we have not yet had a chance to list a detailed
        pass order <br>
        for the pre- and post-IPO scalar optimizations. <br>

        <br>

        This is the wish list we have in mind:<br>

        <br>

        pre-IPO:  based on the ordering he proposes, get rid of
        inlining (or inline only tiny functions), and get rid of <br>
                       all loop xforms...<br>

        <br>

        post-IPO: get rid of inlining, or, since we may still need it,
        inline only callees that have now become tiny.<br>
                       Enable the loop xforms.<br>

        <br>

                        The SCC pass manager seems to be important for
        inlining. No matter what inlining looks like in the future, <br>
                        I think the pass manager is still useful for
        scalar opt. It lets us achieve cheap inter-procedural <br>
                        opt hands down, in the sense that we can optimize
        a callee, analyze it, and feed detailed<br>
                        info back to the caller (say, "the callee
        always returns constant 5" or "the callee's return value is in 5-10"). <br>
                        Such info is difficult to obtain at the IPO
        stage, as it cannot afford to take such a close look. <br>
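        <br>
        As a rough illustration of that callee-to-caller feedback, here is a
        toy Python sketch. It is not LLVM code: the call-graph encoding, the
        function names and the <br>
        helpers are all invented for this example, and it assumes there is
        no recursion, so every SCC is a single function. It visits callees
        before callers and records the range of each function's return
        value:<br>
        <pre>
```python
# Toy call graph (hypothetical encoding): a function returns a constant,
# a callee's result plus an offset, or the result of one of several callees.
FUNCS = {
    "five":   ("const", 5),
    "six":    ("add", "five", 1),
    "either": ("one_of", ["five", "ten"]),
    "ten":    ("const", 10),
    "main":   ("add", "either", 0),
}

def callees(body):
    """Names of the functions called by a function body."""
    if body[0] == "add":
        return [body[1]]
    if body[0] == "one_of":
        return list(body[1])
    return []

def bottom_up_order(funcs):
    """Post-order DFS: every callee is listed before its callers.
    Assumes no recursion, so each SCC is a single function."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for callee in callees(funcs[name]):
            visit(callee)
        order.append(name)

    for name in funcs:
        visit(name)
    return order

def summarize(funcs):
    """Map each function to the (low, high) range of its return value."""
    summary = {}
    for name in bottom_up_order(funcs):
        body = funcs[name]
        if body[0] == "const":
            summary[name] = (body[1], body[1])
        elif body[0] == "add":
            low, high = summary[body[1]]  # callee summary is already known
            summary[name] = (low + body[2], high + body[2])
        else:  # "one_of": union of the callee ranges
            ranges = [summary[c] for c in body[1]]
            summary[name] = (min(r[0] for r in ranges),
                             max(r[1] for r in ranges))
    return summary

SUMMARY = summarize(FUNCS)
print(SUMMARY["five"], SUMMARY["main"])  # prints "(5, 5) (5, 10)"
```
</pre>
        Here "five" is summarized as always returning 5 and "main" as
        returning a value in 5-10, which is the kind of info a caller could
        then use.<br>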

        <br>

        I think it is too early to discuss the pre-IPO and post-IPO
        details; let us focus on what Andy is proposing. <br>

        <br>

        <br>

        On 7/17/13 6:04 PM, Sean Silva wrote:<br>

      </div>

      <blockquote

cite="mid:CAHnXoamdpOkMjYyegBVyz-5Kcu9zc582-5wK6Yw=Jc=TiyCvxA@mail.gmail.com"

        type="cite">

        <div dir="ltr">There seems to be a lot of interest recently in

          LTO. How do you see the situation of splitting the IR passes

          between per-TU processing and multi-TU ("link time")

          processing?

          <div>

            <div><br>

            </div>

            <div> -- Sean Silva</div>

          </div>

        </div>

        <br>

        <fieldset class="mimeAttachmentHeader"></fieldset>

        <br>

        <pre wrap="">_______________________________________________

LLVM Developers mailing list

<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a>

<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>

</pre>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>