<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><span class="vcard"><a class="email" href="mailto:cycheng@multicorewareinc.com" title="cycheng <cycheng@multicorewareinc.com>"> <span class="fn">cycheng</span></a>

</span> changed

              <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"

   href="https://llvm.org/bugs/show_bug.cgi?id=25782">bug 25782</a>

        <br>

             <table border="1" cellspacing="0" cellpadding="8">

          <tr>

            <th>What</th>

            <th>Removed</th>

            <th>Added</th>

          </tr>

         <tr>

           <td style="text-align:right;">Status</td>

           <td>ASSIGNED

           </td>

           <td>RESOLVED

           </td>

         </tr>

         <tr>

           <td style="text-align:right;">Resolution</td>

           <td>---

           </td>

           <td>FIXED

           </td>

         </tr></table>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"

   href="https://llvm.org/bugs/show_bug.cgi?id=25782#c2">Comment # 2</a>

              on <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"

   href="https://llvm.org/bugs/show_bug.cgi?id=25782">bug 25782</a>

              from <span class="vcard"><a class="email" href="mailto:cycheng@multicorewareinc.com" title="cycheng <cycheng@multicorewareinc.com>"> <span class="fn">cycheng</span></a>

</span></b>

        <pre>Hi Carrot,

Machine Block Placement now provides a mechanism called "Precise (Loop)

Rotation Cost" that targets this issue. You can enable it with:

-mllvm -force-precise-rotation-cost=true (after r269267)

Performance data is available here:

(base compile option = -m64 -O3 -mcpu=power8)

<a href="http://reviews.llvm.org/D20017#428394">http://reviews.llvm.org/D20017#428394</a>

403.gcc improved significantly.

davidxl plans to enable it by default.

I still see an optimization opportunity, because his current mechanism

skips this pattern:

          entry               

            |                 

------> loop.header (body)    

|97%    /       \             

|      /50%      \50%         

--- latch <--- if.then        

       |

       |3%

   loop.end

I also found that libquantum and h264ref benefit if we rotate the loop top for

this pattern.
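
To make the pattern above concrete, here is a rough sketch of why rotation
can pay off. It is only a toy model, not what MachineBlockPlacement actually
computes: it assumes the edge probabilities from the diagram, treats every
non-fall-through edge as equally costly, and assumes that "rotated" means
placing the latch directly above the loop header. The function name and the
two layouts are hypothetical, chosen for illustration.

```python
# Toy model only: sum the frequency of edges that do NOT fall through to the
# next block in a layout. This is not LLVM's actual rotation cost code.

# Edge probabilities read off the diagram above.
EDGES = {
    ("entry", "loop.header"): 1.0,
    ("loop.header", "latch"): 0.5,
    ("loop.header", "if.then"): 0.5,
    ("if.then", "latch"): 1.0,
    ("latch", "loop.header"): 0.97,
    ("latch", "loop.end"): 0.03,
}

# Block frequencies per function entry: with a 97% back edge, the loop
# header executes 1 / (1 - 0.97) times on average.
HEADER_FREQ = 1.0 / (1.0 - 0.97)
FREQ = {
    "entry": 1.0,
    "loop.header": HEADER_FREQ,
    "if.then": 0.5 * HEADER_FREQ,
    "latch": HEADER_FREQ,
    "loop.end": 1.0,
}


def taken_branch_frequency(layout):
    """Sum the frequency of every edge whose target is not the next block."""
    next_block = dict(zip(layout, layout[1:]))
    total = 0.0
    for (src, dst), prob in EDGES.items():
        if next_block.get(src) != dst:
            total += FREQ[src] * prob
    return total


# Layout with the header on top, and an assumed rotated layout with the
# latch placed above the header so the back edge becomes a fall-through.
original = ["entry", "loop.header", "if.then", "latch", "loop.end"]
rotated = ["entry", "latch", "loop.header", "if.then", "loop.end"]

print(round(taken_branch_frequency(original), 2))
print(round(taken_branch_frequency(rotated), 2))
```

Under these assumptions the original layout takes about 49 non-fall-through
edges per function entry and the rotated one about 35.3, because the hot 97%
back edge becomes a fall-through.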

Anyway, your 403.gcc issue is solved by "-force-precise-rotation-cost".

CY</pre>

        </div>

      </p>


    </body>

</html>