<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><span class="vcard"><a class="email" href="mailto:cycheng@multicorewareinc.com" title="cycheng <cycheng@multicorewareinc.com>"> <span class="fn">cycheng</span></a>
</span> changed
              <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"
   href="https://llvm.org/bugs/show_bug.cgi?id=25782">bug 25782</a>
        <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">Status</td>
           <td>ASSIGNED
           </td>
           <td>RESOLVED
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">Resolution</td>
           <td>---
           </td>
           <td>FIXED
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"
   href="https://llvm.org/bugs/show_bug.cgi?id=25782#c2">Comment # 2</a>
              on <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - [ppc] bad code layout causes slower than gcc in 403.gcc"
   href="https://llvm.org/bugs/show_bug.cgi?id=25782">bug 25782</a>
              from <span class="vcard"><a class="email" href="mailto:cycheng@multicorewareinc.com" title="cycheng <cycheng@multicorewareinc.com>"> <span class="fn">cycheng</span></a>
</span></b>
        <pre>Hi Carrot,

Machine Block Placement now provides a mechanism called "Precise (Loop)
Rotation Cost" that is aimed at this issue. You can enable it (after r269267)
with:
-mllvm -force-precise-rotation-cost=true

Performance data (base compile options: -m64 -O3 -mcpu=power8) is available
here:
<a href="http://reviews.llvm.org/D20017#428394">http://reviews.llvm.org/D20017#428394</a>

403.gcc improved a lot.
davidxl plans to enable it by default.

I still see an optimization opportunity, because his current mechanism skips
this pattern:

          entry               
            |                 
------> loop.header (body)    
|97%    /       \             
|      /50%      \50%         
--- latch <--- if.then        
       |
       |3%
   loop.end

I also found that libquantum and h264ref can benefit if we rotate the loop top
for this pattern.
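
To show why rotation could help here, below is a rough sketch in Python. It is
not LLVM code and not the actual cost model in Machine Block Placement; it only
adds up the frequency of edges that become fall-throughs in a given layout.
The edge frequencies are read off the diagram with loop.header normalized to
100, the entry frequency of 3 is my assumption so that flow balances, and the
names EDGE_FREQ and fallthrough_freq are made up for illustration.

```python
# Edge frequencies taken from the diagram (loop.header frequency = 100).
# The entry edge value is an assumption chosen to balance the 3% exit edge.
EDGE_FREQ = {
    ("entry", "loop.header"): 3,
    ("loop.header", "latch"): 50,
    ("loop.header", "if.then"): 50,
    ("if.then", "latch"): 50,
    ("latch", "loop.header"): 97,
    ("latch", "loop.end"): 3,
}


def fallthrough_freq(layout):
    """Sum the frequency of CFG edges that connect adjacent blocks in layout."""
    return sum(EDGE_FREQ.get(pair, 0) for pair in zip(layout, layout[1:]))


# Layout with the loop header on top.
original = ["entry", "loop.header", "if.then", "latch", "loop.end"]
# Layout with the latch rotated to the loop top.
rotated = ["entry", "latch", "loop.header", "if.then", "loop.end"]

print("original:", fallthrough_freq(original))  # prints "original: 106"
print("rotated: ", fallthrough_freq(rotated))   # prints "rotated:  147"
```

Under these assumptions the rotated layout turns the hot 97% back edge into a
fall-through, so more of the executed edges need no taken branch.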

Anyway, your 403.gcc issue is solved by "-force-precise-rotation-cost".

CY</pre>
        </div>
      </p>
    </body>
</html>