<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><span class="vcard"><a class="email" href="mailto:bruno.cardoso@gmail.com" title="Bruno Cardoso Lopes <bruno.cardoso@gmail.com>"> <span class="fn">Bruno Cardoso Lopes</span></a>
</span> changed
              <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED INVALID - Missed optimization by O2? LTO generates faster code than O2 with all code visible"
   href="http://llvm.org/bugs/show_bug.cgi?id=21201">bug 21201</a>
        <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">Status</td>
           <td>NEW
           </td>
           <td>RESOLVED
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">CC</td>
           <td>
                
           </td>
           <td>bruno.cardoso@gmail.com
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">Resolution</td>
           <td>---
           </td>
           <td>INVALID
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED INVALID - Missed optimization by O2? LTO generates faster code than O2 with all code visible"
   href="http://llvm.org/bugs/show_bug.cgi?id=21201#c2">Comment # 2</a>
              on <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED INVALID - Missed optimization by O2? LTO generates faster code than O2 with all code visible"
   href="http://llvm.org/bugs/show_bug.cgi?id=21201">bug 21201</a>
              from <span class="vcard"><a class="email" href="mailto:bruno.cardoso@gmail.com" title="Bruno Cardoso Lopes <bruno.cardoso@gmail.com>"> <span class="fn">Bruno Cardoso Lopes</span></a>
</span></b>
        <pre>These are the results reproduced locally:
flags, random, in-order
tot-O2, 722.680343, 358.527025
tot-O2-flto, 586.257256, 113.567324
tot-O3, 763.348347, 340.556150

The only passes that run in LTO but do not run during non-LTO compilation are:

-argpromotion, -globalsmodref-aa, -internalize

The pass mainly responsible for the speedup is indeed -internalize; results below:

== llvm-lto flags, random, in-order
default, 551.342890, 107.292317
-nointernalize, 811.046609, 420.142491
-nointernalize-noglobalsmodref-aa, 779.453545, 412.112992
-nointernalize-noglobalsmodref-aa-noargsprom, 825.279406, 433.142810
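
As a quick check on the table above, the cost of dropping -internalize
can be expressed as a ratio against the default llvm-lto run. This is
only a sketch: it assumes both columns are running times (lower is
better), and the names `timings` and `slowdown` are illustrative, not
part of any tool.

```python
# Timings copied from the llvm-lto table above: (random, in-order).
# Assumption: values are running times, so a larger ratio means slower.
timings = {
    "default": (551.342890, 107.292317),
    "-nointernalize": (811.046609, 420.142491),
}


def slowdown(baseline, other):
    """Return the per-column ratio other / baseline."""
    return tuple(o / b for b, o in zip(baseline, other))


random_ratio, in_order_ratio = slowdown(
    timings["default"], timings["-nointernalize"]
)
print(f"random:   {random_ratio:.2f}x slower without -internalize")
print(f"in-order: {in_order_ratio:.2f}x slower without -internalize")
```

Under that assumption the in-order case is hit much harder than the
random case (roughly 3.9x versus 1.5x).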

Note that removing the other passes yields the same results: the values above
all fall within the -nointernalize standard deviation. With -internalize we're
able to delete 20 functions and inline 43; see the stats:

== llvm-lto
 64 internalize       - Number of functions internalized
 50 internalize       - Number of global vars internalized
150 inline            - Number of caller-callers analyzed
 20 inline            - Number of functions deleted because all callers found
 43 inline            - Number of functions inlined

== llvm-lto without -internalize 
no internalize results
no inline results

That said, there is nothing left to do here, since -internalize only makes
sense during LTO.</pre>
        </div>
      </p>
    </body>
</html>