<div dir="ltr">I re-collected the data from trunk head, this time with stddev data and more threshold data points:<div><br></div><div>Performance:</div><div>
<table cellspacing="0" border="0">
<colgroup span="6" width="85"></colgroup>
<tbody><tr>
<td height="17" align="left">Benchmark</td>
<td align="left">stddev/mean</td>
<td align="right">300</td>
<td align="right">450</td>
<td align="right">600</td>
<td align="right">750</td>
</tr>
<tr>
<td height="17" align="right">403</td>
<td align="right">0.37%</td>
<td align="right">0.11%</td>
<td align="right">0.11%</td>
<td align="right">0.09%</td>
<td align="right">0.79%</td>
</tr>
<tr>
<td height="17" align="right">433</td>
<td align="right">0.14%</td>
<td align="right">0.51%</td>
<td align="right">0.25%</td>
<td align="right">-0.63%</td>
<td align="right">-0.29%</td>
</tr>
<tr>
<td height="17" align="right">445</td>
<td align="right">0.08%</td>
<td align="right">0.48%</td>
<td align="right">0.89%</td>
<td align="right">0.12%</td>
<td align="right">0.83%</td>
</tr>
<tr>
<td height="17" align="right">447</td>
<td align="right">0.16%</td>
<td align="right">3.50%</td>
<td align="right">2.69%</td>
<td align="right">3.66%</td>
<td align="right">3.59%</td>
</tr>
<tr>
<td height="17" align="right">453</td>
<td align="right">0.11%</td>
<td align="right">1.49%</td>
<td align="right">0.45%</td>
<td align="right">-0.07%</td>
<td align="right">0.78%</td>
</tr>
<tr>
<td height="17" align="right">464</td>
<td align="right">0.17%</td>
<td align="right">0.75%</td>
<td align="right">1.80%</td>
<td align="right">1.86%</td>
<td align="right">1.54%</td>
</tr>
</tbody></table><br></div><div>Code size:</div><div>
<table cellspacing="0" border="0">
<colgroup span="5" width="85"></colgroup>
<tbody><tr>
<td height="17" align="left">Benchmark</td>
<td align="right">300</td>
<td align="right">450</td>
<td align="right">600</td>
<td align="right">750</td>
</tr>
<tr>
<td height="17" align="right">403</td>
<td align="right">0.56%</td>
<td align="right">2.41%</td>
<td align="right">2.74%</td>
<td align="right">3.75%</td>
</tr>
<tr>
<td height="17" align="right">433</td>
<td align="right">0.96%</td>
<td align="right">2.84%</td>
<td align="right">4.19%</td>
<td align="right">4.87%</td>
</tr>
<tr>
<td height="17" align="right">445</td>
<td align="right">2.16%</td>
<td align="right">3.62%</td>
<td align="right">4.48%</td>
<td align="right">5.88%</td>
</tr>
<tr>
<td height="17" align="right">447</td>
<td align="right">2.96%</td>
<td align="right">5.09%</td>
<td align="right">6.74%</td>
<td align="right">8.89%</td>
</tr>
<tr>
<td height="17" align="right">453</td>
<td align="right">0.94%</td>
<td align="right">1.67%</td>
<td align="right">2.73%</td>
<td align="right">2.96%</td>
</tr>
<tr>
<td height="17" align="right">464</td>
<td align="right">8.02%</td>
<td align="right">13.50%</td>
<td align="right">20.51%</td>
<td align="right">26.59%</td>
</tr>
</tbody></table><br></div><div>Compile time is proportional in the experiments and noisier, so I did not include it.</div><div><br></div><div>We have >2% speedup on some Google internal benchmarks when switching the threshold from 150 to 300.</div><div><br></div><div>Dehao</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 30, 2017 at 5:06 PM, Chandler Carruth <span dir="ltr"><<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span class=""><div dir="ltr">On Mon, Jan 30, 2017 at 4:59 PM Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>> wrote:<br></div></span><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><blockquote type="cite" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><br></div></blockquote></div></div><div style="word-wrap:break-word" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><blockquote type="cite" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><div class="gmail_quote m_-8823773316722963655gmail_msg" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><blockquote class="gmail_quote m_-8823773316722963655gmail_msg" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr" class="m_-8823773316722963655gmail_msg"><div class="gmail_quote 
m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655m_-6293897042106820945h5 m_-8823773316722963655gmail_msg"><blockquote class="gmail_quote m_-8823773316722963655gmail_msg" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="m_-8823773316722963655m_-6293897042106820945m_-4486181801685859403gmail_msg m_-8823773316722963655gmail_msg" style="word-wrap:break-word"><div class="m_-8823773316722963655m_-6293897042106820945m_-4486181801685859403gmail_msg m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655m_-6293897042106820945m_-4486181801685859403gmail_msg m_-8823773316722963655gmail_msg"><br class="m_-8823773316722963655m_-6293897042106820945m_-4486181801685859403gmail_msg m_-8823773316722963655gmail_msg"></div><div class="m_-8823773316722963655m_-6293897042106820945m_-4486181801685859403gmail_msg m_-8823773316722963655gmail_msg">Another question is about PGO integration: is it already hooked there? Should we have a more aggressive threshold in a hot function? (Assuming we’re willing to spend some binary size there but not on the cold path).</div></div></div></blockquote><div class="m_-8823773316722963655gmail_msg"><br class="m_-8823773316722963655gmail_msg"></div></div></div><div class="m_-8823773316722963655gmail_msg">I would even wire the *unrolling* the other way: just suppress unrolling in cold paths to save binary size. 
Rolled loops seem like a generally good thing in cold code unless they are having some larger impact (i.e., the loop itself is more expensive than the unrolled form).</div></div></div></blockquote><div class="m_-8823773316722963655gmail_msg"><br class="m_-8823773316722963655gmail_msg"></div><div class="m_-8823773316722963655gmail_msg"><br class="m_-8823773316722963655gmail_msg"></div><div class="m_-8823773316722963655gmail_msg">Agreed that we could suppress unrolling in cold paths to save code size. But that's orthogonal to the proposal here. This proposal focuses on O2 performance: should we have a different (higher) full-unroll threshold than for dynamic/partial unrolling?</div></div></div></blockquote><div class="m_-8823773316722963655gmail_msg"><br class="m_-8823773316722963655gmail_msg"></div></div></div><div style="word-wrap:break-word" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg">I agree that this is (to some extent) orthogonal, and it makes sense to me to differentiate the thresholds for full unrolling and the dynamic/partial case.</div></div></div></blockquote><div><br></div></span><div>There is one issue that makes these not orthogonal.</div><div><br></div><div>If even *static* profile hints will reduce some of the code size increase caused by higher unrolling thresholds for non-cold code, we should factor that into the tradeoff in picking where the threshold goes.</div><div><br></div><div>However, getting PGO into the full unroller is currently challenging outside of the new pass manager. 
We already have some unfortunate hacks around this in LoopUnswitch that are making the port of it to the new PM more annoying.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><blockquote type="cite" class="m_-8823773316722963655gmail_msg"><div class="m_-8823773316722963655gmail_msg"><div class="gmail_quote m_-8823773316722963655gmail_msg" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><blockquote class="gmail_quote m_-8823773316722963655gmail_msg" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr" class="m_-8823773316722963655gmail_msg"><div class="gmail_quote m_-8823773316722963655gmail_msg"><span class="m_-8823773316722963655gmail_msg"></span></div></div></blockquote></div></div></blockquote></div></div></blockquote></div></div>
</blockquote></div><br></div>
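<div>One way to read the Performance and Code size tables at the top of this message side by side is to average each threshold column and to check which per-benchmark performance changes exceed the measured noise. The following is a minimal Python sketch of that arithmetic and is not part of the original measurements. The numbers are copied from the tables; the helper names (<code>mean_by_threshold</code>, <code>beyond_noise</code>), the use of a plain arithmetic mean, the reading of positive performance values as speedups, and the 2x stddev/mean cutoff are all assumptions chosen for illustration.</div>
<pre>
```python
# Values copied from the Performance and Code size tables in this message.
# Keys are the benchmark IDs from the first column; tuples follow THRESHOLDS.
THRESHOLDS = (300, 450, 600, 750)

# stddev/mean column of the Performance table, in percent.
NOISE = {403: 0.37, 433: 0.14, 445: 0.08, 447: 0.16, 453: 0.11, 464: 0.17}

# Performance change in percent (assumed: positive means faster).
SPEEDUP = {
    403: (0.11, 0.11, 0.09, 0.79),
    433: (0.51, 0.25, -0.63, -0.29),
    445: (0.48, 0.89, 0.12, 0.83),
    447: (3.50, 2.69, 3.66, 3.59),
    453: (1.49, 0.45, -0.07, 0.78),
    464: (0.75, 1.80, 1.86, 1.54),
}

# Code size change in percent.
CODE_SIZE = {
    403: (0.56, 2.41, 2.74, 3.75),
    433: (0.96, 2.84, 4.19, 4.87),
    445: (2.16, 3.62, 4.48, 5.88),
    447: (2.96, 5.09, 6.74, 8.89),
    453: (0.94, 1.67, 2.73, 2.96),
    464: (8.02, 13.50, 20.51, 26.59),
}


def mean_by_threshold(table):
    """Arithmetic mean across benchmarks for each threshold column."""
    return {
        t: sum(row[i] for row in table.values()) / len(table)
        for i, t in enumerate(THRESHOLDS)
    }


def beyond_noise(threshold, factor=2.0):
    """Benchmarks whose change exceeds factor * stddev/mean.

    The factor of 2 is an assumed cutoff, not something the message states.
    """
    i = THRESHOLDS.index(threshold)
    return [b for b, row in SPEEDUP.items() if abs(row[i]) > factor * NOISE[b]]


if __name__ == "__main__":
    perf = mean_by_threshold(SPEEDUP)
    size = mean_by_threshold(CODE_SIZE)
    for t in THRESHOLDS:
        print(f"threshold {t}: perf {perf[t]:+.2f}%, "
              f"size {size[t]:+.2f}%, beyond noise: {beyond_noise(t)}")
```
</pre>
<div>A geometric mean or a different noise cutoff would be equally defensible; the message does not say how the results should be aggregated.</div>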