<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Currently, the full loop unroller shares the same default threshold as the dynamic and partial loop unrollers. This seems conservative because, unlike dynamic/partial unrolling, full unrolling will not affect LSD/ICache performance. In <a href="https://reviews.llvm.org/D28368" class="">https://reviews.llvm.org/D28368</a>, I proposed doubling the threshold for the full loop unroller. This will change the codegen of several SPECCPU benchmarks:<div class=""><br class=""></div><div class=""><p style="margin: 0px 0px 12px; padding: 0px; border: 0px; font-family: 'segoe ui', 'segoe ui web regular', 'segoe ui symbol', lato, 'helvetica neue', helvetica, arial, sans-serif; font-size: 13px;" class="">Code size:<br style="margin-top:0px" class="">447.dealII 0.50%<br class="">453.povray 0.42%<br class="">433.milc 0.20%<br class="">445.gobmk 0.32%<br class="">403.gcc 0.05%<br class="">464.h264ref 3.62%</p><p style="margin: 0px 0px 12px; padding: 0px; border: 0px; font-family: 'segoe ui', 'segoe ui web regular', 'segoe ui symbol', lato, 'helvetica neue', helvetica, arial, sans-serif; font-size: 13px;" class="">Compile Time:<br style="margin-top:0px" class="">447.dealII 0.22%<br class="">453.povray -0.16%<br class="">433.milc 0.09%<br class="">445.gobmk -2.43%<br class="">403.gcc 0.06%<br class="">464.h264ref 3.21%</p><p style="margin: 0px 0px 12px; padding: 0px; border: 0px; font-family: 'segoe ui', 'segoe ui web regular', 'segoe ui symbol', lato, 'helvetica neue', helvetica, arial, sans-serif; 
font-size: 13px;" class="">Performance (on Intel Sandy Bridge):<br style="margin-top:0px" class="">447.dealII +0.07%<br class="">453.povray +1.79%<br class="">433.milc +1.02%<br class="">445.gobmk +0.56%<br class="">403.gcc -0.16%<br class="">464.h264ref -0.41%</p><p style="margin: 0px 0px 12px; padding: 0px; border: 0px; font-family: 'segoe ui', 'segoe ui web regular', 'segoe ui symbol', lato, 'helvetica neue', helvetica, arial, sans-serif; font-size: 13px;" class="">It looks like the change has an overall positive performance impact with very small code size/compile time overhead. Now the question is: shall we make this change the default at O2, or shall we leave it at O3? We would like to have more input from the community to make the decision.</p></div></div></div></blockquote><div>Intuitively (correct me if I am wrong), I would think loop unrolling is a riskier operation: it is good in many/most cases but can be detrimental to performance (by blowing up code size and increasing I-cache pressure). So I would rather put that into -O3.</div><div><br class=""></div><div>- Matthias</div><div><br class=""></div></div></body></html>