<div dir="ltr">Thanks all for the comment. Any other comments on how we should proceed with this?<div><br></div><div>Thanks,</div><div>Dehao</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 22, 2017 at 9:57 AM, Dehao Chen <span dir="ltr"><<a href="mailto:dehao@google.com" target="_blank">dehao@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Fri, May 19, 2017 at 4:01 PM, Adam Nemet <span dir="ltr"><<a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div><div class="m_-1810158140467190907h5"><blockquote type="cite"><div>On May 18, 2017, at 3:30 PM, Dehao Chen via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="m_-1810158140467190907m_-8954505038834062460Apple-interchange-newline"><div><div dir="ltr">Hi,<div><br></div><div>I'm proposing to make vectorizer-maximize-bandw<wbr>idth on by default for loop vectorizer because it should generally help performance.</div><div><br></div><div>I've tested the performance impact on Intel sandybridge machine with speccpu benchmarks:</div><div><br></div><div><div>           Benchmark             Base:Reference   (1)  </div><div>------------------------------<wbr>-------------------------</div><div>spec/2006/fp/C++/444.namd                 26.84  -0.31%</div><div>spec/2006/fp/C++/447.dealII               46.19  +0.89%</div><div>spec/2006/fp/C++/450.soplex               42.92  -0.44%</div><div>spec/2006/fp/C++/453.povray               38.57  -2.25%</div><div>spec/2006/fp/C/433.milc                   24.54  -0.76%</div><div>spec/2006/fp/C/470.lbm                    41.08  
+0.26%</div><div>spec/2006/fp/C/482.sphinx3                47.58  -0.99%</div><div>spec/2006/int/C++/471.omnetpp             22.06  +1.87%</div><div>spec/2006/int/C++/473.astar               22.65  -0.12%</div><div>spec/2006/int/C++/483.xalancbm<wbr>k           33.69  +4.97%</div><div>spec/2006/int/C/400.perlbench             33.43  +1.70%</div><div>spec/2006/int/C/401.bzip2                 23.02  -0.19%</div><div>spec/2006/int/C/403.gcc                   32.57  -0.43%</div><div>spec/2006/int/C/429.mcf                   40.35  +0.27%</div><div>spec/2006/int/C/445.gobmk                 26.96  +0.06%</div><div>spec/2006/int/C/456.hmmer                  24.4  +0.19%</div><div>spec/2006/int/C/458.sjeng                 27.91  -0.08%</div><div>spec/2006/int/C/462.libquantum            57.47  -0.20%</div><div>spec/2006/int/C/464.h264ref               46.52  +1.35%</div><div><br></div><div>geometric mean                                   +0.29%</div><div><br></div><div>  Scores are benchmark specific.</div></div><div><br></div><div>We do have regression on 453.povray, but it's due to secondary effects as all hot functions are the same. I've also tested the code size impact, it does not change for tested speccpu benchmarks.</div></div></div></blockquote><div><br></div></div></div><div>Can you please describe the config for the runs (optimization level, PGO/no-PGO, etc).</div></div></div></blockquote><div><br></div></div></div><div>This is O2 build without PGO.</div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><br></div><div>It would be good to provide analysis for the changes >1%. I.e. we need to make sure that the improvements are not noise either ;).</div></div></div></blockquote><div><br></div></span><div>Good point. I just examined all benchmarks with >1% "improvement". 
It turns out they are all noise: the hot functions (those with >1% of total cycles) are all identical. So the conclusion is that this change does not affect SPEC CPU2006 performance.</div><div><br></div><div>Thanks,</div><div>Dehao</div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><div><br></div><blockquote type="cite"><div><div dir="ltr"><div><br></div><div>I've prepared <a href="https://reviews.llvm.org/D33341" rel="noreferrer" style="font-size:12.8px" target="_blank">https://reviews.llvm.<wbr>org/D33341</a> to do this.</div><div><br></div><div>I would really appreciate it if the community could help test the performance impact of this change on other architectures so that we can decide whether this should be target-dependent.</div></div></div></blockquote><div><br></div></span><div>I will run it on Cyclone/AArch64 next week.</div><span class="m_-1810158140467190907HOEnZb"><font color="#888888"><div><br></div><div>Adam</div></font></span><span><br><blockquote type="cite"><div><div dir="ltr"><div><br></div><div>Any comments/concerns?</div><div><br></div><div>Thanks,</div><div>Dehao</div></div>

______________________________<wbr>_________________<br>LLVM Developers mailing list<br><a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br></div></blockquote></span></div><br></div></blockquote></span></div><br></div></div>

</blockquote></div><br></div>