<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">The attachment to this PR: <a href="https://llvm.org/bugs/show_bug.cgi?id=5615" class="">https://llvm.org/bugs/show_bug.cgi?id=5615</a><div class="">explains why this could happen.</div><div class=""><br class=""></div><div class="">(This could well be a different case here, but it may be related, or hint toward a similar type of problem).</div><div class=""><br class=""></div><div class="">HTH.</div><div class=""><br class=""></div><div class="">-- </div><div class="">Mehdi</div><div class=""><br class=""></div><div class=""><div><blockquote type="cite" class=""><div class="">On Oct 21, 2016, at 7:30 AM, Rafael Espíndola via llvm-commits &lt;<a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">This is sufficiently crazy that I decided to create an easy reproducer.<br class=""><br class="">I uploaded it to <a href="https://drive.google.com/open?id=0B7iRtublysV6WmZPZzh5LUpSZUU" class="">https://drive.google.com/open?id=0B7iRtublysV6WmZPZzh5LUpSZUU</a><br class=""><br class="">I also tested it on an i7-3840QM where the problem reproduces exactly<br class="">and on an AMD Opteron(tm) Processor 6380 where the two binaries have<br class="">exactly the same performance.<br class=""><br class="">Craig, all that I was able to find about branch prediction alias<br class="">problems was a suggestion in the Intel optimization manual to align<br class="">branch targets, but it looks like that is not the problem here. 
Any idea<br class="">if there is anything that can be done to avoid this problem?<br class=""><br class="">Thanks,<br class="">Rafael<br class=""><br class=""><br class="">On 20 October 2016 at 17:09, Rafael Espíndola<br class="">&lt;<a href="mailto:rafael.espindola@gmail.com" class="">rafael.espindola@gmail.com</a>&gt; wrote:<br class=""><blockquote type="cite" class="">I spent most of the day reducing an oddity I noticed while<br class="">benchmarking a small patch.<br class=""><br class="">It turns out that just reordering two adjacent functions can have a<br class="">massive impact on performance. The two binaries are in<br class=""><br class=""><a href="https://drive.google.com/open?id=0B7iRtublysV6VW5VVW1na2N1RGM" class="">https://drive.google.com/open?id=0B7iRtublysV6VW5VVW1na2N1RGM</a><br class=""><br class="">https://drive.google.com/open?id=0B7iRtublysV6MUJoeGVCRHpXVUU<br class=""><br class="">And the total diff of the objdump is attached.<br class=""><br class="">When linking xul with one of the binaries I get<br class=""><br class="">98,298,725      branch-misses             #    2.24% of all branches<br class="">7.206486289 seconds time elapsed<br class=""><br class="">With the other I get<br class=""><br class="">139,849,372      branch-misses             #    3.18% of all branches<br class="">7.645573494 seconds time elapsed<br class=""><br class="">Adding enough padding before the function gets the performance back,<br class="">which suggests an aliasing problem in the branch predictor.<br class=""><br class="">The CPU is an E5-2697 (Ivy Bridge). 
Is anyone familiar with its branch<br class="">predictor and how to avoid hitting these problems?<br class=""><br class="">Cheers,<br class="">Rafael<br class=""></blockquote>_______________________________________________<br class="">llvm-commits mailing list<br class=""><a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a><br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits<br class=""></div></div></blockquote></div><br class=""></div></body></html>