<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 4, 2017 at 12:55 PM, Wei Mi <span dir="ltr"><<a href="mailto:wmi@google.com" target="_blank">wmi@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="gmail-">On Wed, Jan 4, 2017 at 11:51 AM, Evgeny Stupachenko via Phabricator <span dir="ltr"><<a href="mailto:reviews@reviews.llvm.org" target="_blank">reviews@reviews.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">evstupac added a comment.<br>
<br>
Quentin,<br>
<br>
I've put the first part in a separate review: <a href="https://reviews.llvm.org/D28307" rel="noreferrer" target="_blank">https://reviews.llvm.org/D2830<wbr>7</a><br>
<br>
Wei,<br>
<br>
Did you have a chance to test the patch's performance on your benchmarks?<br>
<div class="gmail-m_-891078132269767550HOEnZb"><div class="gmail-m_-891078132269767550h5"><br></div></div></blockquote><div><br></div></span><div>Yes, I ran the patch through our internal benchmarks. Performance is flat overall, except for two regressions. I looked into one and am trying to reduce a testcase from it. The other is probably from the same cause, but I will verify. </div><div><br></div><div>Thanks,</div><div>Wei. </div><span class="gmail-"><div> </div></span></div></div></div></blockquote><div><br></div><div>The two regressions mentioned above have the same cause. </div><div><br></div><div>I have attached a runnable testcase, foo.cc, extracted from an internal benchmark. Compiled with -O2, it shows a 1.5% degradation with the patch on my Sandy Bridge desktop, while the original benchmark shows a 3% degradation on Sandy Bridge and 5% on an Ivy Bridge machine.</div><div><br></div><div>The instruction count for the testcase is actually reduced with the patch, but stalled-cycles-backend increases significantly because the patch uses many more memory accesses with complex (base + scaled index) addressing modes. 
</div><div><br></div><div>Base:<br></div><div><div>.LBB0_12: # Parent Loop BB0_2 Depth=1</div><div> # => This Inner Loop Header: Depth=2</div><div> movsd -24(%rcx), %xmm1 # xmm1 = mem[0],zero</div><div> movsd -16(%rcx), %xmm2 # xmm2 = mem[0],zero</div><div> mulsd -24(%rdx), %xmm1</div><div> addsd %xmm0, %xmm1</div><div> mulsd -16(%rdx), %xmm2</div><div> addsd %xmm1, %xmm2</div><div> movsd -8(%rcx), %xmm1 # xmm1 = mem[0],zero</div><div> mulsd -8(%rdx), %xmm1</div><div> addsd %xmm2, %xmm1</div><div> movsd (%rcx), %xmm0 # xmm0 = mem[0],zero</div><div> mulsd (%rdx), %xmm0</div><div> addsd %xmm1, %xmm0</div><div> addq $32, %rdx</div><div> addq $32, %rcx</div><div> addq $-4, %rdi</div><div> jne .LBB0_12</div></div><div> </div><div>With the patch:</div><div><div>.LBB0_12: # Parent Loop BB0_2 Depth=1</div><div> # => This Inner Loop Header: Depth=2</div><div> movsd -24(%rdi,%rbx,8), %xmm1 # xmm1 = mem[0],zero</div><div> mulsd -24(%rcx,%rbx,8), %xmm1</div><div> addsd %xmm0, %xmm1</div><div> movsd -16(%rdi,%rbx,8), %xmm0 # xmm0 = mem[0],zero</div><div> mulsd -16(%rcx,%rbx,8), %xmm0</div><div> addsd %xmm1, %xmm0</div><div> movsd -8(%rdi,%rbx,8), %xmm1 # xmm1 = mem[0],zero</div><div> mulsd -8(%rcx,%rbx,8), %xmm1</div><div> addsd %xmm0, %xmm1</div><div> movsd (%rdi,%rbx,8), %xmm0 # xmm0 = mem[0],zero</div><div> mulsd (%rcx,%rbx,8), %xmm0</div><div> addsd %xmm1, %xmm0</div><div> addq $4, %rbx</div><div> cmpq %rbx, %rdx</div><div> jne .LBB0_12</div></div><div><br></div><div>Thanks,</div><div>Wei.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="gmail-"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-m_-891078132269767550HOEnZb"><div class="gmail-m_-891078132269767550h5">
<br>
Repository:<br>
rL LLVM<br>
<br>
<a href="https://reviews.llvm.org/D27695" rel="noreferrer" target="_blank">https://reviews.llvm.org/D2769<wbr>5</a><br>
<br>
<br>
<br>
</div></div></blockquote></span></div><br></div></div>
</blockquote></div><br></div></div>