<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Minor alignment change may cause big performance variation in MultiSource/Benchmarks/Ptrdist/ks"
   href="https://llvm.org/bugs/show_bug.cgi?id=24570">24570</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Minor alignment change may cause big performance variation in MultiSource/Benchmarks/Ptrdist/ks
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>Test Suite
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Programs Tests
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>wmi@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>When I applied the patch from <a href="http://reviews.llvm.org/D12107">http://reviews.llvm.org/D12107</a> on top of
r243652, I saw a 15% performance regression in MultiSource/Benchmarks/Ptrdist/ks.

Analysis with perf showed that the regression was caused by a minor alignment
change. The new alignment prevented a compare-and-branch pair (test/jne) inside
the kernel loop of the test from being macro-fused, which substantially
increased the number of uops-retired events.

With the patch
  4012b0:       48 39 41 08             cmp    %rax,0x8(%rcx)
  4012b4:       75 04                   jne    4012ba &lt;FindMaxGpAndSwap+0xda&gt;
  4012b6:       f3 0f 58 e5             addss  %xmm5,%xmm4
  4012ba:       48 8b 09                mov    (%rcx),%rcx
  4012bd:       48 85 c9                test   %rcx,%rcx
  4012c0:       75 ee                   jne    4012b0 &lt;FindMaxGpAndSwap+0xd0&gt;

Without the patch
  4012a0:       48 39 41 08             cmp    %rax,0x8(%rcx)
  4012a4:       75 04                   jne    4012aa &lt;FindMaxGpAndSwap+0xda&gt;
  4012a6:       f3 0f 58 e5             addss  %xmm5,%xmm4
  4012aa:       48 8b 09                mov    (%rcx),%rcx
  4012ad:       48 85 c9                test   %rcx,%rcx
  4012b0:       75 ee                   jne    4012a0 &lt;FindMaxGpAndSwap+0xd0&gt;

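With the patch, the test/jne pair at the bottom of the kernel loop straddles a
32-byte boundary (test ends at 0x4012bf, jne starts at 0x4012c0), so the two
instructions cannot be macro-fused. Without the patch, both instructions sit
inside the same 32-byte block and fuse as expected.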
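
The boundary check described above can be sketched in a few lines of Python.
This is only an illustration: pair_crosses_boundary is a hypothetical helper,
not part of LLVM or perf, and the 32-byte block size is the value named in this
report rather than a figure taken from a CPU manual. The addresses and
instruction lengths come from the two listings.

```python
# Hypothetical helper for illustration; not part of LLVM or perf.
BOUNDARY = 32  # bytes; the boundary size named in this report (an assumption)


def pair_crosses_boundary(first_addr, first_len, second_len, boundary=BOUNDARY):
    """Return True if two adjacent instructions span more than one aligned block.

    first_addr  -- address of the first instruction (e.g. the test)
    first_len   -- encoded length of the first instruction in bytes
    second_len  -- encoded length of the second instruction (e.g. the jne)
    """
    last_byte = first_addr + first_len + second_len - 1
    return first_addr // boundary != last_byte // boundary


# test (3 bytes) immediately followed by jne (2 bytes)
print(pair_crosses_boundary(0x4012bd, 3, 2))  # with the patch: True
print(pair_crosses_boundary(0x4012ad, 3, 2))  # without the patch: False
```

The same check applied to the cmp/jne pair at the top of the loop (4 bytes
followed by 2 bytes) reports no crossing in either build, which matches the
observation that only the test/jne pair is affected.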

This bug was filed both to track a test with flaky performance and to record a
microarchitecture-dependent performance tuning opportunity.</pre>
        </div>
      </p>
    </body>
</html>