<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - False dependency in x86 popcnt instruction unknown to llvm causes slow code"
href="https://bugs.llvm.org/show_bug.cgi?id=34936">34936</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>False dependency in x86 popcnt instruction unknown to llvm causes slow code
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>justin.lebar@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>clang/LLVM at head seems to be affected by the bug described here:
<a href="https://stackoverflow.com/questions/25078285/replacing-a-32-bit-loop-count-variable-with-64-bit-introduces-crazy-performance/25089720#25089720">https://stackoverflow.com/questions/25078285/replacing-a-32-bit-loop-count-variable-with-64-bit-introduces-crazy-performance/25089720#25089720</a>
The etiology established in the SO post is that in the hardware, "popcnt dst,
src" has a false dependency on dst. If the compiler isn't aware of this, it
makes bad decisions during register assignment.
$ curl <a href="https://gist.githubusercontent.com/anonymous/31cb15567b89f461534fcb97957b5369/raw/ec4705c992f355258c292da5ba21ca0c7abaa119/-">https://gist.githubusercontent.com/anonymous/31cb15567b89f461534fcb97957b5369/raw/ec4705c992f355258c292da5ba21ca0c7abaa119/-</a> | clang++ -O3 -march=haswell --std=c++11 -x c++ - -o test && ./test 1
On a Haswell machine, I get
unsigned 41959360000 0.592057 sec 17.7107 GB/s
uint64_t 41959360000 0.823331 sec 12.7358 GB/s
which exhibits the bug: the run takes about 39% longer in the case where the
loop induction variable is uint64_t.
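
For readers who can't fetch the gist, here is a rough sketch of the two loop
shapes being compared. It is an assumption modeled on the SO question, not a
copy of the gist: the function names are made up, unsigned long long stands in
for uint64_t, and __builtin_popcountll stands in for the popcnt intrinsic
(with -march=haswell it should lower to a popcnt instruction). Whether this
exact shape shows the slowdown depends on the compiler's register choices.

```cpp
// Hypothetical sketch; names are illustrative and not taken from the gist.
using u64 = unsigned long long;  // stands in for uint64_t

// Variant 1: 32-bit induction variable (the faster row in the timings above).
u64 popcount_sum_u32_index(const u64* buf, unsigned n) {
    u64 count = 0;
    const unsigned blocks = n / 4;  // unrolled by 4; tail elements are ignored
    for (unsigned b = 0; b != blocks; ++b) {
        const u64* p = buf + 4 * b;
        count += __builtin_popcountll(p[0]);
        count += __builtin_popcountll(p[1]);
        count += __builtin_popcountll(p[2]);
        count += __builtin_popcountll(p[3]);
    }
    return count;
}

// Variant 2: identical work, but with a 64-bit induction variable
// (the slower row in the timings above).
u64 popcount_sum_u64_index(const u64* buf, u64 n) {
    u64 count = 0;
    const u64 blocks = n / 4;  // unrolled by 4; tail elements are ignored
    for (u64 b = 0; b != blocks; ++b) {
        const u64* p = buf + 4 * b;
        count += __builtin_popcountll(p[0]);
        count += __builtin_popcountll(p[1]);
        count += __builtin_popcountll(p[2]);
        count += __builtin_popcountll(p[3]);
    }
    return count;
}
```

Both variants compute the same sum, so any timing difference between them
comes from code generation rather than from the amount of work done.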
Disassembly is at
<a href="https://gist.github.com/anonymous/47496363b7a4f15ffd57038492afb3e3">https://gist.github.com/anonymous/47496363b7a4f15ffd57038492afb3e3</a> -- based on
my (nonexpert) analysis, it seems plausible that the etiology from SO applies
here.</pre>
</div>
</p>
</body>
</html>