<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><span class="vcard"><a class="email" href="mailto:atdt@google.com" title="Ori Livneh <atdt@google.com>"> <span class="fn">Ori Livneh</span></a>
</span> changed
<a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - Clang is not aware of a false dependency of LZCNT, TZCNT, POPCNT on destination register on some Intel CPUs"
href="https://bugs.llvm.org/show_bug.cgi?id=33869">bug 33869</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>FIXED
</td>
<td>---
</td>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>RESOLVED
</td>
<td>REOPENED
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - Clang is not aware of a false dependency of LZCNT, TZCNT, POPCNT on destination register on some Intel CPUs"
href="https://bugs.llvm.org/show_bug.cgi?id=33869#c24">Comment # 24</a>
on <a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - Clang is not aware of a false dependency of LZCNT, TZCNT, POPCNT on destination register on some Intel CPUs"
href="https://bugs.llvm.org/show_bug.cgi?id=33869">bug 33869</a>
from <span class="vcard"><a class="email" href="mailto:atdt@google.com" title="Ori Livneh <atdt@google.com>"> <span class="fn">Ori Livneh</span></a>
</span></b>
<pre>As far as I can tell, clang still does not break the dependency in the
reproduction case I attached in <a href="show_bug.cgi?id=33869#c14">comment 14</a>. Minimally:

    #include <cstdint>
    #include <x86intrin.h>

    __attribute__((noinline))
    int msb(uint64_t n) {
      return 63 ^ __builtin_clzll(n);
    }

clang version 7.0.0 (trunk 327823), -O2 -march=haswell:

    lzcnt   rax, rdi
    xor     eax, 63
    ret

g++ 8.0.1 20180319, -O2 -march=haswell:

    xor     eax, eax
    lzcnt   rax, rdi
    xor     eax, 63
    ret

<a href="https://godbolt.org/g/JC57Ri">https://godbolt.org/g/JC57Ri</a>

The failure to break the dependency chain causes a measurable degradation in
performance when the function is called in a loop. I tested on one Haswell
machine and one Broadwell machine.
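
For anyone who wants to try this locally, here is a rough sketch of the kind of
loop I mean. It is not the exact harness from the attachment: ns_per_call is a
hypothetical helper name, and the use of std::chrono::steady_clock and a running
sum as a sink are my assumptions for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Same function as in the reproduction. noinline keeps the bit scan in
// its own call so the compiler cannot fold it into the loop.
__attribute__((noinline))
int msb(uint64_t n) {
  return 63 ^ __builtin_clzll(n);
}

// Hypothetical helper: calls msb() on the inputs 1..iters and returns the
// average wall-clock time per call in nanoseconds. The inputs do not depend
// on earlier results, so any serialization between iterations would have to
// come from the output register. The sum of the results is stored in *sink
// so the calls cannot be optimized away.
double ns_per_call(uint64_t iters, uint64_t* sink) {
  assert(iters > 0);
  uint64_t sum = 0;
  auto start = std::chrono::steady_clock::now();
  for (uint64_t i = 1; i <= iters; ++i) {
    sum += static_cast<uint64_t>(msb(i));
  }
  auto stop = std::chrono::steady_clock::now();
  *sink = sum;
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iters);
}
```

Building this with -O2 -march=haswell under each compiler and comparing the
value returned by ns_per_call for a large iteration count is one way to check
whether the missing dependency-breaking xor matters on a given machine.
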
Worse, clang is shooting itself in the foot. If you compile the same code but
target an older microarchitecture without lzcnt (-march=core-i7, for example),
clang emits a bsr instruction instead, which doesn't appear to suffer from this
false dependency issue.</pre>
</div>
</p>
</body>
</html>