<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Missed Optimization: CLZ Loop -> CLZ instruction"
href="https://bugs.llvm.org/show_bug.cgi?id=48404">48404</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed Optimization: CLZ Loop -> CLZ instruction
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>haneef503@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Both of these count leading zeros (the latter uses a compiler builtin),
but only the latter compiles to the platform's native CLZ instruction:
C(++):

unsigned clz_a (unsigned x) {
    unsigned w = sizeof (x) * 8;
    while (x) {
        w--;
        x >>= 1;
    }
    return w;
}

unsigned clz_b (unsigned x) {
    return __builtin_clzll (x);
}

x86-64 (clang -O3 -march=skylake) Assembly:

clz_a(unsigned int):                    # @clz_a(unsigned int)
        mov     eax, 32
        test    edi, edi
        je      .LBB0_2
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        dec     eax
        shr     edi
        jne     .LBB0_1
.LBB0_2:
        ret
clz_b(unsigned int):                    # @clz_b(unsigned int)
        mov     eax, edi
        lzcnt   rax, rax
        ret
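
For reference, a minimal sketch of how the two forms relate, assuming a target
where unsigned is 32 bits and unsigned long long is 64 bits. The names clz_loop,
clz_narrow and clz_wide are hypothetical. Because clz_b calls the 64-bit
__builtin_clzll, its result for nonzero x is 32 larger than that of clz_a; the
32-bit __builtin_clz matches the loop directly:

```c
/* Loop form from the report: shift x right until it is zero,
   counting down from the bit width of the type. */
unsigned clz_loop(unsigned x) {
    unsigned w = sizeof(x) * 8;   /* 32 under the stated assumption */
    while (x) {
        w--;
        x >>= 1;
    }
    return w;
}

/* 32-bit builtin: agrees with clz_loop for every nonzero x. */
unsigned clz_narrow(unsigned x) {
    return (unsigned)__builtin_clz(x);
}

/* 64-bit builtin applied to a zero-extended 32-bit value:
   equals clz_loop(x) + 32 for every nonzero x. */
unsigned clz_wide(unsigned x) {
    return (unsigned)__builtin_clzll(x);
}
```

Callers must pass nonzero x to the two builtin wrappers, since the builtins are
undefined for x == 0; the loop returns the bit width (32 here) for that input.
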
A potential issue is that the intrinsic __builtin_clz(x) comes from GCC, where
they've stated that the behavior of the intrinsic is UB iff (x == 0), likely
because the old x86 BSR instruction yields an undefined result iff (x == 0).
However, it should be straightforward for LLVM to make __builtin_clz(x) well
defined even when (x == 0) by simply emitting a CMOV of the appropriate width
in conjunction with BSR when compiling for x86 uarchs too old to have LZCNT
support. AFAICT all other platforms have well defined results for all inputs
to their CLZ instructions.</pre>
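<pre>The CMOV idea amounts to a select on (x == 0). A minimal source-level sketch,
assuming unsigned is 32 bits wide; clz_defined is a hypothetical name, and how
the compiler lowers the conditional (CMOV, branch, or a single LZCNT) is up to
the backend:

```c
/* Hypothetical wrapper: a total function that returns the bit width
   of the type when x == 0 instead of an undefined result. */
unsigned clz_defined(unsigned x) {
    unsigned width = sizeof(x) * 8;   /* 32 under the stated assumption */
    /* __builtin_clz(0) is undefined, so it is evaluated only for nonzero x. */
    return x ? (unsigned)__builtin_clz(x) : width;
}
```

With this definition, clz_defined(0) is 32 and clz_defined(1) is 31, matching
what the clz_a loop returns for the same inputs.</pre>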
</div>
</p>
</body>
</html>