[llvm-bugs] [Bug 48404] New: Missed Optimization: CLZ Loop -> CLZ instruction
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Dec 5 16:18:17 PST 2020
https://bugs.llvm.org/show_bug.cgi?id=48404
Bug ID: 48404
Summary: Missed Optimization: CLZ Loop -> CLZ instruction
Product: libraries
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: unassignedbugs at nondot.org
Reporter: haneef503 at gmail.com
CC: llvm-bugs at lists.llvm.org
Both of these count leading zeros (the latter uses a compiler builtin), but
only the latter is compiled to the platform-native CLZ instruction:
C(++):
unsigned clz_a (unsigned x) {
    unsigned w = sizeof (x) * 8;
    while (x) {
        w--;
        x >>= 1;
    }
    return w;
}

unsigned clz_b (unsigned x) {
    return __builtin_clzll (x);
}
x86-64 (clang -O3 -march=skylake) Assembly:
clz_a(unsigned int):                    # @clz_a(unsigned int)
        mov     eax, 32
        test    edi, edi
        je      .LBB0_2
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        dec     eax
        shr     edi
        jne     .LBB0_1
.LBB0_2:
        ret
clz_b(unsigned int):                    # @clz_b(unsigned int)
        mov     eax, edi
        lzcnt   rax, rax
        ret
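
One way to sanity-check the claimed equivalence is to compare the loop against
the builtin over a spread of nonzero inputs. The following is a minimal sketch,
not part of the original report: clz_loop and count_mismatches are illustrative
names, and it calls __builtin_clz rather than __builtin_clzll, because the
latter takes an unsigned long long and so counts the zero bits of the widened
value. Zero is skipped because the builtin's result is undefined there.

```c
#include <limits.h>

/* Loop version from the report, with CHAR_BIT in place of the literal 8. */
static unsigned clz_loop(unsigned x) {
    unsigned w = sizeof(x) * CHAR_BIT;
    while (x) {
        w--;
        x >>= 1;
    }
    return w;
}

/* Illustrative helper: counts inputs where loop and builtin disagree.
 * Only nonzero inputs are tested, since __builtin_clz(0) is undefined. */
static unsigned count_mismatches(void) {
    unsigned mismatches = 0;
    for (unsigned shift = 0; shift < sizeof(unsigned) * CHAR_BIT; shift++) {
        unsigned single_bit = 1u << shift;              /* only the top bit set */
        unsigned all_below = single_bit | (single_bit - 1u); /* top bit and all lower bits */
        if (clz_loop(single_bit) != (unsigned)__builtin_clz(single_bit))
            mismatches++;
        if (clz_loop(all_below) != (unsigned)__builtin_clz(all_below))
            mismatches++;
    }
    return mismatches;
}
```

If the two forms agree, count_mismatches() returns 0 when built with GCC or
Clang.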
A potential issue is that the intrinsic __builtin_clz(x) comes from GCC, which
documents its result as undefined when x == 0, likely because the old x86 BSR
instruction leaves its destination undefined when x == 0. However, it should
be straightforward for LLVM to make __builtin_clz(x) well defined even when
x == 0 by simply emitting a CMOV with the appropriate value in conjunction
with BSR when compiling for x86 uarchs too old to have LZCNT support.
AFAICT all other platforms have well-defined results for all inputs to their
CLZ instructions.
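
At the source level, the zero-safe behavior described above can be sketched as
a guarded call to the builtin. This is only an illustration: clz_defined is a
hypothetical name, not an LLVM or GCC API, and the code a compiler actually
emits for it on pre-LZCNT x86 has not been verified here.

```c
#include <limits.h>

/* Hypothetical helper (name is illustrative): leading-zero count that is
 * defined for every input, returning the full bit width when x == 0. */
static unsigned clz_defined(unsigned x) {
    /* The guard keeps the builtin from ever seeing 0, where it is undefined. */
    return x ? (unsigned)__builtin_clz(x)
             : (unsigned)(sizeof(x) * CHAR_BIT);
}
```

With this guard the result for x == 0 matches what the clz_a loop returns,
namely the bit width of unsigned.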