<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - unnecessary bit-and in pshufb vector ctlz"
href="https://bugs.llvm.org/show_bug.cgi?id=39703">39703</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>unnecessary bit-and in pshufb vector ctlz
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>danielwatson311@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>For SSSE3+ targets, LLVM lowers vector ctlz to a generic algorithm that uses
pshufb as a lookup table to count the leading zeros of each nibble.
pand instructions mask each byte down to its high or low nibble before the lookup.
For the low nibble, however, this mask is unnecessary, because the algorithm later
performs something like `nibble_lzs = if high_nibble != 0 then high_lz else
high_lz + low_lz`. The value of `low_lz` is only used when the high nibble is
zero, in which case the unmasked byte already equals its low nibble. For any
other byte, pshufb reads only the low four bits and bit 7 of the index, so the
lookup still yields an in-range value that is then discarded. The bit-and is
therefore redundant.
https://godbolt.org/z/4lkksq
For v16i8:
pand xmm3, xmm2 # lo_nib & 0x0f, unnecessary
pshufb xmm4, xmm3 # lo_lz
psrlw xmm0, 4
pand xmm0, xmm2 # hi_nib
pxor xmm2, xmm2 # zero
pcmpeqb xmm2, xmm0 # hi_nib == 0
pand xmm2, xmm4 # if hi_nib != 0, set lo_lz = 0
pshufb xmm1, xmm0 # hi_lz
paddb xmm1, xmm2 # hi_lz + lo_lz</pre>
</div>
</p>
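<div>
<pre>The argument above can be checked exhaustively with a scalar model of one byte
lane. This is only a sketch: the names `NIBBLE_LZ`, `pshufb_byte`, `ctlz8`,
`reference_ctlz8` and `variants_agree` are hypothetical and are not taken from
LLVM's sources. It assumes the usual 16-entry nibble table and models pshufb as
"bit 7 of the index zeroes the lane, otherwise the low four bits select an entry".

```python
# Leading-zero count of each 4-bit value (assumed lookup table contents).
NIBBLE_LZ = [4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]


def pshufb_byte(table, index):
    """Model one byte lane of pshufb with a 16-entry table."""
    if index & 0x80:
        return 0  # bit 7 set: the lane is zeroed
    return table[index & 0x0F]  # otherwise only the low four bits are used


def ctlz8(x, mask_low_nibble):
    """Count leading zeros of the byte x, with or without the first pand."""
    lo_index = (x & 0x0F) if mask_low_nibble else x
    lo_lz = pshufb_byte(NIBBLE_LZ, lo_index)   # pshufb xmm4, xmm3
    hi_nib = (x >> 4) & 0x0F                   # psrlw + pand
    hi_is_zero = 0xFF if hi_nib == 0 else 0x00  # pcmpeqb against zero
    lo_lz = lo_lz & hi_is_zero                 # drop lo_lz when hi_nib != 0
    hi_lz = pshufb_byte(NIBBLE_LZ, hi_nib)     # pshufb xmm1, xmm0
    return hi_lz + lo_lz                       # paddb


def reference_ctlz8(x):
    """Straightforward 8-bit ctlz used as the expected answer."""
    return 8 - x.bit_length()


def variants_agree():
    """True if masked and unmasked lookups match the reference for all bytes."""
    return all(
        ctlz8(x, True) == ctlz8(x, False) == reference_ctlz8(x)
        for x in range(256)
    )
```

Calling `variants_agree()` compares both variants against the reference for every
byte value, which covers each lane of a v16i8 vector independently.</pre>
</div>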
</body>
</html>