[llvm-bugs] [Bug 39703] New: unnecessary bit-and in pshufb vector ctlz
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Nov 18 20:53:11 PST 2018
https://bugs.llvm.org/show_bug.cgi?id=39703
Bug ID: 39703
Summary: unnecessary bit-and in pshufb vector ctlz
Product: new-bugs
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: danielwatson311 at gmail.com
CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org
For SSSE3 and later, LLVM lowers vector ctlz to a generic algorithm that uses
pshufb to look up the leading-zero count of each nibble of the vector.
pand instructions are used to select the high and low nibbles.
For the low nibble, however, the pand is unnecessary, because the algorithm
later computes something like `nibble_lzs = if high_nibble != 0 then high_lz
else high_lz + low_lz`. The value of `low_lz` is used only when the high nibble
is zero, in which case the unmasked byte already equals its low nibble.
https://godbolt.org/z/4lkksq
For v16i8:
    pand    xmm3, xmm2    # lo_nib & 0x0f, unnecessary
    pshufb  xmm4, xmm3    # lo_lz
    psrlw   xmm0, 4
    pand    xmm0, xmm2    # hi_nib
    pxor    xmm2, xmm2    # zero
    pcmpeqb xmm2, xmm0    # hi_nib == 0
    pand    xmm2, xmm4    # if hi_nib != 0, set lo_lz = 0
    pshufb  xmm1, xmm0    # hi_lz
    paddb   xmm1, xmm2    # hi_lz + lo_lz
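
The claim can be checked exhaustively with a scalar, per-byte model of the
sequence above. This is only a sketch, not LLVM's lowering code: the names
NIBBLE_LZ, pshufb_byte, ctlz8_model, ctlz8_reference and count_mismatches are
hypothetical, and the table contents are assumed from the usual nibble
leading-zero lookup. The model relies on the documented pshufb behavior that
each index byte selects a table entry by its low 4 bits and yields 0 when its
bit 7 is set.

```c
#include <stdint.h>

/* Leading-zero count of each 4-bit value; a nibble of 0 counts as 4.
   Assumed table contents, for illustration. */
static const uint8_t NIBBLE_LZ[16] = {4, 3, 2, 2, 1, 1, 1, 1,
                                      0, 0, 0, 0, 0, 0, 0, 0};

/* Per-byte model of pshufb: an index with bit 7 set produces 0,
   otherwise only the low 4 bits of the index select the table entry. */
static uint8_t pshufb_byte(const uint8_t table[16], uint8_t index) {
    return (index & 0x80) ? 0 : table[index & 0x0f];
}

/* Per-byte model of the v16i8 sequence. mask_low != 0 keeps the first pand
   (lo_nib & 0x0f); mask_low == 0 feeds the raw byte to pshufb instead. */
static uint8_t ctlz8_model(uint8_t x, int mask_low) {
    uint8_t lo_index = mask_low ? (uint8_t)(x & 0x0f) : x;
    uint8_t lo_lz = pshufb_byte(NIBBLE_LZ, lo_index);       /* pshufb: lo_lz */

    /* psrlw shifts whole 16-bit words, so in the real code bits of the
       neighbouring byte leak in and this pand is still required. */
    uint8_t hi_nib = (uint8_t)((x >> 4) & 0x0f);             /* psrlw + pand */
    uint8_t hi_is_zero = (hi_nib == 0) ? 0xff : 0x00;        /* pcmpeqb */

    lo_lz &= hi_is_zero;                    /* pand: drop lo_lz if hi_nib != 0 */
    uint8_t hi_lz = pshufb_byte(NIBBLE_LZ, hi_nib);          /* pshufb: hi_lz */
    return (uint8_t)(hi_lz + lo_lz);                         /* paddb */
}

/* Straightforward bit-by-bit reference: ctlz of an 8-bit value, 8 for zero. */
static uint8_t ctlz8_reference(uint8_t x) {
    uint8_t n = 0;
    for (int bit = 7; bit >= 0 && !((x >> bit) & 1); --bit) {
        n++;
    }
    return n;
}

/* Counts byte values for which either variant disagrees with the reference. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int v = 0; v < 256; ++v) {
        uint8_t x = (uint8_t)v;
        uint8_t expected = ctlz8_reference(x);
        if (ctlz8_model(x, 1) != expected) mismatches++;
        if (ctlz8_model(x, 0) != expected) mismatches++;
    }
    return mismatches;
}
```

In this model the unmasked variant can differ from the masked one only in the
intermediate lo_lz value, and only for bytes whose high nibble is nonzero,
which is exactly where the later pand discards lo_lz.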