[llvm-bugs] [Bug 39703] New: unnecessary bit-and in pshufb vector ctlz

via llvm-bugs llvm-bugs at lists.llvm.org
Sun Nov 18 20:53:11 PST 2018


https://bugs.llvm.org/show_bug.cgi?id=39703

            Bug ID: 39703
           Summary: unnecessary bit-and in pshufb vector ctlz
           Product: new-bugs
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: danielwatson311 at gmail.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

On targets with SSSE3 or later, LLVM lowers vector ctlz to a generic algorithm
that uses pshufb as a lookup table to compute the leading-zero count of each
nibble of the vector.

pand instructions are used to isolate the high and low nibbles of each byte.
For the low nibble, however, the mask is unnecessary, because the algorithm
later performs something like `nibble_lzs = if high_nibble != 0, then high_lz,
else high_lz + low_lz`. The value of `low_lz` is only used when the high nibble
is zero, in which case the byte already equals its low nibble and the bit-and
changes nothing.

https://godbolt.org/z/4lkksq

For v16i8:

    pand    xmm3, xmm2 # lo_nib & 0x0f, unnecessary
    pshufb  xmm4, xmm3 # lo_lz
    psrlw   xmm0, 4
    pand    xmm0, xmm2 # hi_nib
    pxor    xmm2, xmm2 # zero
    pcmpeqb xmm2, xmm0 # hi_nib == 0
    pand    xmm2, xmm4 # if hi_nib != 0, set lo_lz = 0
    pshufb  xmm1, xmm0 # hi_lz
    paddb   xmm1, xmm2 # hi_lz + lo_lz
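
The claim can be checked with a small lane-wise model of the sequence above.
This is only a sketch, not LLVM's lowering code: the function names
(`pshufb`, `ctlz_v16i8`, `reference_ctlz8`) are hypothetical, and the lookup
table values are assumed from the definition of ctlz on a 4-bit value, since
the report does not show the constants held in xmm1 and xmm4. The model
follows the documented pshufb behaviour: an index byte with bit 7 set yields
zero, otherwise the low 4 bits select a table byte.

```python
# Assumed lookup table: leading zeros of a 4-bit value, with ctlz(0) = 4.
NIBBLE_LZ = [4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]


def pshufb(table, indices):
    """Lane-wise model of pshufb: bit 7 set gives 0, else low 4 bits index the table."""
    return [0 if i & 0x80 else table[i & 0x0F] for i in indices]


def ctlz_v16i8(src, mask_low_nibble):
    """Model of the v16i8 sequence; mask_low_nibble=False drops the first pand."""
    # pand xmm3, xmm2 (the instruction the report calls unnecessary)
    lo_idx = [b & 0x0F for b in src] if mask_low_nibble else list(src)
    lo_lz = pshufb(NIBBLE_LZ, lo_idx)                      # pshufb xmm4, xmm3
    # psrlw shifts 16-bit words, so the following pand is still required
    # to clear bits that leak in from the neighbouring byte.
    hi_nib = [(b >> 4) & 0x0F for b in src]                # psrlw + pand
    hi_zero = [0xFF if h == 0 else 0x00 for h in hi_nib]   # pxor + pcmpeqb
    lo_lz = [m & l for m, l in zip(hi_zero, lo_lz)]        # pand xmm2, xmm4
    hi_lz = pshufb(NIBBLE_LZ, hi_nib)                      # pshufb xmm1, xmm0
    return [(h + l) & 0xFF for h, l in zip(hi_lz, lo_lz)]  # paddb xmm1, xmm2


def reference_ctlz8(b):
    """Leading zeros of an 8-bit value, with ctlz(0) = 8."""
    return 8 - b.bit_length()


if __name__ == "__main__":
    # Exhaustively compare both variants against the reference, 16 lanes at a time.
    for start in range(0, 256, 16):
        lanes = list(range(start, start + 16))
        expected = [reference_ctlz8(b) for b in lanes]
        assert ctlz_v16i8(lanes, mask_low_nibble=True) == expected
        assert ctlz_v16i8(lanes, mask_low_nibble=False) == expected
    print("both variants match the reference for all 256 byte values")
```

Whenever the high nibble is non-zero, the unmasked lookup may return a
different value (or zero, if bit 7 is set), but the later pand with the
`hi_nib == 0` mask discards it, so both variants agree in this model.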
