<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/140729>140729</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Use GFNI for LZCNT vXi8 ops
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            good first issue,
            backend:X86
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          RKSimon
      </td>
    </tr>
</table>

<pre>
We can perform LZCNT for v16i8/v32i8/v64i8 using GFNI instructions, as detailed here: https://gist.github.com/animetosho/6cb732ccb5ecd86675ca0a442b3c0622

Even though it's based on the TZCNT implementation, it's easier to start with LZCNT, as x86 vector TZCNT instructions all currently expand to CTPOP patterns with TargetLowering::expandCTTZ, and that might need some messy legality cleanups.

e.g.
```c
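#include <immintrin.h>

__m128i _mm_lzcnt_epi8(__m128i x) {
        // just reverse bits and TZCNT
        // step 1: reverse the bit order within each byte
        __m128i a = _mm_gf2p8affine_epi64_epi8(x, _mm_set_epi32(0x80402010, 0x08040201, 0x80402010, 0x08040201), 0);
        // step 2: isolate the lowest set bit, a & ~(a - 1)
        a = _mm_andnot_si128(_mm_add_epi8(a, _mm_set1_epi8(0xff)), a);
        // step 3: encode the index of that bit; the constant 8 makes a zero byte return 8
        return _mm_gf2p8affine_epi64_epi8(a, _mm_set_epi32(0xaaccf0ff, 0, 0xaaccf0ff, 0), 8);
}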
```

```asm
.LCPI0_2:
  .byte 1 # 0x1
  .byte 2 # 0x2
  .byte 4 # 0x4
  .byte 8 # 0x8
  .byte 16 # 0x10
  .byte 32 # 0x20
  .byte 64 # 0x40
  .byte 128 # 0x80
.LCPI0_3:
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 255 # 0xff
  .byte 240 # 0xf0
  .byte 204 # 0xcc
  .byte 170 # 0xaa
_mm_lzcnt_epi8(long long vector[2]): # @_mm_lzcnt_epi8(long long vector[2])
# %bb.0: # %entry
  vgf2p8affineqb $0, .LCPI0_2(%rip){1to2}, %xmm0, %xmm0
  vpxor %xmm1, %xmm1, %xmm1
  vpsubb %xmm0, %xmm1, %xmm1
  vpand %xmm1, %xmm0, %xmm0
  vgf2p8affineqb $8, .LCPI0_3(%rip){1to2}, %xmm0, %xmm0
  retq
```
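
To sanity-check the two matrix constants without GFNI hardware, the per-byte behaviour can be modelled in scalar C. This is only a sketch: the helper names (`parity8`, `gf2p8affine_byte`, `lzcnt8_gfni_model`, `lzcnt8_ref`, `check_all_bytes`) are made up for illustration and are not part of LLVM or of the intrinsics API. It assumes the documented GF2P8AFFINEQB semantics, where result bit i is the parity of (matrix byte 7-i AND the source byte), XORed with bit i of the immediate.

```c
#include <assert.h>
#include <stdint.h>

/* XOR of all 8 bits of v. */
static uint8_t parity8(uint8_t v) {
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1;
}

/* Scalar model of one byte lane of GF2P8AFFINEQB (assumed semantics):
   out.bit[i] = parity(matrix.byte[7 - i] & x) ^ imm.bit[i]. */
static uint8_t gf2p8affine_byte(uint8_t x, uint64_t matrix, uint8_t imm) {
    uint8_t out = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t row = (uint8_t)(matrix >> (8 * (7 - i)));
        uint8_t bit = (uint8_t)(parity8(row & x) ^ ((imm >> i) & 1));
        out |= (uint8_t)(bit << i);
    }
    return out;
}

/* Same three steps as the intrinsic version, applied to a single byte. */
static uint8_t lzcnt8_gfni_model(uint8_t x) {
    /* reverse the bits */
    uint8_t a = gf2p8affine_byte(x, 0x8040201008040201ULL, 0);
    /* isolate the lowest set bit: a & ~(a - 1) */
    a = (uint8_t)(a & ~(uint8_t)(a + 0xff));
    /* turn the one-hot value into its bit index; zero maps to 8 */
    return gf2p8affine_byte(a, 0xaaccf0ff00000000ULL, 8);
}

/* Straightforward reference: count zero bits from the top. */
static uint8_t lzcnt8_ref(uint8_t x) {
    uint8_t n = 0;
    for (uint8_t m = 0x80; m != 0 && !(x & m); m >>= 1)
        n++;
    return n;
}

/* Compare the model against the reference for every possible byte. */
static void check_all_bytes(void) {
    for (int x = 0; x < 256; x++)
        assert(lzcnt8_gfni_model((uint8_t)x) == lzcnt8_ref((uint8_t)x));
}
```

Calling `check_all_bytes()` from a test harness compares all 256 inputs. A similar exhaustive loop might be a reasonable starting point when validating a lowering against the generic expansion.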
</pre>