<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/55351>55351</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [LV?][x86] Not perfect vectorization for bit sequence unpacking
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:X86,
            loopoptim
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          LebedevRI
      </td>
    </tr>
</table>

<pre>
    Suppose we have a packed vector of, say, `i14` elements, and we want to zero-extend them to `i16` elements.

At least on little-endian targets, loading an i64 (i.e. 8 bytes)
and then extracting its highest byte (byte 7, counting from 0) is effectively the same as
offsetting the byte pointer by 7 and loading a single byte:
https://godbolt.org/z/zqb5Wjf5E
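
For readers without the link at hand, the equivalence can be sketched in a few lines of Python. This is only an illustration of the claim, not the code behind the godbolt link; the function names and the test buffer are made up for this sketch.

```python
# Hypothetical helpers: two ways to read the same byte from a buffer.

def top_byte_via_wide_load(data, offset):
    # Load 8 bytes as a little-endian 64-bit integer,
    # then keep only its most significant byte (bits 56..63).
    word = int.from_bytes(data[offset:offset + 8], "little")
    return word >> 56

def top_byte_via_byte_load(data, offset):
    # Offset the byte pointer by 7 and load a single byte.
    return data[offset + 7]

buf = bytes(range(0x10, 0x20))  # 16 arbitrary bytes
for off in range(9):
    assert top_byte_via_wide_load(buf, off) == top_byte_via_byte_load(buf, off)
```

With a big-endian interpretation the most significant byte would instead be the one at offset 0, which is why the statement is restricted to little-endian.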

So here's a naive scalar way to extract 4 consecutive `i14`'s: https://godbolt.org/z/xcjjv9bhG
Or, with an actual outer loop: https://godbolt.org/z/3Tnb7nW1x
LV did its thing; the result is not terrible.
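
As a rough reference model of the scalar extraction discussed above (not the code behind the links), here is a Python sketch. It assumes the `i14` elements are packed LSB-first into a little-endian byte stream, so 4 elements occupy exactly 7 bytes; the linked code may use a different layout, and the function names are made up for this sketch.

```python
I14_MASK = 0x3FFF  # low 14 bits

def unpack4_i14(data, offset=0):
    # 4 * 14 = 56 bits = 7 bytes. Read them as one little-endian integer,
    # then shift and mask out each 14-bit field (zero-extension to i16 is implicit).
    word = int.from_bytes(data[offset:offset + 7], "little")
    return [(word >> (14 * i)) & I14_MASK for i in range(4)]

def unpack_i14(data):
    # Outer loop: one group of 4 elements per 7 input bytes.
    out = []
    for offset in range(0, len(data) - 6, 7):
        out.extend(unpack4_i14(data, offset))
    return out

print(unpack4_i14(bytes([0xFF] * 7)))  # [16383, 16383, 16383, 16383]
```

Each output element depends on at most 3 adjacent input bytes, which is the property a vectorized loop body would have to exploit with byte shuffles, shifts and masks.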

But we could greatly improve the loop body: https://godbolt.org/z/PMr3ax969
Here's the generator for the improved codegen: https://godbolt.org/z/8Ps98Msh4
(I initially looked at this as plain `*ext` isel legalization, but that's impossible.)

Maybe after VPlan and outer loop vectorization, LV could improve.
</pre>