<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/64048">64048</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            aarch64 splits one NEON shift into two (missed optimization)
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          neldredge
      </td>
    </tr>
</table>

<pre>
    Given the following aarch64 NEON intrinsic code:
``` C++
#include <arm_neon.h>
void vectorized(const uint8_t* pCSI2, uint8_t* pBE, const uint8_t* pCSI2LineEnd)
{
    while (pCSI2 < pCSI2LineEnd) {
        uint8x16x3_t in = vld3q_u8(pCSI2);
        uint8x16x3_t out;
        out.val[0] = in.val[0];
        out.val[1] = vorrq_u8(vshlq_n_u8(in.val[2], 4), vshrq_n_u8(in.val[1], 4));
        out.val[2] = vorrq_u8(vshlq_n_u8(in.val[1], 4), vshrq_n_u8(in.val[2], 4));
        vst3q_u8(pBE, out);
        pCSI2 += 48;
        pBE += 48;
    }
}
```
For `vshrq_n_u8`, instead of the obvious single instruction `ushr v4.16b, v1.16b, #4`, clang emits two shifts:
```
ushr    v4.16b, v1.16b, #1
ushr    v4.16b, v4.16b, #3
```
This does not happen in a function consisting only of `vshrq_n_u8(x, 4)`, so the split appears to depend on the surrounding code.
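
To make the "missed optimization" framing concrete, here is a minimal scalar sketch of what a single 8-bit lane sees under each instruction sequence. It is an illustration only, not clang's lowering: the names `shr_once`, `shr_split`, and `shifts_agree` are hypothetical, and it assumes C++14 or later for the `constexpr` loop. It checks that shifting right by 1 and then by 3 gives the same byte as shifting right by 4, so the emitted pair is correct but costs one extra instruction.

```cpp
#include <cstdint>

// Scalar model of one lane of `ushr vD.16b, vN.16b, #4` (hypothetical helper).
constexpr std::uint8_t shr_once(std::uint8_t x) {
    return static_cast<std::uint8_t>(x >> 4);
}

// Scalar model of the emitted pair: `ushr #1` followed by `ushr #3`.
constexpr std::uint8_t shr_split(std::uint8_t x) {
    const std::uint8_t t = static_cast<std::uint8_t>(x >> 1);
    return static_cast<std::uint8_t>(t >> 3);
}

// Compare both forms over every possible 8-bit lane value.
constexpr bool shifts_agree() {
    for (unsigned v = 0; v < 256; ++v) {
        const std::uint8_t b = static_cast<std::uint8_t>(v);
        if (shr_once(b) != shr_split(b)) {
            return false;
        }
    }
    return true;
}

// Fails to compile if the two sequences ever disagree.
static_assert(shifts_agree(), "split shift must match the single shift");
```

Because the check runs at compile time, simply compiling the snippet verifies the equivalence for all 256 lane values.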

[Try on godbolt](https://godbolt.org/z/z1b7z67zr)
</pre>