<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/54737>54737</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [AArch64] Use STR instruction when storing lane/byte 0 of vector
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            good first issue,
            backend:AArch64
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          ilinpv
      </td>
    </tr>
</table>

<pre>
```
typedef char v8qi __attribute__ ((vector_size (8)));
typedef char v16qi __attribute__ ((vector_size (16)));

void store_lane_v8qi (v8qi x, char *y) { y[0] = x[8 - 1 - 0]; y[3] = x[0]; }
void store_lane_v16qi (v16qi x, char *y) { y[0] = x[16 - 1 - 0]; y[3] = x[0]; }
```
Currently, at -O2, LLVM generates the following for AArch64:
```
store_lane_v8qi:                        // @store_lane_v8qi
        add     x8, x0, #3
        st1     { v0.b }[7], [x0]
        st1     { v0.b }[0], [x8]
        ret
store_lane_v16qi:                       // @store_lane_v16qi
        add     x8, x0, #3
        st1     { v0.b }[15], [x0]
        st1     { v0.b }[0], [x8]
        ret
```
When storing lane 0, we can use the STR instruction, which has more flexible addressing modes than ST1 (for example, base register plus immediate offset, which removes the need for the separate ADD). LLVM already does this for wider element types, but not for vectors of bytes. GCC already produces the better code:
```
store_lane_v8qi:
        st1     {v0.b}[7], [x0]
        str     b0, [x0, 3]
        ret
store_lane_v16qi:
        st1     {v0.b}[15], [x0]
        str     b0, [x0, 3]
        ret
```
https://godbolt.org/z/oz3qxfeTP
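
For comparison, here is a sketch of the wider-element analogues of the functions above, which the report says LLVM already handles by using STR for lane 0. The type names v4hi and v4si and both function names are assumptions introduced here for illustration, not taken from the issue, and the generated assembly is not reproduced because it was not checked for this note.
```c
/* Hypothetical wider-lane variants of the byte-vector test case.
   v4hi, v4si and the function names are illustrative assumptions. */
typedef short v4hi __attribute__ ((vector_size (8)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Store the last lane to y[0] and lane 0 to y[3], as in the byte case. */
void store_lane_v4hi (v4hi x, short *y) { y[0] = x[4 - 1 - 0]; y[3] = x[0]; }
void store_lane_v4si (v4si x, int *y) { y[0] = x[4 - 1 - 0]; y[3] = x[0]; }
```
Compiling these next to the byte-vector functions with the same flags should make it easy to see whether the lane 0 store at y[3] is folded into a single STR with an immediate offset or still needs an ADD followed by ST1.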

</pre>