<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/154324">154324</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Vector constants in NEON intrinsics are needlessly reloaded
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          rdoeffinger
      </td>
    </tr>
</table>

<pre>
    When a vector constant used by a NEON intrinsic is loaded inside a conditional branch, it gets reloaded in every branch, even when the load is unavoidable and there are plenty of free registers.
An (almost) minimal reproducer that can be checked with e.g. godbolt.org:
```
#include <arm_neon.h>

uint8x16_t test(unsigned a, unsigned b, unsigned c, unsigned d)
{
    uint32x4_t fact = vcombine_u32(vcreate_u32(0x123456789012345ull), vcreate_u32(0x67890123456ull));
    uint8x8_t tbl = vcreate_u8(0x0f0b07030f0b0703ull);
    uint8x8_t resa = vdup_n_u8(0);
    if (a) resa = vqtbl1_u8(vreinterpretq_u8_u32(vmulq_u32(vdupq_n_u32(a), fact)), tbl);
    uint8x8_t resb = vdup_n_u8(0);
    if (b) resb = vqtbl1_u8(vreinterpretq_u8_u32(vmulq_u32(vdupq_n_u32(b), fact)), tbl);
    uint8x8_t resc = vdup_n_u8(0);
    if (c) resc = vqtbl1_u8(vreinterpretq_u8_u32(vmulq_u32(vdupq_n_u32(c), fact)), tbl);
    uint8x8_t resd = vqtbl1_u8(vreinterpretq_u8_u32(vmulq_u32(vdupq_n_u32(d), fact)), tbl);
    uint8x8_t resab = vreinterpret_u8_u32(vzip1_u32(vreinterpret_u32_u8(resa), vreinterpret_u32_u8(resb)));
    uint8x8_t rescd = vreinterpret_u8_u32(vzip1_u32(vreinterpret_u32_u8(resc), vreinterpret_u32_u8(resd)));
    return vcombine_u8(resab, rescd);
}
```
fact and tbl get loaded inside each if. MSVC does not have this issue (though gcc does), and scalars also seem to not suffer from this.
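For comparison, a scalar analogue of the reproducer might look like the sketch below. The name test_scalar and the use of XOR to combine the results are illustrative assumptions and not part of the original reproducer; only the constant is borrowed from fact. Compiling it with the same flags should make it possible to check whether the constant is set up once or once per branch.
```
#include <stdint.h>

/* Hypothetical scalar counterpart of test(): same conditional structure,
 * but with a 64-bit scalar constant instead of vector constants. */
uint64_t test_scalar(unsigned a, unsigned b, unsigned c, unsigned d)
{
    const uint64_t fact = 0x123456789012345ull; /* low half of the vector constant */
    uint64_t resa = 0;
    if (a) resa = a * fact;
    uint64_t resb = 0;
    if (b) resb = b * fact;
    uint64_t resc = 0;
    if (c) resc = c * fact;
    uint64_t resd = d * fact; /* unconditional use, so the constant is always needed */
    /* XOR is an arbitrary way to keep all four results live. */
    return resa ^ resb ^ resc ^ resd;
}
```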
Side note, which I will not create a ticket for at the moment: constant propagation seems to be unable to handle a 0 input to vtbl, resulting in code like:
```
movi.2d v1, #0000000000000000
tbl.8b v0, { v1 }, v0
```
</pre>