<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/55793>55793</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Fixed-width splat goes through stack instead of using SVE DUP instruction
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:AArch64,
            llvm:codegen,
            SVE
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          sdesmalen-arm
      </td>
    </tr>
</table>

<pre>
    Compile the following example with `llc -mtriple=aarch64 -mattr=+sve < t.ll`:

```
define <8 x float> @foo(ptr %p) vscale_range(2,2) {
  %v = load <8 x float>, ptr %p
  %splat0 = shufflevector <8 x float> %v, <8 x float> undef, <8 x i32> zeroinitializer
  ret <8 x float> %splat0
}
```

The splat is lowered by storing lane 0 (`s0`) to eight consecutive stack slots and then reloading them as a vector, i.e. a stack spill + reload:

```
foo:                                    // @foo
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
        sub     x9, sp, #48
        and     sp, x9, #0xffffffffffffffe0
        ptrue   p0.s
        ld1w    { z0.s }, p0/z, [x0]
        stp     s0, s0, [sp, #24]
        stp     s0, s0, [sp, #16]
        stp     s0, s0, [sp, #8]
        stp     s0, s0, [sp]
        ld1w    { z0.s }, p0/z, [sp]
        st1w    { z0.s }, p0, [x8]
        mov     sp, x29
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
```

This can be done more efficiently with a single indexed DUP instruction, e.g. `dup z0.s, z0.s[0]`.
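
For illustration, a hand-written sketch of the code one might hope to see (not actual compiler output). It assumes the load, the store through `x8`, and the `ptrue` stay as in the output above, and that the stack frame is no longer needed once the spill is gone:

```
foo:                                    // @foo
        ptrue   p0.s
        ld1w    { z0.s }, p0/z, [x0]
        dup     z0.s, z0.s[0]                   // broadcast lane 0 to all lanes
        st1w    { z0.s }, p0, [x8]
        ret
```

A disassembler may print the `dup` with index 0 under its `mov z0.s, s0` alias.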

This is extracted from issue #55438.
</pre>