<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/62365">62365</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Improvements to buildvector codegen
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
preames
</td>
</tr>
</table>
<pre>
Looking at the examples below, we've got a couple of possibilities for improving generic buildvector codegen. Please take the following as a list of ideas; not all of these may work out. Note that I'm also talking about the generic case with no repeated elements, etc.
For vectors with power-of-two lengths less than or equal to 64 bits, we can do shift/or on the scalar side plus a single scalar-to-vector move. This may require a VTYPE toggle, but that's likely cheaper than a series of inserts.
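As a rough sketch of the shift/or idea at the IR level (the function name is hypothetical, this is not the output of any existing lowering, and it assumes little-endian lane order as on RISC-V), the 2 x i32 case could be packed like this:
```
define <2 x i32> @buildvec_2xi32_packed(i32 %a, i32 %b) {
  ; Widen both scalars to the 64-bit chunk width.
  %a.ext = zext i32 %a to i64
  %b.ext = zext i32 %b to i64
  ; Element 1 lives in the high half, element 0 in the low half.
  %b.hi = shl i64 %b.ext, 32
  %packed = or i64 %b.hi, %a.ext
  ; A single scalar-to-vector move: reinterpret the i64 as two i32 lanes.
  %v = bitcast i64 %packed to <2 x i32>
  ret <2 x i32> %v
}
```
Whether the backend then picks a single scalar-to-vector move for the bitcast is exactly the codegen question raised here.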
For vectors with power-of-two lengths greater than 64 bits, we can group into 64-bit chunks. This reduces the number of vector instructions and integer-to-vector moves, at the cost of extra scalar work.
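A similar sketch for the chunked case, again with a hypothetical function name and assuming little-endian lane order: the 4 x i32 build becomes two 64-bit chunks, so only two inserts touch the vector side.
```
define <4 x i32> @buildvec_4xi32_packed(i32 %a, i32 %b, i32 %c, i32 %d) {
  %a.ext = zext i32 %a to i64
  %b.ext = zext i32 %b to i64
  %c.ext = zext i32 %c to i64
  %d.ext = zext i32 %d to i64
  ; Chunk 0 holds elements 0 and 1; chunk 1 holds elements 2 and 3.
  %b.hi = shl i64 %b.ext, 32
  %lo = or i64 %b.hi, %a.ext
  %d.hi = shl i64 %d.ext, 32
  %hi = or i64 %d.hi, %c.ext
  ; Two i64 inserts instead of four i32 inserts.
  %w1 = insertelement <2 x i64> poison, i64 %lo, i32 0
  %w2 = insertelement <2 x i64> %w1, i64 %hi, i32 1
  %v = bitcast <2 x i64> %w2 to <4 x i32>
  ret <4 x i32> %v
}
```
The extra scalar work is the four zext/shl/or groups; the saving is in the number of scalar-to-vector moves.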
We should be able to use either vslide1up or vslide1down. If we can exploit the undefined tail property, we should be able to do this without individual VL toggles between inserts. Note that this requires undefined tail, *not* simply tail agnostic. Combined with the above, we should have one vsetvli + VLEN/64 inserts.
Note that the case where VLEN=128 is particularly important, as it is the minimum guaranteed by V, and thus what SLP is able to target by default.
```
$ cat buildvector.ll
define <2 x i32> @buildvec_2xi32(i32 %a, i32 %b) {
%v1 = insertelement <2 x i32> poison, i32 %a, i32 0
%v2 = insertelement <2 x i32> %v1, i32 %b, i32 1
ret <2 x i32> %v2
}
define <4 x i32> @buildvec_4xi32(i32 %a, i32 %b, i32 %c, i32 %d) {
%v1 = insertelement <4 x i32> poison, i32 %a, i32 0
%v2 = insertelement <4 x i32> %v1, i32 %b, i32 1
%v3 = insertelement <4 x i32> %v2, i32 %c, i32 2
%v4 = insertelement <4 x i32> %v3, i32 %d, i32 3
ret <4 x i32> %v4
}
```
```
$ ./opt -S buildvector.ll -O3 | ./llc -mtriple=riscv64 -mattr=+v
.text
.attribute 4, 16
.attribute 5, "rv64i2p1_f2p2_d2p2_v1p0_zicsr2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
.file "buildvector.ll"
.globl buildvec_2xi32 # -- Begin function buildvec_2xi32
.p2align 2
.type buildvec_2xi32,@function
.variant_cc buildvec_2xi32
buildvec_2xi32: # @buildvec_2xi32
# %bb.0:
vsetivli zero, 2, e32, mf2, ta, ma
vmv.v.x v8, a1
vsetvli zero, zero, e32, mf2, tu, ma
vmv.s.x v8, a0
ret
.Lfunc_end0:
.size buildvec_2xi32, .Lfunc_end0-buildvec_2xi32
# -- End function
.globl buildvec_4xi32 # -- Begin function buildvec_4xi32
.p2align 2
.type buildvec_4xi32,@function
.variant_cc buildvec_4xi32
buildvec_4xi32: # @buildvec_4xi32
# %bb.0:
addi sp, sp, -16
sw a3, 12(sp)
sw a2, 8(sp)
sw a1, 4(sp)
sw a0, 0(sp)
mv a0, sp
vsetivli zero, 4, e32, m1, ta, ma
vle32.v v8, (a0)
addi sp, sp, 16
ret
.Lfunc_end1:
.size buildvec_4xi32, .Lfunc_end1-buildvec_4xi32
# -- End function
.section ".note.GNU-stack","",@progbits
```
</pre>