<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/107065>107065</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64] A shufflevector that shifts vector elements to the right emits a `tbl` instruction rather than an `ext` instruction
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Validark
</td>
</tr>
</table>
<pre>
This code:
```zig
export fn shiftElementsRight1(d: @Vector(16, u8)) @Vector(16, u8) {
return @shuffle(u8, d, @as(@Vector(16, u8), @splat(0)), [16]i32{ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 });
}
```
It produces this assembly when targeting Apple M3:
```asm
.LCPI0_0:
.byte 255
.byte 0
.byte 1
.byte 2
.byte 3
.byte 4
.byte 5
.byte 6
.byte 7
.byte 8
.byte 9
.byte 10
.byte 11
.byte 12
.byte 13
.byte 14
shiftElementsRight1:
adrp x8, .LCPI0_0
ldr q1, [x8, :lo12:.LCPI0_0]
tbl v0.16b, { v0.16b }, v1.16b
ret
```
LLVM IR emitted by Zig (`zig build-obj ./src/llvm_code.zig -O ReleaseFast -target aarch64-linux -mcpu apple_latest -femit-llvm-ir -fstrip`):
```llvm
; ModuleID = 'BitcodeBuffer'
source_filename = "llvm_code"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-musl"
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable
define dso_local <16 x i8> @shiftElementsRight1(<16 x i8> %0) local_unnamed_addr #0 {
%2 = shufflevector <16 x i8> %0, <16 x i8> <i8 0, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <16 x i32> <i32 16, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14>
ret <16 x i8> %2
}
attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "frame-pointer"="none" "target-cpu"="apple-latest" "target-features"="-a510,-a520,-a65,-a710,-a720,-a76,-a78,-a78c,-addr-lsl-fast,+aes,-aggressive-fma,+alternate-sextload-cvt-f32-pattern,+altnzcv,-alu-lsl-fast,+am,+amvs,+arith-bcc-fusion,+arith-cbz-fusion,-ascend-store-address,-b16b16,-balance-fp-ops,+bf16,-brbe,+bti,-call-saved-x10,-call-saved-x11,-call-saved-x12,-call-saved-x13,-call-saved-x14,-call-saved-x15,-call-saved-x18,-call-saved-x8,-call-saved-x9,+ccdp,+ccidx,+ccpp,-chk,-clrbhb,-cmp-bcc-fusion,+complxnum,+CONTEXTIDREL2,-cortex-r82,-cpa,+crc,+crypto,-cssc,-d128,+disable-latency-sched-heuristic,-disable-ldp,-disable-stp,+dit,+dotprod,+ecv,+el2vmsa,+el3,-enable-select-opt,-ete,-exynos-cheap-as-move,-f32mm,-f64mm,-faminmax,+fgt,-fix-cortex-a53-835769,+flagm,-fmv,-force-32bit-jump-tables,+fp16fml,-fp8,-fp8dot2,-fp8dot4,-fp8fma,+fp-armv8,-fpmr,+fptoint,+fullfp16,+fuse-address,-fuse-addsub-2reg-const1,-fuse-adrp-add,+fuse-aes,+fuse-arith-logic,+fuse-crypto-eor,+fuse-csel,+fuse-literals,-gcs,-harden-sls-blr,-harden-sls-nocomdat,-harden-sls-retbr,-hbc,+hcx,+i8mm,-ite,+jsconv,-ldp-aligned-only,+lor,-ls64,+lse,-lse128,+lse2,-lut,-mec,-mops,+mpam,-mte,+neon,-nmi,-no-bti-at-return-twice,-no-neg-immediates,-no-sve-fp-ld1r,-no-zcz-fp,+nv,-outline-atomics,+pan,+pan-rwv,+pauth,-pauth-lr,+perfmon,-predictable-select-expensive,+predres,-prfm-slc-target,-rand,+ras,-rasv2,+rcpc,-rcpc3,+rcpc-immo,+rdm,-reserve-x1,-reserve-x10,-reserve-x11,-reserve-x12,-reserve-x13,-reserve-x14,-reserve-x15,-reserve-x18,-reserve-x2,-reserve-x20,-reserve-x21,-reserve-x22,-reserve-x23,-reserve-x24,-reserve-x25,-reserve-x26,-reserve-x27,-reserve-x28,-reserve-x3,-reserve-x30,-reserve-x4,-reserve-x5,-reserve-x6,-reserve-x7,-reserve-x9,-rme,+sb,+sel2,+sha2,+sha3,-slow-misaligned-128store,-slow-paired-128,-slow-strqro-store,-sm4,-sme,-sme2,-sme2p1,-sme-f16f16,-sme-f64f64,-sme-f8f16,-sme-f8f32,-sme-fa64,-sme-i16i64,-sme-lutv2,-spe,-spe-eef,-specres2,+specrestrict,+ssbs,-ssve-fp8dot2,-ssve-fp8dot4,-ssve-fp8fma,+store-pair-suppress,-stp-aligned-only,-strict-align,-sve,-sve2,-sve2-aes,-sve2-bitperm,-sve2-sha3,-sve2-sm4,-sve2p1,-tagged-globals,-the,+tlb-rmi,-tlbiw,-tme,-tpidr-el1,-tpidr-el2,-tpidr-el3,-tpidrro-el0,+tracev8.4,-trbe,+uaops,-use-experimental-zeroing-pseudos,-use-postra-scheduler,-use-reciprocal-square-root,-use-scalar-inc-vl,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,-v8.7a,-v8.8a,-v8.9a,+v8a,-v8r,-v9.1a,-v9.2a,-v9.3a,-v9.4a,-v9.5a,-v9a,+vh,-wfxt,-xs,+zcm,+zcz,-zcz-fp-workaround,+zcz-gp" }
```
This is surprising, because shifting left lowers to an `ext` instruction as expected.
```zig
export fn shiftElementsLeft1(d: @Vector(16, u8)) @Vector(16, u8) {
return @shuffle(u8, d, @as(@Vector(16, u8), @splat(0)), [16]i32{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, -1 });
}
```
Assembly:
```asm
shiftElementsLeft1:
movi v1.2d, #0000000000000000
ext v0.16b, v0.16b, v1.16b, #1
ret
```
The LLVM IR is almost identical; only the shufflevector mask differs:
```llvm
%2 = shufflevector <16 x i8> %0, <16 x i8> <i8 0, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <16 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16>
```
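To make the expected lowering concrete, here is a small Python model of the two shuffles and of `ext`. It is only a sketch: the helper names `shufflevector` and `ext` are illustrative, the second shuffle operand is modelled as all zeros (the IR only reads its element 0), and the operand order and `#15` immediate for the right shift are my assumption about what the backend could emit, not verified compiler output.

```python
def shufflevector(a, b, mask):
    """LLVM-style shuffle: indices 0..15 select from a, 16..31 select from b."""
    both = a + b
    return [both[i] for i in mask]


def ext(vn, vm, imm):
    """Model of `ext vd.16b, vn.16b, vm.16b, #imm`:
    16 consecutive bytes of the pair (vn low, vm high), starting at byte imm."""
    return (vn + vm)[imm:imm + 16]


d = list(range(100, 116))   # arbitrary input bytes
zero = [0] * 16             # stands in for the zeroed second operand

right_mask = [16] + list(range(0, 15))   # mask from shiftElementsRight1
left_mask = list(range(1, 17))           # mask from shiftElementsLeft1

right = shufflevector(d, zero, right_mask)
left = shufflevector(d, zero, left_mask)

# The left shift matches the `ext ..., #1` that is already emitted.
assert left == ext(d, zero, 1)

# The right shift matches `ext` with the operands swapped and immediate 15.
assert right == ext(zero, d, 15)
```

Under this model, the right shift would correspond to something like `movi v1.2d, #0000000000000000` followed by `ext v0.16b, v1.16b, v0.16b, #15`, which avoids the constant-pool load that the `tbl` version needs.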
</pre>