<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/107065>107065</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [AArch64] A shufflevector that shifts vector elements to the right emits a `tbl` instruction rather than an `ext` instruction
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Validark
      </td>
    </tr>
</table>

<pre>
    This code:

```zig
export fn shiftElementsRight1(d: @Vector(16, u8)) @Vector(16, u8) {
    return @shuffle(u8, d, @as(@Vector(16, u8), @splat(0)), [16]i32{ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 });
}
```

produces this assembly when targeting Apple M3:

```asm
.LCPI0_0:
        .byte   255
        .byte   0
        .byte   1
        .byte   2
        .byte   3
        .byte   4
        .byte   5
        .byte   6
        .byte   7
        .byte   8
        .byte   9
        .byte   10
        .byte   11
        .byte   12
        .byte   13
        .byte   14
shiftElementsRight1:
        adrp    x8, .LCPI0_0
        ldr     q1, [x8, :lo12:.LCPI0_0]
        tbl     v0.16b, { v0.16b }, v1.16b
        ret
```
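
For reference, the `tbl` above should be replaceable by a zeroed register plus `ext v0.16b, v1.16b, v0.16b, #15`. The following is only a minimal Python model of the two instructions on 16-byte vectors, not compiler output; the helper names `tbl` and `ext` and the sample input are made up for illustration.

```python
# Hypothetical helpers: 16-byte NEON vectors are modelled as Python lists.

def tbl(table, indices):
    """Single-register TBL: an out-of-range index (e.g. 255) yields 0."""
    return [table[i] if i < len(table) else 0 for i in indices]

def ext(vn, vm, imm):
    """EXT: 16 consecutive bytes starting at byte `imm` of vn followed by vm."""
    return (vn + vm)[imm:imm + 16]

d = list(range(100, 116))               # arbitrary input bytes
zero = [0] * 16                         # what `movi v1.2d, #0` would produce
tbl_indices = [255] + list(range(15))   # the .LCPI0_0 constant above

via_tbl = tbl(d, tbl_indices)           # what the emitted code computes
via_ext = ext(zero, d, 15)              # models ext v0.16b, v1.16b, v0.16b, #15

print(via_tbl == via_ext)
```

Both paths give a zero in lane 0 followed by lanes 0 through 14 of the input, so the `ext` form would avoid the constant-pool load.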

LLVM IR emitted by Zig (`zig build-obj ./src/llvm_code.zig -O ReleaseFast -target aarch64-linux -mcpu apple_latest -femit-llvm-ir -fstrip`):

```llvm
; ModuleID = 'BitcodeBuffer'
source_filename = "llvm_code"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-musl"

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable
define dso_local <16 x i8> @shiftElementsRight1(<16 x i8> %0) local_unnamed_addr #0 {
  %2 = shufflevector <16 x i8> %0, <16 x i8> <i8 0, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <16 x i32> <i32 16, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14>
  ret <16 x i8> %2
}

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "frame-pointer"="none" "target-cpu"="apple-latest" "target-features"="-a510,-a520,-a65,-a710,-a720,-a76,-a78,-a78c,-addr-lsl-fast,+aes,-aggressive-fma,+alternate-sextload-cvt-f32-pattern,+altnzcv,-alu-lsl-fast,+am,+amvs,+arith-bcc-fusion,+arith-cbz-fusion,-ascend-store-address,-b16b16,-balance-fp-ops,+bf16,-brbe,+bti,-call-saved-x10,-call-saved-x11,-call-saved-x12,-call-saved-x13,-call-saved-x14,-call-saved-x15,-call-saved-x18,-call-saved-x8,-call-saved-x9,+ccdp,+ccidx,+ccpp,-chk,-clrbhb,-cmp-bcc-fusion,+complxnum,+CONTEXTIDREL2,-cortex-r82,-cpa,+crc,+crypto,-cssc,-d128,+disable-latency-sched-heuristic,-disable-ldp,-disable-stp,+dit,+dotprod,+ecv,+el2vmsa,+el3,-enable-select-opt,-ete,-exynos-cheap-as-move,-f32mm,-f64mm,-faminmax,+fgt,-fix-cortex-a53-835769,+flagm,-fmv,-force-32bit-jump-tables,+fp16fml,-fp8,-fp8dot2,-fp8dot4,-fp8fma,+fp-armv8,-fpmr,+fptoint,+fullfp16,+fuse-address,-fuse-addsub-2reg-const1,-fuse-adrp-add,+fuse-aes,+fuse-arith-logic,+fuse-crypto-eor,+fuse-csel,+fuse-literals,-gcs,-harden-sls-blr,-harden-sls-nocomdat,-harden-sls-retbr,-hbc,+hcx,+i8mm,-ite,+jsconv,-ldp-aligned-only,+lor,-ls64,+lse,-lse128,+lse2,-lut,-mec,-mops,+mpam,-mte,+neon,-nmi,-no-bti-at-return-twice,-no-neg-immediates,-no-sve-fp-ld1r,-no-zcz-fp,+nv,-outline-atomics,+pan,+pan-rwv,+pauth,-pauth-lr,+perfmon,-predictable-select-expensive,+predres,-prfm-slc-target,-rand,+ras,-rasv2,+rcpc,-rcpc3,+rcpc-immo,+rdm,-reserve-x1,-reserve-x10,-reserve-x11,-reserve-x12,-reserve-x13,-reserve-x14,-reserve-x15,-reserve-x18,-reserve-x2,-reserve-x20,-reserve-x21,-reserve-x22,-reserve-x23,-reserve-x24,-reserve-x25,-reserve-x26,-reserve-x27,-reserve-x28,-reserve-x3,-reserve-x30,-reserve-x4,-reserve-x5,-reserve-x6,-reserve-x7,-reserve-x9,-rme,+sb,+sel2,+sha2,+sha3,-slow-misaligned-128store,-slow-paired-128,-slow-strqro-store,-sm4,-sme,-sme2,-sme2p1,-sme-f16f16,-sme-f64f64,-sme-f8f16,-sme-f8f32,-sme-fa64,-sme-i16i64,-sme-lutv2,-spe,-spe-eef,-specres2,+specrestrict,+ssbs,-ssve-fp8dot2,-ssve-fp8dot4,-ssve-fp8fma,+store-pair-suppress,-stp-aligned-only,-strict-align,-sve,-sve2,-sve2-aes,-sve2-bitperm,-sve2-sha3,-sve2-sm4,-sve2p1,-tagged-globals,-the,+tlb-rmi,-tlbiw,-tme,-tpidr-el1,-tpidr-el2,-tpidr-el3,-tpidrro-el0,+tracev8.4,-trbe,+uaops,-use-experimental-zeroing-pseudos,-use-postra-scheduler,-use-reciprocal-square-root,-use-scalar-inc-vl,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,-v8.7a,-v8.8a,-v8.9a,+v8a,-v8r,-v9.1a,-v9.2a,-v9.3a,-v9.4a,-v9.5a,-v9a,+vh,-wfxt,-xs,+zcm,+zcz,-zcz-fp-workaround,+zcz-gp" }
```

This is surprising, because the equivalent left shift lowers to `ext` just fine.

```zig
export fn shiftElementsLeft1(d: @Vector(16, u8)) @Vector(16, u8) {
    return @shuffle(u8, d, @as(@Vector(16, u8), @splat(0)), [16]i32{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, -1 });
}
```

Assembly:

```asm
shiftElementsLeft1:
        movi    v1.2d, #0000000000000000
        ext     v0.16b, v0.16b, v1.16b, #1
        ret
```

The LLVM IR is almost identical; only the shuffle mask differs:

```llvm
  %2 = shufflevector <16 x i8> %0, <16 x i8> <i8 0, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <16 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16>
```
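
One possible explanation for the asymmetry, which is an assumption here and not something confirmed in this report: an `ext` reads a contiguous window of the two concatenated sources, so its shuffle mask is a run of consecutive indices that may wrap modulo 32. The left-shift mask has that shape. The right-shift mask does not, because its first index is 16 (the only non-poison lane of the constant operand) rather than 31. The sketch below just checks the masks; `is_consecutive_mask` is a hypothetical helper, not LLVM's actual matcher.

```python
def is_consecutive_mask(mask, num_lanes=16):
    """True if each index is the previous one plus 1, modulo 2 * num_lanes."""
    wrap = 2 * num_lanes
    return all((a + 1) % wrap == b for a, b in zip(mask, mask[1:]))

left_mask = list(range(1, 17))            # <1, 2, ..., 15, 16>
right_mask = [16] + list(range(15))       # <16, 0, 1, ..., 14>
right_mask_alt = [31] + list(range(15))   # <31, 0, 1, ..., 14>

print(is_consecutive_mask(left_mask))
print(is_consecutive_mask(right_mask))
print(is_consecutive_mask(right_mask_alt))
```

If this is what happens, recognizing that lane 16 holds the same zero a fully zeroed operand would hold in lane 31 would let the right-shift case be treated as the `<31, 0, ..., 14>` window.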

</pre>