<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/107392">107392</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64] inlining `uzp2` causes superfluous `ext` to be emitted
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Validark
</td>
</tr>
</table>
<pre>
This code:
```zig
fn uzp2(x: @Vector(16, u8)) @Vector(16, u8) {
return @shuffle(u8, x, undefined, [_]i32{1, 3, 5, 7, 9, 11, 13, 15, 1, 3, 5, 7, 9, 11, 13, 15});
}
fn fmov(x: @Vector(16, u8)) u64 {
return @bitCast(@as([2]@Vector(8, u8), @bitCast(x))[0]);
}
export fn lower_uzp2(x: @Vector(16, u8)) u64 {
return fmov(uzp2(x));
}
```
It compiles to this for the Apple M3:
```asm
lower_uzp2:
uzp2 v0.16b, v0.16b, v0.16b
fmov x0, d0
ret
```
However, if we inline both the `uzp2` and `fmov` functions (put an `inline` before `fn`), we get an extra `ext` instruction:
```asm
lower_uzp2:
ext v1.16b, v0.16b, v0.16b, #8
uzp2 v0.8b, v0.8b, v1.8b
fmov x0, d0
ret
```
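To show why the `ext` is superfluous, here is a minimal Python model of the two sequences. It assumes the usual Arm semantics for `uzp2` (odd-numbered lanes of the first source, then of the second) and `ext` (a byte window starting at `#imm` over the first source followed by the second). The helper names `uzp2` and `ext` and the list-of-lanes representation are illustrative only, not anything from LLVM or Zig.

```python
def uzp2(vn, vm):
    """Model of UZP2: odd-numbered lanes of vn, then odd-numbered lanes of vm."""
    return vn[1::2] + vm[1::2]


def ext(vn, vm, imm):
    """Model of EXT: len(vn) bytes starting at byte imm of vn followed by vm."""
    return (vn + vm)[imm:imm + len(vn)]


# Lane i of the input holds the value i, so results read as lane indices.
x = list(range(16))

# Non-inlined codegen: uzp2 v0.16b, v0.16b, v0.16b ; fmov x0, d0
direct = uzp2(x, x)[:8]  # fmov only reads the low 8 bytes (d0)

# Inlined codegen: ext v1.16b, v0.16b, v0.16b, #8 ; uzp2 v0.8b, v0.8b, v1.8b
v1 = ext(x, x, 8)
inlined = uzp2(x[:8], v1[:8])  # the .8b form reads only the low halves

assert direct == inlined == [1, 3, 5, 7, 9, 11, 13, 15]
```

Under this model both sequences put the same eight bytes in `d0`, so the single 16-byte `uzp2` would be enough in the inlined case too.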
Here is the LLVM IR that Zig emits when both functions are inlined:
```llvm
; Function Attrs: nounwind uwtable nosanitize_coverage skipprofile
define dso_local i64 @lower_uzp2(<16 x i8> %0) #0 {
1:
%2 = alloca [16 x i8], align 8
%3 = shufflevector <16 x i8> %0, <16 x i8> undef, <16 x i32> <i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15, i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15>
store <16 x i8> %3, ptr %2, align 8
%4 = getelementptr inbounds [2 x <8 x i8>], ptr %2, i64 0, i64 0
%5 = load <8 x i8>, ptr %4
%6 = bitcast <8 x i8> %5 to i64
ret i64 %6
}
```
</pre>