<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/107404>107404</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Bitcasting to u64 to multiply by 1 should be optimized out earlier
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Validark
</td>
</tr>
</table>
<pre>
In my real code, I have some logic that compiles down to this ([Godbolt link](https://zig.godbolt.org/z/Kv87xTWG8)):
```zig
fn uzp2(x: @Vector(16, u8)) @Vector(16, u8) {
return @shuffle(u8, x, undefined, [_]i32{ 1, 3, 5, 7, 9, 11, 13, 15, 1, 3, 5, 7, 9, 11, 13, 15 });
}
fn neg(x: anytype) @TypeOf(x) {
return @as(@TypeOf(x), @splat(1)) +% ~x;
}
export fn foo(x: @Vector(16, u8)) u8 {
const a = @as([2]u64, @bitCast(uzp2(neg(x))))[0] *% 1;
return @as([8]u8, @bitCast(a))[7];
}
```
Compiled for the Apple M3, we get:
```asm
foo:
neg v0.16b, v0.16b
dup v0.16b, v0.b[15]
fmov x8, d0
lsr x0, x8, #56
ret
```
We can help the compiler by bitcasting to `[2][8]u8` instead of `[2]u64`, which drops the `u64` round trip and the `*% 1`:
```zig
export fn bar(x: @Vector(16, u8)) u8 {
const a = @as([2][8]u8, @bitCast(uzp2(neg(x))))[0];
return @as([8]u8, @bitCast(a))[7];
}
```
```asm
bar:
umov w8, v0.b[15]
neg w0, w8
ret
```
This same issue is present on x86-64 (znver4):
```asm
foo:
vpshufd xmm0, xmm0, 238
vpxor xmm1, xmm1, xmm1
vpsubb xmm0, xmm1, xmm0
vmovq rax, xmm0
shr rax, 56
ret
bar:
vpextrb eax, xmm0, 15
neg al
ret
```
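As a sanity check that `foo` and `bar` compute the same value, here is a minimal sketch of a test. It assumes a Zig version with the same builtin syntax as the snippets above (0.11 or later), a little-endian target, and a run via `zig test`. The functions are copied from above with `export` dropped; the test name and input pattern are illustrative only.

```zig
const std = @import("std");

fn uzp2(x: @Vector(16, u8)) @Vector(16, u8) {
    return @shuffle(u8, x, undefined, [_]i32{ 1, 3, 5, 7, 9, 11, 13, 15, 1, 3, 5, 7, 9, 11, 13, 15 });
}

fn neg(x: anytype) @TypeOf(x) {
    return @as(@TypeOf(x), @splat(1)) +% ~x;
}

// Variant that round-trips through u64 and multiplies by 1.
fn foo(x: @Vector(16, u8)) u8 {
    const a = @as([2]u64, @bitCast(uzp2(neg(x))))[0] *% 1;
    return @as([8]u8, @bitCast(a))[7];
}

// Variant that stays in bytes the whole way.
fn bar(x: @Vector(16, u8)) u8 {
    const a = @as([2][8]u8, @bitCast(uzp2(neg(x))))[0];
    return @as([8]u8, @bitCast(a))[7];
}

test "foo and bar both return the negated last lane" {
    for (0..256) |v| {
        const base: u8 = @intCast(v);
        // Give every lane a different value so the lane index matters.
        var lanes: [16]u8 = undefined;
        for (&lanes, 0..) |*lane, i| {
            lane.* = base +% @as(u8, @intCast(i));
        }
        const x: @Vector(16, u8) = lanes;
        // Byte 7 of the shuffled vector is source lane 15, negated.
        const expected = @as(u8, 0) -% lanes[15];
        try std.testing.expectEqual(expected, foo(x));
        try std.testing.expectEqual(expected, bar(x));
    }
}
```

Both functions reduce to a two's-complement negation of lane 15, which is what the shorter `bar` listings compute directly.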
Pre-optimized LLVM IR ([Godbolt link](https://llvm.godbolt.org/z/d6E4c9fd6)) via `zig build-obj ./src/llvm_code.zig -O ReleaseFast -target aarch64-linux -mcpu apple_latest --verbose-llvm-ir -fstrip >llvm_code.ll 2>&1`:
```llvm
; Function Attrs: nounwind uwtable nosanitize_coverage skipprofile
define dso_local i8 @foo(<16 x i8> %0) #0 {
1:
%2 = alloca [8 x i8], align 8
%3 = alloca [16 x i8], align 8
%4 = call fastcc <16 x i8> @llvm_code.neg__anon_1457(<16 x i8> %0)
%5 = call fastcc <16 x i8> @llvm_code.uzp2(<16 x i8> %4)
store <16 x i8> %5, ptr %3, align 8
%6 = getelementptr inbounds [2 x i64], ptr %3, i64 0, i64 0
%7 = load i64, ptr %6
store i64 %7, ptr %2, align 8
%8 = getelementptr inbounds [8 x i8], ptr %2, i64 0, i64 7
%9 = load i8, ptr %8
ret i8 %9
}
; Function Attrs: nounwind uwtable nosanitize_coverage skipprofile
define internal fastcc <16 x i8> @llvm_code.neg__anon_1457(<16 x i8> %0) unnamed_addr #0 {
1:
%2 = xor <16 x i8> %0, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
%3 = add <16 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, %2
ret <16 x i8> %3
}
; Function Attrs: nounwind uwtable nosanitize_coverage skipprofile
define internal fastcc <16 x i8> @llvm_code.uzp2(<16 x i8> %0) unnamed_addr #0 {
1:
%2 = shufflevector <16 x i8> %0, <16 x i8> undef, <16 x i32> <i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15, i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15>
ret <16 x i8> %2
}
; Function Attrs: nounwind uwtable nosanitize_coverage skipprofile
define dso_local i8 @bar(<16 x i8> %0) #0 {
1:
%2 = alloca [16 x i8], align 1
%3 = call fastcc <16 x i8> @llvm_code.neg__anon_1457(<16 x i8> %0)
%4 = call fastcc <16 x i8> @llvm_code.uzp2(<16 x i8> %3)
store <16 x i8> %4, ptr %2, align 1
%5 = getelementptr inbounds [2 x [8 x i8]], ptr %2, i64 0, i64 0
%6 = getelementptr inbounds [8 x i8], ptr %5, i64 0, i64 7
%7 = load i8, ptr %6
ret i8 %7
}
```
</pre>