<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/92211>92211</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Spurious optimization triggered by a `zext i16 %0 to i64` but not `and i64 %0, 65535`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Validark
</td>
</tr>
</table>
<pre>
I defined the following two Zig functions (https://zig.godbolt.org/z/3xfc5bjEc):
```zig
export fn foo(x: u64) u64 {
var y: u64 = @as(u16, @truncate(x));
y = (y | (y << 24));
y = (y | (y << 12));
return y;
}
export fn bar(x: u16) u64 {
var y: u64 = x;
y = (y | (y << 24));
y = (y | (y << 12));
return y;
}
```
Compiling for Neoverse N2 (AArch64), I get:
```asm
foo:
and x8, x0, #0xffff
orr x8, x8, x8, lsl #24
orr x0, x8, x8, lsl #12
ret
bar:
mov w8, w0
ubfiz x10, x0, #12, #32
orr x9, x8, x8, lsl #24
orr x8, x10, x8, lsl #36
orr x0, x8, x9
ret
```
Here is the LLVM IR:
```llvm
define dso_local i64 @foo(i64 %0) local_unnamed_addr {
Entry:
%1 = and i64 %0, 65535
%2 = mul nuw nsw i64 %1, 16777217
%3 = mul nuw nsw i64 %1, 68719480832
%4 = or i64 %3, %2
ret i64 %4
}
declare void @llvm.dbg.value(metadata, metadata, metadata) #1
define dso_local i64 @bar(i16 zeroext %0) local_unnamed_addr {
Entry:
%1 = zext i16 %0 to i64
%2 = mul nuw nsw i64 %1, 16777217
%3 = mul nuw nsw i64 %1, 68719480832
%4 = or i64 %3, %2
ret i64 %4
}
```
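The two `mul` constants are the shift-and-or chain folded together: 16777217 = 1 + 2^24 and 68719480832 = 2^12 + 2^36. The C source is not included in this report, so the following is only a minimal sketch, assuming a direct translation of the two Zig functions. The parameter types and the helper names `mul_form` and `forms_agree` are assumptions made for illustration, and the code assumes a 16-bit `unsigned short`. It checks that both functions match the multiply form from the IR for every 16-bit input.

```c
/* Hypothetical C translation of the Zig `foo`: mask to 16 bits, then shift-and-or. */
unsigned long long foo(unsigned long long x) {
    unsigned long long y = x & 0xFFFFULL;
    y |= y << 24;
    y |= y << 12;
    return y;
}

/* Hypothetical C translation of the Zig `bar`: zero-extend a 16-bit argument. */
unsigned long long bar(unsigned short x) {
    unsigned long long y = x;
    y |= y << 24;
    y |= y << 12;
    return y;
}

/* The form seen in the optimized IR: two multiplies combined with an `or`. */
unsigned long long mul_form(unsigned short x) {
    unsigned long long v = x;
    return (v * 68719480832ULL) | (v * 16777217ULL);
}

/* Returns 1 if all three forms agree on every 16-bit input, 0 otherwise. */
int forms_agree(void) {
    unsigned long i;
    for (i = 0; i != 0x10000UL; ++i) {
        unsigned short x = (unsigned short)i;
        if (foo(x) != bar(x) || bar(x) != mul_form(x)) return 0;
    }
    return 1;
}
```

Because the input is below 2^16, the two shifted copies inside each multiply never overlap, so each multiply is exactly one shift-and-or pair.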
Here is the LLVM IR produced by Clang for "equivalent" C code:
```llvm
define dso_local range(i64 0, 4503599627370496) i64 @foo(i64 noundef %x) local_unnamed_addr {
entry:
%and = and i64 %x, 65535
%or = mul nuw nsw i64 %and, 16777217
%shl1 = mul nuw nsw i64 %and, 68719480832
%or2 = or i64 %shl1, %or
ret i64 %or2
}
define dso_local range(i64 0, 4503599627370496) i64 @bar(i16 noundef %x) local_unnamed_addr {
entry:
%conv = zext i16 %x to i64
%or = mul nuw nsw i64 %conv, 16777217
%shl1 = mul nuw nsw i64 %conv, 68719480832
%or2 = or i64 %shl1, %or
ret i64 %or2
}
declare void @llvm.dbg.value(metadata, metadata, metadata) #1
```
And the assembly:
```asm
foo: // @foo
and x8, x0, #0xffff
orr x8, x8, x8, lsl #24
orr x0, x8, x8, lsl #12
ret
bar: // @bar
and x8, x0, #0xffff
orr x9, x8, x8, lsl #24
lsl x8, x8, #12
bfi x8, x0, #36, #16
orr x0, x8, x9
ret
```
On x86, compiling for Zen 4, I get:
```asm
foo:
movzx ecx, di
movabs rax, 68719480832
imul rax, rcx
mov rdx, rcx
shl rdx, 24
or rdx, rcx
or rax, rdx
ret
bar:
movabs rax, 68719480832
mov ecx, edi
mov rdx, rcx
shl rdx, 24
imul rax, rcx
or rdx, rcx
or rax, rdx
ret
```
It looks like LLVM is deciding that a multiply is less expensive than a `shl`. Also, why can't we use `shlx` here? I would think we could do:
```asm
shlx rax, rdi, 24
or rdi, rax
shlx rax, rdi, 12
or rax, rdi
```
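As a sanity check on the sequence above, here is a C model of the four instructions. It is only a sketch: the names `proposed_seq`, `reference`, and `seq_matches_reference` are made up for illustration, and it assumes a 16-bit `unsigned short` and that `rdi` already holds the zero-extended 16-bit value on entry. Note that `shlx` takes its shift count in a register rather than as an immediate, so the immediates here stand for a count held in a register (or for a plain `shl`).

```c
/* Models the four proposed instructions on 64-bit registers. */
unsigned long long proposed_seq(unsigned long long rdi) {
    unsigned long long rax;
    rax = rdi << 24;   /* shlx rax, rdi, 24 */
    rdi |= rax;        /* or   rdi, rax     */
    rax = rdi << 12;   /* shlx rax, rdi, 12 */
    rax |= rdi;        /* or   rax, rdi     */
    return rax;
}

/* Reference: the shift-and-or chain from the Zig source on a 16-bit input. */
unsigned long long reference(unsigned short x) {
    unsigned long long y = x;
    y |= y << 24;
    y |= y << 12;
    return y;
}

/* Returns 1 if the modeled sequence matches the reference for every
   input whose upper 48 bits are zero, 0 otherwise. */
int seq_matches_reference(void) {
    unsigned long i;
    for (i = 0; i != 0x10000UL; ++i) {
        if (proposed_seq(i) != reference((unsigned short)i)) return 0;
    }
    return 1;
}
```

The model only matches when the upper 48 bits of the input are zero, so `foo` would still need its mask (or a `movzx`) before the sequence.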
</pre>