<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/73999>73999</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missed optimizations when evaluating a 3-way switch with two possible destinations
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Lucretiel
</td>
</tr>
</table>
<pre>
When compiling a 3-way switch with 2 possible destinations and an unreachable default block, LLVM misses some optimization opportunities and emits superfluous instructions.
# Version 1
Consider this IR:
```llvm
define internal fastcc void @branch1() unnamed_addr #0 {
start:
ret void
}
define internal fastcc void @branch2() unnamed_addr #0 {
start:
ret void
}
define void @start(i8 noundef %0) unnamed_addr #1 {
start:
switch i8 %0, label %bb2 [
i8 1, label %bb3
i8 2, label %bb4
i8 3, label %bb1
]
bb2:
unreachable
bb3:
tail call fastcc void @branch1()
br label %bb5
bb4:
tail call fastcc void @branch2()
br label %bb5
bb1:
tail call fastcc void @branch2()
br label %bb5
bb5:
ret void
}
```
Here, we have a function `start` that switches over its argument, calling `branch1()` if the argument is 1, and `branch2()` if it's 2 or 3. We assert that other possible arguments are unreachable. On LLVM 17.0.1 and the current trunk as of this writing, IR compiles to this assembly:
```asm
branch1: # @branch1
retq
branch2: # @branch2
retq
start: # @start
cmpb $1, %dil
je branch1 # TAILCALL
movzbl %dil, %eax
cmpl $2, %eax
jmp branch2 # TAILCALL
```
As expected, it creates a single test that `%dil == 1`, with a conditional jump to `branch1` and an unconditional jump to `branch2` otherwise. Oddly, it appears to create totally superfluous `movzbl + cmpl` instructions; this is probably related to the next bug we're about to see.
# Version 2
Consider this variation. The only change is that instead of branching like this: `1 => branch1(), 2 | 3 => branch2()`, we branch like this: `1 | 3 => branch1(), 2 => branch2()`:
```llvm
define internal fastcc void @branch1() unnamed_addr #0 {
start:
ret void
}
define internal fastcc void @branch2() unnamed_addr #0 {
start:
ret void
}
define void @start(i8 noundef %0) unnamed_addr #1 {
start:
switch i8 %0, label %bb2 [
i8 1, label %bb3
i8 2, label %bb4
i8 3, label %bb1
]
bb2:
unreachable
bb3:
tail call fastcc void @branch1()
br label %bb5
bb4:
tail call fastcc void @branch2()
br label %bb5
bb1:
tail call fastcc void @branch1()
br label %bb5
bb5:
ret void
}
```
On the same versions of LLVM as above, this produces this (worse) assembly:
```asm
branch1: # @branch1
retq
branch2: # @branch2
retq
start: # @start
cmpb $1, %dil
je branch1 # TAILCALL
movzbl %dil, %eax
cmpl $2, %eax
jne branch1 # TAILCALL
jmp branch2 # TAILCALL
```
Instead of a single conditional jump, we now have two conditionals and two conditional jumps, even though the entire function could be reduced to a single conditional jump on `%dil == 2`. This appears to be the source of the superfluous instructions up in Version 1.
For context, this bug was discovered when exploring different optimizer behaviors related to Rust enums. The LLVM IR used above is based on the compiled output of Rust code approximately resembling this:
```rust
pub enum Value {
A = 1,
B = 2,
C = 3,
}
pub fn start(value: Value) {
match value {
Value::A => thing1(),
Value::B => thing1(),
Value::C => thing2(),
}
}
#[inline(never)] fn thing1() { std::hint::black_box(1); }
#[inline(never)] fn thing2() { std::hint::black_box(2); }
```
This bug appears in LLVM 17.0.1, and in the current trunk. This bug also appears in LLVM 16 and 15, but those versions generate less optimal assembly in general (several additional unecessary unconditional jumps resembling `.LBB2_2: jmp branch1`). The assembly was produced with godbolt using `-O3` ([example](https://godbolt.org/z/Ej6nohq6n)), and I was also able to reproduce this bug using rust code in the [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=7c89158202218a3586a0b0c17cfd3271) and on my local ARM mac, using the same rust code.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzsWN9v2zgS_muYl0ENiYps58EPsb3BBUjRQ1F0HwtKGltMKVJLUnbcv_4wpGxLTtwfi1vgDrtGEdci55shOTPfRwnn5FYjLli-ZPn6RnS-Nnbx1JUWvUR1U5jqsPi9Rg2laVqppN6CgOzdXhzA7aUva9hLXwOH1jgnC4VQofNSCy-NdiB0BUJDpy2KshZxfCM65aFQpvzK-Aqenj6_h0Y6hw6caRBM62UjvwUIMG1rrO-09BIjHjbSO3Bdi3ajOtM5kNp525XB5YQla5bc9395Bp_ROgJKhwMro52s0IKvpYPHjyy7H9lNk_hPqV0TH1W4kRpBao9WCwUb4XxZws7ICthtUlihyzplfM74HXRaiwarL6KqLDCeJcBmy4jjvLD-5A7Aog8gvePZehjHTznlf43To5Noy-dyDtp0usINMJ4nb3lMr3vsk0XOe-MVKFGgol9FwYHyr58JNCm9mJGNRvnF6O1oNLsYTY-jLB-tsyj4OcBBho7nZINFeCEVlEJ9__SPsws7jCIfw97-PCz_Bdj0r4HNfyJ3jiUztPwXWqTT2CPUYocgYNPpUKfApn1mTRPwtfB9hqADs0MLVOLCbrsGtScEWgc1Hza92OxpAnIDvsbTdJAu5g81i9N8PpovPeMzBxyMhWwCvyMI59D6GIrxNdpzRzsCU0Q4zJQJfNCxfaWzSTJJg0cKpeyspUi87fRXEA7MJnaavZVe6i1F9_ix76nowJs4TEE0hTpca0fC9d3ouAfZPdXdMAVPlUAfi_6PoQElPPzgM8Lj1_CO9f0jtDNgtBjBlU1bxBm34cAYzyupxnOeMX736wton-4fn1b3T0_jmY3ZfSsUHFEiHoqXVz5V75Nfm_PctAOf_LsrG8fyZhXcO8CXFkuPFXmUHkqLwhOdgZN6qxA8uj73yDYsAFi2Ztka0vBkFXlWQGl0JamChILnrmkpdwZFMU3OnPvdqZymhkTfS0eZXFXq0Icn2haFDWkZIwVvvFDqMGJdNk36LWd8GfY1FNeAi1m2jHktHbTWFKJQB7CohMcq5jyCxhcPRbeFPTI-swiiMJ2nUYd4lcz5dTLfCSuDdJjApxrBaHWAshZ6ixRG2GOKEUVFVRk3gzqLkl8xIISimiZpPIDfYNxv-Ao4sNkKsvH4ub_0DS8-fgv2tfEY_ArsPxLl7y1R_oYa5Vei_bMa5YMOfciJBmEX20vg68DrwlE_2gURE7pLa03VlUTZ9Ivx-d5Yh5Rpf4a7f4WK_wep_f-Q2TWOfH5vZRex_JdFweOZg04q4JKweybRZh_ls9-b4Zx4Hb54GAwdWeIOKbVNt61DhqP20uJZgZemUxUUCBYppQMjXw0FomQfSxPSEMSypFzPkqHAWFCmsyVG5YtXr-vQtSD1-ZI-YvwHYykQjy_-VIBBKggHlXQl3RSwgn2NmhSWMpaIvJKbDQb53b9IQAsF1mInjXVD-fGxcx5Qd42LUiFU_ONH6BxWsexJMRSCfprYJXrFXoHpfNt5Wl1AKU2FtAfWvMhGeAw6J_QDiqin_zfbgu1cXzttV4Ro4LNQHZ55ipLqHqIU5KvzsyXEMxg-W4Vn2enZBW-Si42GI1_uyBFVevBITezstBFEhrvXodDnc2_Isvv7o17xpKLOQuaqwfJXDVYjA35pcF7ieK2MZyxfSq2kRsbnGndoyTBf0xYMndP6wPkququl9vF_hRLl1y-FeWF8npJpthx4-zE8_3l4_gr-rZ7x6VgAx2qTo-vn8c4r9etLaF-mwVg58xphGkzTnEAKEuC1cQNC3KJGSzcBhc7FwhLqxHkEE2cQN88dbQYNV6ce0mks0TlhD29cTNywWKinPC2X_EtgNOq657sN43exVE-OqRX0nFzFK9LWVIVRHjrXo737kNHFhA4jX-KLaFqFpLP4vPa-DYXJHxh_6A0nxm4Zf_jG-MNvz1Nt6j-mOiTc3XF_H4PXuI0FXd4MWOyDODep6N-eukN_Kixfho7xbyUOW0tq9a1YWiUOE7J9p4Te9iGx7KE_D5atnQ86kE8bUyHL1hYVClIiU4x7y7I1T3jK-HQrnWfZelbO79J8zhPO07nI8vlUJEVSprNyU2V8lgYRo0Oraw6gTCkU3H98D40oaeVxPSepdFrY5KZaZNVddiducJHOkjSbZ7e3s5t6gXk-28yKqkhmSVHms6zkSXWLeZFtsk2R5jdywROepWmWcJ4lt7MJJrwqkzKrMj6lP-w2wUZINaFLDu3CjXSuw8Usu7u7uwla0IV315xr3EMYZJyzfH1jF2Tzrui2jt0mSjrvziheeoWL99KF1j543ex6MqG-J_y1t91EuW--777prFpcZJX0dVdMStMw_hDuavHrXWvNM5ae8YcQtmP8ISzrPwEAAP__t0vdSA">