<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/73999>73999</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimizations when evaluating a 3-way switch with two possible destinations
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Lucretiel
      </td>
    </tr>
</table>

<pre>
    When compiling a 3-way switch with 2 possible destinations and an unreachable default block, LLVM misses some optimization opportunities and emits superfluous instructions.

# Version 1

Consider this IR:

```llvm
define internal fastcc void @branch1() unnamed_addr #0 {
start:
  ret void
}

define internal fastcc void @branch2() unnamed_addr #0 {
start:
  ret void
}

define void @start(i8 noundef %0) unnamed_addr #1 {
start:
  switch i8 %0, label %bb2 [
    i8 1, label %bb3
    i8 2, label %bb4
    i8 3, label %bb1
  ]

bb2:
 unreachable

bb3:
  tail call fastcc void @branch1()
  br label %bb5

bb4:
  tail call fastcc void @branch2()
  br label %bb5

bb1:
  tail call fastcc void @branch2()
  br label %bb5

bb5:
  ret void
}
```

Here, we have a function `start` that switches over its argument, calling `branch1()` if the argument is 1, and `branch2()` if it's 2 or 3. We assert that other possible arguments are unreachable. On LLVM 17.0.1 and the current trunk as of this writing, IR compiles to this assembly:

```asm
branch1: # @branch1
        retq
branch2:                                # @branch2
        retq
start:                                   # @start
        cmpb    $1, %dil
        je      branch1 # TAILCALL
        movzbl  %dil, %eax
        cmpl    $2, %eax
        jmp     branch2                         # TAILCALL
```

As expected, it creates a single test that `%dil == 1`, with a conditional jump to `branch1` and an unconditional jump to `branch2` otherwise. Oddly, it appears to create totally superfluous `movzbl + cmpl` instructions; this is probably related to the next bug we're about to see.

# Version 2

Consider this variation. The only change is that instead of branching like this: `1 => branch1(), 2 | 3 => branch2()`, we branch like this: `1 | 3 => branch1(), 2 => branch2()`:

```llvm
define internal fastcc void @branch1() unnamed_addr #0 {
start:
  ret void
}

define internal fastcc void @branch2() unnamed_addr #0 {
start:
  ret void
}

define void @start(i8 noundef %0) unnamed_addr #1 {
start:
  switch i8 %0, label %bb2 [
    i8 1, label %bb3
    i8 2, label %bb4
    i8 3, label %bb1
  ]

bb2:
  unreachable

bb3:
  tail call fastcc void @branch1()
  br label %bb5

bb4:
  tail call fastcc void @branch2()
  br label %bb5

bb1:
  tail call fastcc void @branch1()
  br label %bb5

bb5:
  ret void
}
```

On the same versions of LLVM as above, this produces this (worse) assembly:

```asm
branch1:                                # @branch1
        retq
branch2:                                # @branch2
        retq
start:                                  # @start
        cmpb    $1, %dil
        je      branch1 # TAILCALL
        movzbl  %dil, %eax
        cmpl    $2, %eax
        jne     branch1                         # TAILCALL
 jmp     branch2                         # TAILCALL
```

Instead of a single conditional jump, we now have two conditionals and two conditional jumps, even though the entire function could be reduced to a single conditional jump on `%dil == 2`. This appears to be the source of the superfluous instructions up in Version 1.

For context, this bug was discovered when exploring different optimizer behaviors related to Rust enums. The LLVM IR used above is based on the compiled output of Rust code approximately resembling this:

```rust
pub enum Value {
    A = 1,
    B = 2,
    C = 3,
}

pub fn start(value: Value) {
 match value {
        Value::A => thing1(),
        Value::B => thing1(),
        Value::C => thing2(),
 }
}

#[inline(never)] fn thing1() { std::hint::black_box(1); }
#[inline(never)] fn thing2() { std::hint::black_box(2); }
```

This bug appears in LLVM 17.0.1, and in the current trunk. This bug also appears in LLVM 16 and 15, but those versions generate less optimal assembly in general (several additional unecessary unconditional jumps resembling `.LBB2_2: jmp branch1`). The assembly was produced with godbolt using `-O3` ([example](https://godbolt.org/z/Ej6nohq6n)), and I was also able to reproduce this bug using rust code in the [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=7c89158202218a3586a0b0c17cfd3271) and on my local ARM mac, using the same rust code.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzsWN9v2zgS_muYl0ENiYps58EPsb3BBUjRQ1F0HwtKGltMKVJLUnbcv_4wpGxLTtwfi1vgDrtGEdci55shOTPfRwnn5FYjLli-ZPn6RnS-Nnbx1JUWvUR1U5jqsPi9Rg2laVqppN6CgOzdXhzA7aUva9hLXwOH1jgnC4VQofNSCy-NdiB0BUJDpy2KshZxfCM65aFQpvzK-Aqenj6_h0Y6hw6caRBM62UjvwUIMG1rrO-09BIjHjbSO3Bdi3ajOtM5kNp525XB5YQla5bc9395Bp_ROgJKhwMro52s0IKvpYPHjyy7H9lNk_hPqV0TH1W4kRpBao9WCwUb4XxZws7ICthtUlihyzplfM74HXRaiwarL6KqLDCeJcBmy4jjvLD-5A7Aog8gvePZehjHTznlf43To5Noy-dyDtp0usINMJ4nb3lMr3vsk0XOe-MVKFGgol9FwYHyr58JNCm9mJGNRvnF6O1oNLsYTY-jLB-tsyj4OcBBho7nZINFeCEVlEJ9__SPsws7jCIfw97-PCz_Bdj0r4HNfyJ3jiUztPwXWqTT2CPUYocgYNPpUKfApn1mTRPwtfB9hqADs0MLVOLCbrsGtScEWgc1Hza92OxpAnIDvsbTdJAu5g81i9N8PpovPeMzBxyMhWwCvyMI59D6GIrxNdpzRzsCU0Q4zJQJfNCxfaWzSTJJg0cKpeyspUi87fRXEA7MJnaavZVe6i1F9_ix76nowJs4TEE0hTpca0fC9d3ouAfZPdXdMAVPlUAfi_6PoQElPPzgM8Lj1_CO9f0jtDNgtBjBlU1bxBm34cAYzyupxnOeMX736wton-4fn1b3T0_jmY3ZfSsUHFEiHoqXVz5V75Nfm_PctAOf_LsrG8fyZhXcO8CXFkuPFXmUHkqLwhOdgZN6qxA8uj73yDYsAFi2Ztka0vBkFXlWQGl0JamChILnrmkpdwZFMU3OnPvdqZymhkTfS0eZXFXq0Icn2haFDWkZIwVvvFDqMGJdNk36LWd8GfY1FNeAi1m2jHktHbTWFKJQB7CohMcq5jyCxhcPRbeFPTI-swiiMJ2nUYd4lcz5dTLfCSuDdJjApxrBaHWAshZ6ixRG2GOKEUVFVRk3gzqLkl8xIISimiZpPIDfYNxv-Ao4sNkKsvH4ub_0DS8-fgv2tfEY_ArsPxLl7y1R_oYa5Vei_bMa5YMOfciJBmEX20vg68DrwlE_2gURE7pLa03VlUTZ9Ivx-d5Yh5Rpf4a7f4WK_wep_f-Q2TWOfH5vZRex_JdFweOZg04q4JKweybRZh_ls9-b4Zx4Hb54GAwdWeIOKbVNt61DhqP20uJZgZemUxUUCBYppQMjXw0FomQfSxPSEMSypFzPkqHAWFCmsyVG5YtXr-vQtSD1-ZI-YvwHYykQjy_-VIBBKggHlXQl3RSwgn2NmhSWMpaIvJKbDQb53b9IQAsF1mInjXVD-fGxcx5Qd42LUiFU_ONH6BxWsexJMRSCfprYJXrFXoHpfNt5Wl1AKU2FtAfWvMhGeAw6J_QDiqin_zfbgu1cXzttV4Ro4LNQHZ55ipLqHqIU5KvzsyXEMxg-W4Vn2enZBW-Si42GI1_uyBFVevBITezstBFEhrvXodDnc2_Isvv7o17xpKLOQuaqwfJXDVYjA35pcF7ieK2MZyxfSq2kRsbnGndoyTBf0xYMndP6wPkququl9vF_hRLl1y-FeWF8npJpthx4-zE8_3l4_gr-rZ7x6VgAx2qTo-vn8c4r9etLaF-mwVg58xphGkzTnEAKEuC1cQNC3KJGSzcBhc7FwhLqxHkEE2cQN88dbQYNV6ce0mks0TlhD29cTNywWKinPC2X_EtgNOq657sN43exVE-OqRX0nFzFK9LWVIVRHjrXo737kNHFhA4jX-KLaFqFpLP4vPa-DYXJHxh_6A0nxm4Zf_jG-MNvz1Nt6j-mOiTc3XF_H4PXuI0FXd4MWOyDODep6N-eukN_Kixfho7xbyUOW0tq9a1YWiUOE7J9p4Te9iGx7KE_D5atnQ86kE8bUyHL1hYVClIiU4x7y7I1T3jK-HQrnWfZelbO79J8zhPO07nI8vlUJEVSprNyU2V8lgYRo0Oraw6gTCkU3H98D40oaeVxPSepdFrY5KZaZNVddiducJHOkjSbZ7e3s5t6gXk-28yKqkhmSVHms6zkSXWLeZFtsk2R5jdywROepWmWcJ4lt7MJJrwqkzKrMj6lP-w2wUZINaFLDu3CjXSuw8Usu7u7uwla0IV315xr3EMYZJyzfH1jF2Tzrui2jt0mSjrvziheeoWL99KF1j543ex6MqG-J_y1t91EuW--777prFpcZJX0dVdMStMw_hDuavHrXWvNM5ae8YcQtmP8ISzrPwEAAP__t0vdSA">