<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/94157>94157</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Failure to eliminate branch that is impossible due to `assume`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Kmeakin
</td>
</tr>
</table>
<pre>
These two Rust functions both compute the number of bytes in a UTF-8 codepoint from the leading byte (according to [this table](https://en.wikipedia.org/wiki/UTF-8#Codepage_layout) from Wikipedia).
Continuation bytes (`0x80..=0xBF`), overlong encodings (`0xC0..=0xC1`), and bytes that would signal an out-of-range codepoint (`0xF5..=0xFF`) are assumed impossible.
The `_ => 0` branch in `src` should be unreachable due to the `assert_unchecked` at the start of the function, but it is not removed.
[godbolt](https://godbolt.org/z/MdzejKfvY), [alive](https://alive2.llvm.org/ce/z/r9PvMK)
```rust
#![no_std]
#![feature(hint_assert_unchecked)]
use core::hint::assert_unchecked;
use core::hint::unreachable_unchecked;
#[no_mangle]
pub unsafe fn src(c: u8) -> usize {
    assert_unchecked(!matches!(c, 0x80..=0xBF | 0xC0..=0xC1 | 0xF5..=0xFF));
    match c {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0,
    }
}
#[no_mangle]
pub unsafe fn tgt(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        // overlong encoding
        0xC0..=0xC1 => unsafe { unreachable_unchecked() },
        // continuation bytes
        0x80..=0xBF => unsafe { unreachable_unchecked() },
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        // codepoint too high
        0xF5..=0xFF => unsafe { unreachable_unchecked() },
    }
}
```
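The claim that `_ => 0` is unreachable under the assumption can be checked exhaustively over all 256 byte values. The following is a minimal safe sketch, not part of the original reproducer: the helper names `is_excluded`, `len_from_lead`, and `fallback_only_for_excluded` are hypothetical, and `len_from_lead` is the same match as `src` with the `assert_unchecked` dropped so that it can be called on any byte.

```rust
/// Bytes the report assumes can never appear as a leading byte
/// (same pattern as the `assert_unchecked` condition in `src`).
fn is_excluded(c: u8) -> bool {
    matches!(c, 0x80..=0xBF | 0xC0..=0xC1 | 0xF5..=0xFF)
}

/// Same match as `src`, without the assumption, so it is safe for any input.
fn len_from_lead(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0,
    }
}

/// True if the fallback arm is taken for exactly the excluded bytes,
/// i.e. the assumption makes the fallback arm dead.
fn fallback_only_for_excluded() -> bool {
    (0u8..=0xFF).all(|c| (len_from_lead(c) == 0) == is_excluded(c))
}

fn main() {
    assert!(fallback_only_for_excluded());
}
```

Since the excluded ranges and the four explicit arms partition `0x00..=0xFF`, every byte that reaches the fallback arm violates the assumption, which is why `src` and `tgt` should compile to the same code.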
</pre>