<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/86873>86873</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Failure to convert branchy code to branchless
</td>
</tr>
<tr>
<th>Labels</th>
<td>
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Kmeakin
</td>
</tr>
</table>
<pre>
https://godbolt.org/z/cn3d5fGs7
Consider these three functions for computing the length of a UTF-8 code point from its leading byte. They return the same result for every valid leading byte:
```rust
#[no_mangle]
fn len_utf8_match(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}
#[no_mangle]
fn len_utf8_branchless(c: u8) -> usize {
    let mut ret = 1;
    if (c & 0b1100_0000) == 0b1100_0000 {
        ret = 2;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        ret = 3;
    }
    if (c & 0b1111_0000) == 0b1111_0000 {
        ret = 4;
    }
    ret
}
#[no_mangle]
fn len_utf8_branchy(c: u8) -> usize {
    if (c & 0b1111_0000) == 0b1111_0000 {
        return 4;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        return 3;
    }
    if (c & 0b1100_0000) == 0b1100_0000 {
        return 2;
    }
    1
}
```
For aarch64, `len_utf8_branchless` is the clear winner. For x86_64 and RISC-V 64, I think the best results come from `len_utf8_branchless` and `len_utf8_branchy`.
In any case, `len_utf8_branchless` and `len_utf8_branchy` are equivalent, so identical assembly should be produced for both.
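
The equivalence claim can be checked exhaustively, since there are only 256 possible inputs. The following is a minimal sketch, not part of the original report: the three functions are copied from above without `#[no_mangle]`, and `mismatches` is a hypothetical helper introduced only for this illustration.

```rust
// Copies of the three functions from the report (attributes dropped).
fn len_utf8_match(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

fn len_utf8_branchless(c: u8) -> usize {
    let mut ret = 1;
    if (c & 0b1100_0000) == 0b1100_0000 {
        ret = 2;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        ret = 3;
    }
    if (c & 0b1111_0000) == 0b1111_0000 {
        ret = 4;
    }
    ret
}

fn len_utf8_branchy(c: u8) -> usize {
    if (c & 0b1111_0000) == 0b1111_0000 {
        return 4;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        return 3;
    }
    if (c & 0b1100_0000) == 0b1100_0000 {
        return 2;
    }
    1
}

// Hypothetical helper (an assumption for this sketch): collects every
// byte value on which the two functions return different results.
fn mismatches(f: fn(u8) -> usize, g: fn(u8) -> usize) -> Vec<u8> {
    (0..=u8::MAX).filter(|&c| f(c) != g(c)).collect()
}
```

Calling `mismatches(len_utf8_branchless, len_utf8_branchy)` yields an empty list, which supports the expectation of identical assembly for those two. `len_utf8_match` differs from both only on the continuation bytes `0x80..=0xBF`, where it returns 4 and the others return 1; these are not valid leading bytes.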
</pre>