<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/86317>86317</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missing chance to avoid unnecessary branching
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
christianparpart
</td>
</tr>
</table>
<pre>
Hi,
the following minimal example is stripped down from my own code base:
```cpp
State make_state1(char const* input/*, char const* end*/) noexcept
{
    if (*input < 0x20)
        return State::C0;
    if (*input & 0x80)
        return State::ComplexUnicode;
    return State::USASCII;
}
```
```asm
xor eax, eax
cmp byte ptr [rdi], 32
setge al
inc eax
ret
```
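A side note on the listing above: it contains no trace of the `& 0x80` test. On x86-64, `char` is signed by default, so every byte with the high bit set is negative, compares less than 0x20, and returns `State::C0` before the `ComplexUnicode` check is reached (the signed `setge` reflects this). Below is a minimal sketch of a variant that classifies the byte as unsigned instead. The name `make_state1_unsigned` is hypothetical, and the enumerator values are an assumption inferred from the assembly in this report, since the definition of `State` is not shown.

```cpp
// Enumerator values are an assumption inferred from the assembly in this
// report (End = 0, C0 = 1, USASCII = 2, ComplexUnicode = 3).
enum class State : int { End = 0, C0 = 1, USASCII = 2, ComplexUnicode = 3 };

// Hypothetical variant of make_state1 (not from the report): the byte is
// converted to unsigned char before classification, so values 0x80..0xFF
// are no longer negative and can reach the ComplexUnicode check.
// constexpr is added only so the checks below run at compile time.
constexpr State make_state1_unsigned(char const* input) noexcept
{
    unsigned char const byte = *input; // well-defined modular conversion
    if (byte < 0x20)
        return State::C0;
    if (byte & 0x80)
        return State::ComplexUnicode;
    return State::USASCII;
}

static_assert(make_state1_unsigned("\n") == State::C0, "control byte");
static_assert(make_state1_unsigned("A") == State::USASCII, "ASCII byte");
static_assert(make_state1_unsigned("\xC3") == State::ComplexUnicode, "high-bit byte");
```

Whether this matches the intent of the original code base is a guess; the branching question in this report is independent of it.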
This used to generate a CMOV in my own code-base, so this is not entirely how it looked for me, but it remains branchless at least :)
I then extended it to also take care of another condition `input == end`:
```cpp
State make_state2(char const* input, char const* end)
{
    if (input == end)
        return State::End;
    if (*input < 0x20)
        return State::C0;
    if (*input & 0x80)
        return State::ComplexUnicode;
    return State::USASCII;
}
```
```asm
xor eax, eax
cmp rdi, rsi
je .LBB1_2
cmp byte ptr [rdi], 32
setge al
inc eax
.LBB1_2:
ret
```
Now the code sadly uses a branch, which I was trying to avoid (this is a hot path in my own code base).
Rewriting the code with `CMOV` in mind:
```cpp
State make_state3(char const* input, char const* end)
{
    State s = State::USASCII;
    if (input == end)
        s = State::End;
    if (*input < 0x20)
        s = State::C0;
    if (*input & 0x80)
        s = State::ComplexUnicode;
    return s;
}
```
```asm
xor eax, eax
cmp rdi, rsi
setne al
add eax, eax
movzx ecx, byte ptr [rdi]
cmp cl, 32
mov edx, 1
cmovge edx, eax
test cl, cl
mov eax, 3
cmovns eax, edx
ret
```
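One caveat worth spelling out: `make_state2` and `make_state3` do not behave the same when `input == end`. The second version never reads `*input` in that case, while the third always reads it and may overwrite `State::End` with a later result. This may be part of why the optimizer keeps the branch, since hoisting the load above the `end` check would require knowing that the pointer is dereferenceable. The sketch below illustrates the difference. The enumerator values are an assumption inferred from the assembly above, `buf` is a hypothetical test buffer, and `constexpr` is added only so the checks can run at compile time.

```cpp
// Enumerator values are an assumption inferred from the assembly above.
enum class State : int { End = 0, C0 = 1, USASCII = 2, ComplexUnicode = 3 };

// Same logic as make_state2 in this report: the load is guarded by the
// end check.
constexpr State make_state2(char const* input, char const* end)
{
    if (input == end)
        return State::End;
    if (*input < 0x20)
        return State::C0;
    if (*input & 0x80)
        return State::ComplexUnicode;
    return State::USASCII;
}

// Same logic as make_state3 in this report: *input is read unconditionally.
constexpr State make_state3(char const* input, char const* end)
{
    State s = State::USASCII;
    if (input == end)
        s = State::End;
    if (*input < 0x20)
        s = State::C0;
    if (*input & 0x80)
        s = State::ComplexUnicode;
    return s;
}

// Hypothetical test buffer: `end` deliberately points at a readable byte
// ('\n') rather than one past the array, so reading *end is well-defined.
constexpr char buf[] = "A\n";

static_assert(make_state2(buf + 1, buf + 1) == State::End, "guarded version reports End");
static_assert(make_state3(buf + 1, buf + 1) == State::C0, "unguarded version overwrites End");
static_assert(make_state2(buf, buf + 1) == make_state3(buf, buf + 1), "both agree when input != end");
```

If `end` were a true one-past-the-end pointer, the unconditional read in `make_state3` would be undefined behavior, so a branchless form of case 2 would presumably still need to select `State::End` last and only load from memory known to be readable.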
### Minimal Link to Godbolt
https://godbolt.org/z/9xnbE34P8
### My less-minimal test case to Godbolt:
https://godbolt.org/z/PKhsbM3Mv
## Motivation
This clearly seems to be a missed optimization opportunity: case 2 could yield assembly that looks like case 3. It would be really nice if the LLVM IR optimizations could be adapted to handle such cases.
Many thanks for this great product,
Cheers.
</pre>