<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/113965>113965</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [x86-64] Avoid usage of multi-uop CMOVBE/CMOVNBE 
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          daniel-zabawa
      </td>
    </tr>
</table>

<pre>
The CMOVBE/CMOVNBE instructions generate 2 uops and have a throughput of 1 per cycle on P-cores. Other CMOVs are a single uop with a throughput of 2 per cycle.

The following case shows the backend generating the more expensive CMOVBE/CMOVA instructions (CMOVA is an alias of CMOVNBE):

```
// file f.c
int f(int x) {
    if (x < 2)
        return x;
    long long int l = 1;
    long long int u = x;
    do {
        long long int m = (l + u) >> 1;
        if (m * m > x)
            u = m;
        else
            l = m;
    } while (l + 1 < u);
    return (int)l;
}
```

Compiling the above with trunk as `clang -O2 -march=core-avx2 -S f.c` generates:

```
f(int):
        mov     eax, edi
        cmp     edi, 2
        jl      .LBB0_3
        mov     ecx, eax
        mov     eax, 1
        mov     rdx, rcx
.LBB0_2:
        lea     rsi, [rdx + rax]
        sar     rsi
        mov     rdi, rsi
        imul    rdi, rsi
        cmp     rdi, rcx
        cmovbe  rax, rsi
        cmova   rdx, rsi
        lea     rsi, [rax + 1]
        cmp     rsi, rdx
        jl      .LBB0_2
.LBB0_3:
        ret
```

The `cmovge` and `cmovl` instructions should be preferred to these where possible.
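
When trying out a different instruction selection for this loop, it may help to confirm that the function still returns the same values. The following is a minimal sketch, assuming `f` is meant to compute the floor of the square root for `x >= 2` (an inference from the code, not something stated in this report). The helpers `isqrt_ref` and `count_mismatches` are hypothetical names introduced only for this check.

```c
/* f is copied from the reproducer above. */
int f(int x) {
    if (x < 2)
        return x;
    long long int l = 1;
    long long int u = x;
    do {
        long long int m = (l + u) >> 1;
        if (m * m > x)
            u = m;
        else
            l = m;
    } while (l + 1 < u);
    return (int)l;
}

/* Hypothetical reference: linear search for the largest r with r*r <= x.
   Mirrors f's behavior of returning x unchanged when x < 2. */
int isqrt_ref(int x) {
    if (x < 2)
        return x;
    long long r = 1;
    while ((r + 1) * (r + 1) <= x)
        ++r;
    return (int)r;
}

/* Hypothetical helper: counts inputs in [lo, hi] where f and the
   reference disagree. A 64-bit loop variable avoids overflow at INT_MAX. */
int count_mismatches(int lo, int hi) {
    int bad = 0;
    for (long long x = lo; x <= hi; ++x) {
        if (f((int)x) != isqrt_ref((int)x))
            ++bad;
    }
    return bad;
}
```

Calling `count_mismatches` over a modest range from a small `main`, built once with the current compiler and once with a patched one, gives a quick functional check before looking at the generated assembly.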
</pre>