<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/65136>65136</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [libc++] `std::cmp_less` and other integer comparison functions could be improved
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Eisenwave
      </td>
    </tr>
</table>

<pre>
    The current implementation of the safe comparison functions is less not optimal:
https://github.com/llvm/llvm-project/blob/22044f0bde015f4bf53fca24831d030ff96efc51/libcxx/include/__utility/cmp.h#L65-L75

For two maximally wide integers (i.e. `unsigned long`), this is the best you can do. However, if a wider integer type is available, then all comparisons can be made safe by converting to the wider integer.

For example, a comparison `unsigned int < signed int` could be implemented through `long long < long long`, which is either zero-cost, or involves a sign extension for one of the operands. Indeed, if we compare `std::cmp_less` to such an extending implementation:
```cpp
bool std_less(unsigned int a, signed int b) { return std::cmp_less(a, b); }
bool ext_less(unsigned int a, signed int b) { return static_cast<long long>(a) < static_cast<long long>(b); }
```
The output for `ext_less` is slightly better, although it's not strikingly obvious. The additional `mov` and `setl` in `ext_less` are artifacts of the ABI, and `movsxd` could also disappear through inlining.
```asm
std_less(unsigned int, int):
        test    esi, esi
        setns   cl
        cmp     edi, esi
        setb al
        and     al, cl
        ret
ext_less(unsigned int, int):
        mov     eax, edi
        movsxd  rcx, esi
        cmp rax, rcx
        setl    al
        ret
```
When we add `__builtin_assume(a <= INT_MAX);` to both functions, the difference becomes more striking. `std_less` is completely unchanged, but `ext_less` becomes:
```asm
ext_less(unsigned int, int):
        cmp     edi, esi
        setl    al
        ret
```
This would turn `std::cmp_less` into a true zero-overhead utility, since LLVM could now always fall back onto a simple `cmp` instruction if it knows that the `unsigned` operand is restricted in its value range (which is quite plausible).

### Proposed Change
```diff
template<__is_safe_integral_cmp _Tp, __is_safe_integral_cmp _Up>
_LIBCPP_INLINE_VISIBILITY constexpr
bool cmp_less(_Tp __t, _Up __u) noexcept
{
+  if constexpr (sizeof(_Tp) <= sizeof(long long) && sizeof(_Up) <= sizeof(long long))
+    return static_cast<long long>(__t) < static_cast<long long>(__u);
-  if constexpr (is_signed_v<_Tp> == is_signed_v<_Up>)
+  else if constexpr (is_signed_v<_Tp> == is_signed_v<_Up>)
    return __t < __u;
  else if constexpr (is_signed_v<_Tp>)
    return __t < 0 ? true : make_unsigned_t<_Tp>(__t) < __u;
  else
    return __u < 0 ? false : __t < make_unsigned_t<_Up>(__u);
}
```
The same style of change could be applied to the `std::cmp_equal`.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJysV1tv47oR_jX0y2ANWZIV-8EPcbJGDeQsFmj2tH0SKHJksUuROiTly_76Yijfk2x3iwZCZIni3L6P35Dce7UxiAs2XbLp84j3obFu8Vl5NDu-xVFl5WHx2iCI3jk0AVTbaWzRBB6UNWBrCA2C5zWCsG3HnfLWQN0bQeMelAeN3oOxAWwXVMs1yx5Z8sySxyaEztNTumLpaqNC01djYVuWrrTenm6fOmf_jSKwdFVpW7F0laZJntdJJTGZTOu8qqdZLXiaz7KJTLKkrucF1mI6IQOqEvs9S1fKCN1LZOmqLPugtAoHlq5E240blmYvxfTTy8N0CGv4v7IOws5Cy_cUtD7ATkkEZQJu0Hlg6UyNcQysSHoTqyhBW7NhRcLSOUufIDQq5k8FqtAHONgeBDcg7Rj-Zne4RUffqRp4NO5O1iEcOqSpfMuV5pXGwR4a4FpfFdpHexVCy-URheoAwpotuqDMBoKN7m-sj-_TxD0nVMkHv0bxOjVlArDsCS6PrEhA2F5L8n-mBUoIjbP9pqHpVJBYlTj3_BRr9AS7RomG0kQVGnTwA539JKwPNGgp3q3VW_TAo1vAfUDjiXa1dWANnuhnO3TcSD-GtZGI8ljV3YmTSLH4IIlr2aNou5I4SQkEC74XDfCjdUlFu-X4ma4UdbxE1w1vKms1-CAHc-nsplqcorh6rlg6B_awBIehdwbexpPO4hz6kGVLYA_PV25wH_5HNzwoUQruA8ueLhBknwd_8wHWn331JqBzJYZH0gfbh64PERhWJOdgi4Tw9VptmqAPUGEIA-m5Dk1kiQosfRj0wQenviuz0Qew1VbZ3o-BbHMpFSHBNdlu7ZbMciMjqhh09GLu_RLs3AVVcxH8iSmPy3X0Pkxu7dbv5YXIXHsLUnnedcjdmcjKaGWU2Yzvsue-Hd58QIFIQ7rNzxyC418gPQAA9Iq-otvNuMdgPAAIfftetF28o_xwXgX8bhJlG--a5tybdBiGFx8w7CdZtHY7RMP3MRqp3oz7vQRwYv9utJSNG-bSJ_eZ6GPQ74d7R8J_kDruIlkI2rKseqWDMiX3vm-RyE5MZ9kzrL-8ln88_nNg9VEFKhuaS9s6yi1IVdfo0AhScGFb9NBah2emjo_Cck12khyNAfUBeiMabjaDIFV9uKfo0eZbhTkz6_cx-QWG_FZdX6mL7eLyiHrykZIqEyxwCK7HQcjtFl2DXMK53ZJOUSlfXv7847jijN0B1zt-8FBTa6u4-A52MOWjDpND0XaDCx9cHxEieVcBvhu7owbLQ4TrqmHR98e2QKA4JMhEiOUDFTxsue4RHMFDvfzci_7qVUDoNO-9io13ftMvWZoNF3x1trMeJTxFjO-KR8wZXgVsO80DsuypLJUvqUuXsRU7rkuCq3ztqDgfjX7rSIajrfJlvXz6-rVcf3lZf_lc_rn--3q5flm__ot6vg-479xVy7hqLOVrB2UZeVN-o589Kb-xuBfYnYB_WJ5yXAIV-GyTCuTVD7T1YOrYNWgtnV9fegYNpgVLi8sg5fBf5tB18Q6_1L5iRr_QwIZ0abFHD5_eZkeVj7wpt4TTK5UcWPZM4d6NDXDcRIva4__P5FX2ZTnsuiiBU_S_4e5nJhNg2WpYrix7hJZ_x_K0eMpwZeK6ym_jeMd-f2W_5hQrOTj5fcfRt-5dmH623fC8JRU-6LgHHFT2shvlXacV7UTtSRZuJAv_6jltGsYjucjkPJvzES4mxTzLZ-kkn46axazAqpoVk7qo8jTPBa_ztCiyKqkn03Q2kSO1SJM0S2bZZDLJ03Q6rrI6z3NZCD7J5YynLE-w5UqP6Qgztm4zUt73uCimk6wYaV6h9qdTl1vEc07VbzzLE6188JdpQQUdz2d0lGHpkq7p84cyTHpn4276dJZ491h2vXF3doty1Du9-O0DWczJs3QV0_pPAAAA___3JXZU">