<table border="1" cellspacing="0" cellpadding="8">

    <tr>

        <th>Issue</th>

        <td>

            <a href="https://github.com/llvm/llvm-project/issues/108418">108418</a>

        </td>

    </tr>

    <tr>

        <th>Summary</th>

        <td>

            Comparing `__uint128_t`s generates significantly better code than comparing pairs of `uint64_t`s

        </td>

    </tr>

    <tr>

      <th>Labels</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Assignees</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Reporter</th>

      <td>

          pkasting

      </td>

    </tr>

</table>

<pre>

    See https://godbolt.org/z/1ajsxs8n5 for examples.

On x86-64 at -O2, every implementation I tried of comparing pairs of `uint64_t`s generates noticeably worse code (along all axes) than comparing single `__uint128_t`s. It seems the former could be optimized into a form close to the latter, at least in many cases.

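For readers without the Godbolt link handy, here is a minimal sketch of the two shapes being compared. It is not the exact code from the link: `u64`, `Pair`, `less_pair`, `widen`, and `less_wide` are hypothetical names chosen for illustration, and `u64` is assumed to be 64 bits wide, as it is on x86-64. `__uint128_t` is the GCC/Clang built-in type named in the summary.

```cpp
// Hypothetical sketch; names are illustrative, not taken from the Godbolt link.
using u64 = unsigned long long;  // assumed to be 64 bits wide (true on x86-64)

struct Pair {
    u64 hi;  // most significant half
    u64 lo;  // least significant half
};

// Lexicographic "less than" written directly on the two 64-bit halves.
constexpr bool less_pair(Pair a, Pair b) {
    return a.hi != b.hi ? a.hi < b.hi : a.lo < b.lo;
}

// Pack the two halves into the compiler-provided 128-bit integer.
constexpr __uint128_t widen(Pair p) {
    return (__uint128_t(p.hi) << 64) | p.lo;
}

// The same ordering expressed as a single 128-bit comparison.
constexpr bool less_wide(Pair a, Pair b) {
    return widen(a) < widen(b);
}
```

Both functions compute the same ordering, so ideally the optimizer would lower `less_pair` to something close to what it emits for `less_wide`.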
</pre>
