<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/63709>63709</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Suboptimal codegen for 128-bit multiply
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
moncefmechri
</td>
</tr>
</table>
<pre>
https://godbolt.org/z/9PPPeqjTK
Codegen for the following code snippet ([which is a mixing step in boost.Unordered when a non-avalanching hash function is being used](https://github.com/boostorg/unordered/blob/9a7d1d336aaa73ad8e5f7c07bdb81b2e793f8d93/include/boost/unordered/detail/mulx.hpp#L111)) seems suboptimal:
```
#include <stdint.h>
uint64_t mulx64(uint64_t x)
{
__uint128_t r = (__uint128_t)x * 0x9E3779B97F4A7C15ull;
return (uint64_t)r ^ (uint64_t)( r >> 64 );
}
```
I believe the optimal codegen should be:
```
mulx64(unsigned long):
movabs rax, -7046029254386353131
mul rdi
xor rax, rdx
ret
```
GCC <= 10 is able to generate this (GCC >= 11 seems to regress, [which has already been reported](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110551)).
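For readers less familiar with the one-operand `mul`: it multiplies `rax` by its operand and leaves the low 64 bits of the product in `rax` and the high 64 bits in `rdx`, so the final `xor` folds the two halves together. Below is a minimal sketch of the same computation without a 128-bit type. The names `mulx64_ref` and `mulx64_portable` are hypothetical and only for illustration; this is not what boost.Unordered or either compiler uses.

```c
#include <stdint.h>

/* Same computation as the snippet in this report (needs __uint128_t,
   a GCC/clang extension). */
uint64_t mulx64_ref(uint64_t x)
{
    __uint128_t r = (__uint128_t)x * 0x9E3779B97F4A7C15ull;
    return (uint64_t)r ^ (uint64_t)(r >> 64);
}

/* Hypothetical portable variant: builds the low and high halves of the
   128-bit product from four 32x32 -> 64 partial products. */
uint64_t mulx64_portable(uint64_t x)
{
    const uint64_t k = 0x9E3779B97F4A7C15ull;
    uint64_t x_lo = (uint32_t)x, x_hi = x >> 32;
    uint64_t k_lo = (uint32_t)k, k_hi = k >> 32;

    uint64_t ll = x_lo * k_lo;
    uint64_t lh = x_lo * k_hi;
    uint64_t hl = x_hi * k_lo;
    uint64_t hh = x_hi * k_hi;

    /* Middle column (bits 32..63); whatever spills past bit 63
       is carried into the high half. */
    uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

    uint64_t lo = (mid << 32) | (uint32_t)ll;                 /* what mul leaves in rax */
    uint64_t hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32); /* what mul leaves in rdx */
    return lo ^ hi;
}
```

Both functions should return the same value for every input; the portable one is only meant to make the low/high split explicit, not to suggest better codegen.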
Issue 1:
When compiling with any optimization level above `-O0`, clang generates the following code:
```
mulx64(unsigned long): # @mulx64(unsigned long)
mov rax, rdi
movabs rcx, -7046029254386353131
mul rcx
xor rax, rdx
ret
```
This has a redundant move: the constant could be loaded directly into `rax` and multiplied by `rdi`, as in the listing above. Possibly a duplicate of [#62452](https://github.com/llvm/llvm-project/issues/62452).
Issue 2:
When using `-march=haswell` or newer, clang emits `mulx` instead of `mul`. The resulting code is one instruction longer, with no clear benefit to my untrained eyes. It looks to me like the optimal codegen shared above should also be optimal for Haswell and newer.
I am reporting both issues in the same bug report because they seem related enough. Let me know if you want me to split them into 2 bug reports instead.
</pre>