<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/108840>108840</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Don't hoist NOTs out of a loop when they will be folded into ANDN/ORN anyway (except sometimes on Arm)
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Validark
</td>
</tr>
</table>
<pre>
[Godbolt link](https://zig.godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXABx8BBAKoBnTAAUAHpwAMvAFYTStJg1AAvPMFJL6yAngGVG6AMKpaAVxYM9DgDJ4GmADl3ACNMYgkuUgAHVAVCWwZnNw89GLibAV9/IJZQ8K5Iy0xrBKECJmICJPdPQswrDIYyioIswJCwiItyyuqUwp7Wv3bczoKASgtUV2Jkdg5MVRjKgGoqBjXUVAgmEBXXADYAFlIV4L3Do/H945WAUgB2ACE7jQBBFc%2BVgDcKlYV7gBmAAiKyY9wejjOd0BL3eXx%2Bf2ImAIewAAgAVACeUUwAHkqBAFNcYaCNDC4W9Xh8vgB3BB0TArCAEYiuTAk57UhEItAMBQEFaEMJA0EKCncnmfZGCx6OUlCghhCXwqX/e4AJgOCoAflBhcRNS8NQBWZkGoHywFQrjjTlQ/VKw0wq1QjUkrUrHVMO0qmlSvBUZkA0kKjTXYLIpgAaz9CMewOpkulKJmGxlfoTHEmtE4Jt4ng4WlIqE4AC0zP9prMmXcNYCeKRUUXs5NoyATRoAHQATg0Gg1GoeB1NkgeXBNkVzHCOvBYEn7pELxdLHF4ChAGibmmzpDgsBgiBQqBYUUZZAoEDQp/PIGMrNcDGjfDoTo3EHOLdIwT8FSxnEbH9mGILE8WCbRimbRtrzYQQ8QYWh/y/LBglcYBHDEWgN24XgsBYQxgHEZC8GREpvkwbDi0WYpXCVADeD8JVp2LWg8EjP9nCwHcm2IPB5xwyYqAMYAFAANTwTBaTxXFC0bfhBBEMR2C4A4ZEERQVHUL9dA1fQCLvMx9DYjdIEmVAokabCAFo8RWAAlepMCYJQADFnMFKyemAFEwQqZAEGOKzWIYVxVBWKyWGQKJXFJJgoiiegAH0WEBXhUHI4heKwEyIEmIoSjsCAHD6Twh1IHxhhyPIQAeaJYniAQSpAMq0gahg2iqzpavyxpml6Fwama7rHIKppBg6jpwm6wYmrKgUWgm0Ypry6s5gkHM8wLbjVxWUxgBWbsuCOLsNCsrBvi7Cc6yedAjmCZBjg0QFJGZXBCBITUG3GXhmy0O1SAQJysHCXLSHbI4exOnseweB56weI4uA1SQTX0ThZ1Iech17Y4JwOB4e0kLgCaR3TlzSzh103bcW0mfcj2vM96AvShGdve82SfF9aDfShP2LIC/3o79fxAsCIOsYWYMYAh4MQ7iULQjDaCw4W8IIoji3wUibHIyjeGo5BaPmRtGPqbjWPYkDOPmYtWT4%2BjBOEsSJKkmThfk4RRHEFS1PkJQ1G43Qp30vajOCHKzIshJrNshz6GczA3IFcKvJ81RJAORLAuC0Lwsi6LSWMBgMox9Kwiyij4DykbGnsBgnAG/pyobxbqsiVrGiajv6saNuxgsWvShmpvakHhph4WyrJr0eb%2BuSMe56GbIZ9tKYZjWtfp3zJdts4XbK0O47TvOy6TWu277se57XvwIhnXrW0fp3f7AaYYHKA2mc5xAQFu2HSQkgNAPDHAcHsGojgHFRuTEslMLDU1%2BruemEAkBs2ZuQVmJ4madA5o%2BZ8NAeZhHfPzXggsQLCzIaBcCkEpYnlgrLBCSEtaYFQuhTC2FGzqyMJrXCJFIJ4D1txQ2xthZm2YrwS2xAOIYFtj9Xi/EeBOyYCJcSklpKMA9rIRSPtVKew0oHbSv89JGAMuYS2kcSzRwELHeyjlE5CHwirVOFRvKCl4goZA3wc5%2BDzhFKKMUQRxBoORRKqhkYrgypXCxPUEj10bgvCQukKoryWhIE4ncEjd10hkzI09UlHXHvwgQfUqij1nkPYp408nt26C0butTKj93yCtDeykv47xgTtPaB1LrHzOpgC6V0NQ3Tug9I4T0XoQDevfT6T8aZ/S/hjeciMuxageE9E0gI
IGSB7COMme81zwK3Ig/64NIYaGhrDeGiNkao2nKlXeX5VzP1pl/DUP8uCLk6XAk5kwMpxDsEcIAA)
This code:
```zig
export fn foo(a: u64, b: u64) u64 {
var s = a | b;
var ret: @TypeOf(s) = 0;
while (true) {
const iter = s;
ret |= iter;
s &= ~((iter +% (iter << 1)) | ((iter << 2) & ~a));
if (s == 0) break;
}
return ret;
}
```
Results in this emit for Zen 4:
```asm
foo:
or rsi, rdi
not rdi
xor eax, eax
.LBB0_1:
lea rdx, [4*rsi]
lea rcx, [rsi + 2*rsi]
or rax, rsi
and rdx, rdi; we could have just used `andn`
or rdx, rcx
andn rsi, rdx, rsi
jne .LBB0_1
ret
```
As you can see, `not rdi` is hoisted out of the loop, even though the `and` inside the loop could have been an `andn` that takes `rdi` directly, which would make the separate `not` unnecessary. The same thing happens on the SiFive X280 (aggressive unrolling disabled via a size-optimized build):
```asm
foo:
mv a2, a0
li a0, 0
or a1, a1, a2
not a2, a2
.LBB0_1:
slli a3, a1, 2
sh1add a4, a1, a1
and a3, a3, a2; could have used `andn`
or a0, a0, a1
or a3, a3, a4
andn a1, a1, a3
bnez a1, .LBB0_1
ret
```
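For the x86 and RISC-V cases, the codegen I'm asking for can be sketched at the source level. This is only an illustration: `andn` below is a hypothetical helper (not a Zig builtin) standing in for a single and-not instruction, and `fooAndn` / `fooOriginal` are names I made up for this sketch. The point is that both and-nots in the loop can consume `a` directly, so no hoisted `~a` is needed.

```zig
const std = @import("std");

// Hypothetical helper, not a builtin: models a single and-not
// instruction (x86 BMI1 `andn`, RISC-V Zbb `andn`), i.e. `x & ~y`.
fn andn(x: u64, y: u64) u64 {
    return x & ~y;
}

// The loop from the report, unchanged, used as a reference.
fn fooOriginal(a: u64, b: u64) u64 {
    var s = a | b;
    var ret: u64 = 0;
    while (true) {
        const iter = s;
        ret |= iter;
        s &= ~((iter +% (iter << 1)) | ((iter << 2) & ~a));
        if (s == 0) break;
    }
    return ret;
}

// Same loop with each `& ~x` spelled as one and-not operation.
fn fooAndn(a: u64, b: u64) u64 {
    var s = a | b;
    var ret: u64 = 0;
    while (true) {
        const iter = s;
        ret |= iter;
        // (iter << 2) & ~a, taking `a` directly instead of a hoisted `~a`.
        const t = andn(iter << 2, a);
        s = andn(s, (iter +% (iter << 1)) | t);
        if (s == 0) break;
    }
    return ret;
}

test "and-not form matches the original loop" {
    const inputs = [_]u64{ 0, 1, 2, 3, 0xF0, 0xDEADBEEF };
    for (inputs) |a| {
        for (inputs) |b| {
            try std.testing.expectEqual(fooOriginal(a, b), fooAndn(a, b));
        }
    }
}
```

This can be checked with `zig test` on the file. It says nothing about what the backend actually emits; it only shows that the loop body has exactly two and-not operations and neither requires `~a` to live in its own register.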
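However, on the Apple M3 it actually does make sense to hoist `mvn` out of the loop in this case: with `~a` already in a register, `and x11, x8, x9, lsl #2` folds the shift into the AND. As far as I can tell, `bic` cannot do the same job here, because its shift applies to the operand being inverted, so `bic x11, x8, x9, lsl #2` would compute `x8 & ~(x9 << 2)` rather than the `(x9 << 2) & ~x8` we need.
Apple M3 emit: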
```asm
foo:
mov x8, x0
mov x0, #0
orr x9, x1, x8
mvn x8, x8
.LBB0_1:
orr x0, x9, x0
add x10, x9, x9, lsl #1
and x11, x8, x9, lsl #2
orr x10, x11, x10
bics x9, x9, x10
b.ne .LBB0_1
ret
```
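To spell out the Arm operand-order point, here is a small Zig model of the two shifted-register forms. The semantics of `bicShifted2` reflect my reading of the A64 `BIC (shifted register)` description and should be treated as an assumption. Both helper names are made up for this sketch.

```zig
const std = @import("std");

// Models `and xd, xn, xm, lsl #2`: the shift applies to the second operand.
fn andShifted2(xn: u64, xm: u64) u64 {
    return xn & (xm << 2);
}

// Models `bic xd, xn, xm, lsl #2` as I understand it (assumption):
// the shifted operand is also the one that gets inverted.
fn bicShifted2(xn: u64, xm: u64) u64 {
    return xn & ~(xm << 2);
}

test "only the hoisted-mvn form matches (iter << 2) & ~a" {
    const a: u64 = 0b0100;
    const iter: u64 = 0b0011;
    const wanted = (iter << 2) & ~a;

    // With `~a` hoisted into a register, one `and ..., lsl #2` is enough.
    try std.testing.expectEqual(wanted, andShifted2(~a, iter));

    // `bic` with the same operands inverts the shifted value instead,
    // so it does not compute the expression the loop needs.
    try std.testing.expect(bicShifted2(a, iter) != wanted);
}
```

Under that assumption, not hoisting on Arm would cost a separate `lsl` plus a `bic` inside the loop, which is why the hoist looks like the right call there.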
</pre>