<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/142042>142042</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Performance: LLVM 20 aggressively optimizes popcnt and results in worse performance
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jabraham17
</td>
</tr>
</table>
<pre>
I am finding that the same microbenchmark, when compiled with clang 20, is much slower than when compiled with clang 19.
[This link](https://godbolt.org/z/5Md1dG9YG) has the full benchmark and the assembly for clang 19 and 20.
The LLVM 20 code is much longer, apparently because the LLVM 20 version is vectorized and does not use the `popcnt` instruction. For some reason, this is slower: the LLVM 20 version takes 0.15s, while the LLVM 19 version takes 0.05s.
Using the naive popcnt loop for the C version does seem to get pattern-matched better and results in LLVM 20 being just as fast, if not faster:
```
uint64_t c = 0;
while (n) {
n &= (n - 1);
c++;
}
return c;
```
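
For anyone wanting to try the naive loop in isolation, here is a minimal self-contained sketch. The function names `popcount_naive` and `popcount_builtin` are hypothetical and are not taken from the linked benchmark. The second function uses the GCC/Clang builtin `__builtin_popcountll` only as a reference to compare against; whether either one compiles down to a single `popcnt` instruction depends on the compiler version and target flags (e.g. `-mpopcnt`), so check the generated assembly rather than assuming.

```c
#include <stdint.h>

/* Naive bit-count loop from the snippet above, wrapped in a function.
 * Each iteration clears the lowest set bit, so the loop runs once per set bit.
 * The name is hypothetical, not from the linked benchmark. */
uint64_t popcount_naive(uint64_t n) {
  uint64_t c = 0;
  while (n) {
    n &= (n - 1);
    c++;
  }
  return c;
}

/* Reference implementation using the GCC/Clang builtin.
 * Assumption: used here only to cross-check results, not as the benchmark's code. */
uint64_t popcount_builtin(uint64_t n) {
  return (uint64_t)__builtin_popcountll(n);
}
```

Both functions should return the same count for every input, which makes it easy to swap one for the other in a benchmark and compare the assembly each compiler version emits.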
</pre>