<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/79823">79823</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            popcount 8-bit optimization opportunity (without HW instructions)
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Explorer09
      </td>
    </tr>
</table>

<pre>
    When compiling for an x86-64 processor without the POPCNT instruction, Clang generates inline code for `__builtin_popcount`. For input that is only 8 bits wide, I have a version that is smaller than what Clang currently generates. Would LLVM/Clang developers be interested in incorporating it?

```c
#include <stdint.h>

unsigned int popCount8(uint8_t x) {
    return (unsigned int)__builtin_popcount(x);
}
unsigned int popCount8_b(uint8_t x) {
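    // The algorithm is specific to 8-bit input and avoids 64-bit registers to keep code size down.
    // The multiply places copies of x at bit offsets 0, 9, 18 and 27.
    uint32_t n = (uint32_t)(x * 0x08040201U);
    // After the shift and mask, each bit of x sits alone in its own nibble;
    // the second multiply sums the eight nibbles into bits 28..31.
    n = (uint32_t)(((n >> 3) & 0x11111111U) * 0x11111111U) >> 28;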
    return n;
}
```
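
Since the input space is only 256 values, the claim that the multiply-based version matches a plain popcount can be checked exhaustively. The following is a minimal sketch of such a check, not part of the proposed change; the names `popcount8_ref`, `popcount8_mul`, `popcount8_mismatches` and `popcount8_self_check` are hypothetical and chosen only for illustration. The reference loop is used instead of `__builtin_popcount` so the check does not depend on the builtin under test.

```c
#include <assert.h>
#include <stdint.h>

/* Reference implementation: count set bits one at a time. */
unsigned int popcount8_ref(uint8_t x) {
    unsigned int n = 0;
    for (; x != 0; x >>= 1)
        n += x & 1u;
    return n;
}

/* Multiply-based version from the report (hypothetical name). */
unsigned int popcount8_mul(uint8_t x) {
    /* Copies of x land at bit offsets 0, 9, 18 and 27 (truncated to 32 bits). */
    uint32_t n = (uint32_t)x * 0x08040201U;
    /* Shift and mask so each input bit is alone in one nibble, then let the
       multiply add all eight nibbles into the top nibble (maximum value 8). */
    n = (uint32_t)(((n >> 3) & 0x11111111U) * 0x11111111U) >> 28;
    return n;
}

/* Returns how many of the 256 possible inputs give differing results. */
unsigned int popcount8_mismatches(void) {
    unsigned int bad = 0;
    for (unsigned int v = 0; v < 256; v++) {
        if (popcount8_mul((uint8_t)v) != popcount8_ref((uint8_t)v))
            bad++;
    }
    return bad;
}

/* Aborts (when assertions are enabled) if any input disagrees. */
void popcount8_self_check(void) {
    assert(popcount8_mismatches() == 0);
}
```

Calling `popcount8_self_check()` from a test `main` should return silently if the two versions agree on every input.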

Clang's assembly output (`-Os -mno-popcnt`):
```
popCount8:
 mov    eax,edi
 shr    al,1
 and    al,0x55
 sub    dil,al
 mov    eax,edi
 and    al,0x33
 shr    dil,0x2
 and    dil,0x33
 add    dil,al
 mov    eax,edi
 shr    al,0x4
 add    al,dil
 and    al,0xf
 movzx  eax,al
 ret
popCount8_b:
 imul   eax,edi,0x8040201
 shr    eax,0x3
 and    eax,0x11111111
 imul   eax,eax,0x11111111
 shr    eax,0x1c
 ret
```
</pre>