<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/129615>129615</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Optimize chains of `memcmp`
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Kmeakin
      </td>
    </tr>
</table>

<pre>
    Because C and C++ have no way to `switch` on strings, code often contains long if-else chains of (the moral equivalent of) `memcmp(s, "some string") == 0`. For example, [LLVM's own lexer](https://github.com/llvm/llvm-project/blob/main/llvm/lib/AsmParser/LLLexer.cpp#L493) compares every identifier against a few hundred different string literals to recognize keywords.

It would be nice if LLVM could optimize such chains into something more efficient, e.g. a trie:

```c++
auto src(std::string_view s) {
    if (s == "aaa") return 1;
    if (s == "abb") return 2;
    if (s == "bbb") return 3;
    if (s == "baa") return 4;

    ...

    return 0;
}

auto tgt(std::string_view s) {
    if (s[0] == 'a') {
        if (s[1] == 'a' && s[2] == 'a') return 1;
        if (s[1] == 'b' && s[2] == 'b') return 2;
    }
    if (s[0] == 'b') {
        if (s[1] == 'b' && s[2] == 'b') return 3;
        if (s[1] == 'a' && s[2] == 'a') return 4;
    }
    
    ...

    return 0;
}
```
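
For reference, here is a minimal self-contained sketch of the same idea, restricted to just the four literals shown above (the elided `...` cases are left out). The names `classify_chain`, `classify_trie` and `check_equivalence` are made up for illustration, and the explicit length check is an assumption added so that the indexed version never reads past the end of a short string. This is a hand-written approximation of what such a transform might produce, not a description of any existing LLVM pass.

```c++
#include <cassert>
#include <initializer_list>
#include <string_view>

// Baseline: the linear chain of whole-string comparisons.
constexpr int classify_chain(std::string_view s) {
    if (s == "aaa") return 1;
    if (s == "abb") return 2;
    if (s == "bbb") return 3;
    if (s == "baa") return 4;
    return 0;
}

// Trie-style dispatch: branch on the length once, then on the first
// character, and only then compare the remaining characters.
constexpr int classify_trie(std::string_view s) {
    if (s.size() != 3) return 0;  // assumption: all literals have length 3
    switch (s[0]) {
    case 'a':
        if (s[1] == 'a' && s[2] == 'a') return 1;
        if (s[1] == 'b' && s[2] == 'b') return 2;
        return 0;
    case 'b':
        if (s[1] == 'b' && s[2] == 'b') return 3;
        if (s[1] == 'a' && s[2] == 'a') return 4;
        return 0;
    default:
        return 0;
    }
}

// Checks that both versions agree on matching and non-matching inputs.
void check_equivalence() {
    for (std::string_view s : {"aaa", "abb", "bbb", "baa", "aab", "", "aaaa", "b"}) {
        assert(classify_chain(s) == classify_trie(s));
    }
}
```

Calling `check_equivalence()` exercises each literal plus a few near misses (wrong suffix, empty string, wrong length), which are the cases where a transformed version could most easily diverge from the original chain.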
</pre>