<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/56419">56419</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            clang 15 instruction count regression with `-O2`
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          firewave
      </td>
    </tr>
</table>

<pre>
    Compared to clang 14.0.6, I am seeing a slight regression in `Ir count` with clang 15 when running https://github.com/danmar/simplecpp under `valgrind --tool=callgrind`. I am aware that a higher instruction count does not automatically mean worse performance.

I was able to find a function affected by it (I removed some parameters and error handling code so it compiles standalone). With clang 14.0.6 I get a count of `710` per call and with clang 15 it is `721`.

```cpp
#include <string>
#include <istream>

static unsigned char readChar(std::istream &istr, unsigned int bom)
{
        unsigned char ch = static_cast<unsigned char>(istr.get());

        // For UTF-16 encoded files the BOM is 0xfeff/0xfffe. If the
        // character is a non-ASCII character, replace it with 0xff
        if (bom == 0xfeff || bom == 0xfffe) {
                const unsigned char ch2 = static_cast<unsigned char>(istr.get());
                const int ch16 = (bom == 0xfeff) ? (ch<<8 | ch2) : (ch2<<8 | ch);
                ch = static_cast<unsigned char>(((ch16 >= 0x80) ? 0xff : ch16));
        }

        // Handling of newlines..
        if (ch == '\r') {
                ch = '\n';
                if (bom == 0 && static_cast<char>(istr.peek()) == '\n')
                        (void)istr.get();
                else if (bom == 0xfeff || bom == 0xfffe) {
                        int c1 = istr.get();
                        int c2 = istr.get();
                        int ch16 = (bom == 0xfeff) ? (c1<<8 | c2) : (c2<<8 | c1);
                        if (ch16 != '\n') {
                                istr.unget();
                                istr.unget();
                        }
                }
        }

        return ch;
}

std::string readUntil(std::istream &istr, const char start, const char end, unsigned int bom)
{
    std::string ret;
    ret += start;

    bool backslash = false;
    char ch = 0;
    while (ch != end && ch != '\r' && ch != '\n' && istr.good()) {
        ch = readChar(istr, bom);
        if (backslash && ch == '\n') {
            ch = 0;
            backslash = false;
            continue;
        }
        backslash = false;
        ret += ch;
        if (ch == '\\') {
            bool update_ch = false;
            char next = 0;
            do {
                next = readChar(istr, bom);
                if (next == '\r' || next == '\n') {
                    ret.erase(ret.size()-1U);
                    backslash = (next == '\r');
                    update_ch = false;
                } else if (next == '\\')
                    update_ch = !update_ch;
                ret += next;
            } while (next == '\\');
            if (update_ch)
                ch = next;
        }
    }

    if (!istr.good() || ch != end) {
        return "";
    }

    return ret;
}
```

https://godbolt.org/z/vKxPvPK7K

The generated code is quite different in parts, and since I am not familiar with assembly I cannot tell whether that is good or bad. At first glance, there seem to be more `4-byte Spill` and related occurrences.

There are already differences in the generated code at `-O1`. The code at `-O0` is identical.

With clang 15 there's also this additional code at the end:

```
DW.ref.__gxx_personality_v0:
        .quad   __gxx_personality_v0
```
</pre>