<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/56419>56419</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
clang 15 instruction count regression with `-O2`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
firewave
</td>
</tr>
</table>
<pre>
Compared to clang 14.0.6, I am seeing a slight regression in the `Ir` count (instructions executed) when running https://github.com/danmar/simplecpp under `valgrind --tool=callgrind`. I am aware that a higher instruction count does not automatically lead to decreased performance.
I was able to find one function affected by it (I removed some parameters and error-handling code so that it compiles standalone). With clang 14.0.6 I get an `Ir` count of `710` per call, and with clang 15 it is `721`.
```cpp
#include <string>
#include <istream>
static unsigned char readChar(std::istream &istr, unsigned int bom)
{
unsigned char ch = static_cast<unsigned char>(istr.get());
// For UTF-16 encoded files the BOM is 0xfeff/0xfffe. If the
// character is non-ASCII character then replace it with 0xff
if (bom == 0xfeff || bom == 0xfffe) {
const unsigned char ch2 = static_cast<unsigned char>(istr.get());
const int ch16 = (bom == 0xfeff) ? (ch<<8 | ch2) : (ch2<<8 | ch);
ch = static_cast<unsigned char>(((ch16 >= 0x80) ? 0xff : ch16));
}
// Handling of newlines..
if (ch == '\r') {
ch = '\n';
if (bom == 0 && static_cast<char>(istr.peek()) == '\n')
(void)istr.get();
else if (bom == 0xfeff || bom == 0xfffe) {
int c1 = istr.get();
int c2 = istr.get();
int ch16 = (bom == 0xfeff) ? (c1<<8 | c2) : (c2<<8 | c1);
if (ch16 != '\n') {
istr.unget();
istr.unget();
}
}
}
return ch;
}
std::string readUntil(std::istream &istr, const char start, const char end, unsigned int bom)
{
std::string ret;
ret += start;
bool backslash = false;
char ch = 0;
while (ch != end && ch != '\r' && ch != '\n' && istr.good()) {
ch = readChar(istr, bom);
if (backslash && ch == '\n') {
ch = 0;
backslash = false;
continue;
}
backslash = false;
ret += ch;
if (ch == '\\') {
bool update_ch = false;
char next = 0;
do {
next = readChar(istr, bom);
if (next == '\r' || next == '\n') {
ret.erase(ret.size()-1U);
backslash = (next == '\r');
update_ch = false;
} else if (next == '\\')
update_ch = !update_ch;
ret += next;
} while (next == '\\');
if (update_ch)
ch = next;
}
}
if (!istr.good() || ch != end) {
return "";
}
return ret;
}
```
https://godbolt.org/z/vKxPvPK7K
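To repeat the per-call measurement outside of simplecpp, a small driver could call the function in a loop. The following is only a sketch and not part of simplecpp: `ReadFn` and `runMany` are hypothetical names, and the input string and iteration count are left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical alias: matches the signature of readUntil() from the reproducer.
using ReadFn = std::string (*)(std::istream &, char, char, unsigned int);

// Hypothetical helper: calls fn `iterations` times, each time on a fresh
// stream over `input`, and returns the summed result lengths so that the
// calls produce an observable value.
std::size_t runMany(ReadFn fn, const std::string &input, char start, char end,
                    unsigned int bom, int iterations)
{
    assert(fn != nullptr);
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        std::istringstream istr(input);
        total += fn(istr, start, end, bom).size();
    }
    return total;
}
```

With the reproducer in the same program, something like `runMany(readUntil, "abc\" tail", '"', '"', 0, 1000)` called from `main` and run under `valgrind --tool=callgrind` should show the function in the profile. Calling through a function pointer and constructing a `std::istringstream` per iteration both add instructions of their own, so the absolute numbers will likely differ from the ones quoted above.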
The generated code for the two compiler versions is quite different in parts, and since I have no clue about assembly I cannot tell whether that is a good or a bad thing. At first glance it seems like there are more `4-byte Spill` and related occurrences with clang 15.
There are already differences in the generated code at `-O1`. The code at `-O0` is identical.
With clang 15 there's also this additional code at the end:
```
DW.ref.__gxx_personality_v0:
.quad __gxx_personality_v0
```
</pre>