<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/64922>64922</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
clang[15/17]: memory optimization of a constant wide string leads to an incorrect length being returned at -O2.
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
NiharNihar
</td>
</tr>
</table>
<pre>
Consider a static const object that holds constant wide strings, and a wchar_t* pointer (wstring_ptr) that points to the start of one of those strings. If we compute the length of the string with wcslen(wstring_ptr + 1), the result differs between clang -O2 (returns the wrong value) and clang -O0 (returns the correct value).
If the test program below is compiled with:
clang15 -O2: due to memory optimization, wcslen returns the wrong wide string length (4 less than the actual length),
whereas with
clang15 -O0: it returns the correct wide string length.
With both clang15 and clang17, -O2 and -O0 produce different string lengths.
#include <stdio.h>
#include <wchar.h>
typedef const wchar_t* LPCWSTR;
typedef enum nihar_types_t {
NIHAR_TYPE_1 = 1,
NIHAR_TYPE_2 = 2,
NIHAR_TYPE_3 = 3,
NIHAR_TYPE_4 = 4,
NIHAR_TYPE_5 = 5,
NIHAR_TYPE_6 = 6,
NIHAR_TYPE_7 = 7,
NIHAR_TYPE_8 = 8,
NIHAR_TYPE_9 = 9,
NIHAR_TYPE_10 = 10,
NIHAR_TYPE_11 = 11,
NIHAR_TYPE_12 = 12,
NIHARE_TYPE_INVALID = 0,
} nihar_types_t;
typedef struct g_name_t {
nihar_types_t objtype;
int dirtype;
LPCWSTR name;
} g_name_t;
static const g_name_t on[] = {
{NIHAR_TYPE_1, 2, L"/nihar/test/dir1"},
{NIHAR_TYPE_2, 4, L"/nihar/test/dir1/dir3"},
{NIHAR_TYPE_3, 6, L"/nihar/test/dir1/dir3"},
{NIHAR_TYPE_4, 5, L"/nihar/test/dir5/dir4/dir1"},
{NIHAR_TYPE_5, 1, L"/nihar/dir1"}
};
int fun1(LPCWSTR ke)
{
int dirtype;
int errcode;
LPCWSTR tmp = ke;
int lt;
// word97
int objtype = 3;
dirtype = 1;
lt = (int)wcslen(tmp);
for (int i = 0; i < sizeof(on) / sizeof(on[0]); i++) {
int matched = 0;
LPCWSTR mask = on[i].name;
int lm;
lm = (int)wcslen(mask + 1);
if (lm > lt) {
continue;
}
if (wcsncmp(mask + 1, tmp + lt - lm, lm) == 0) {
matched = 1;
}
if (matched) {
if (wcscmp(ke, L"/nihar/dir1") == 0) {
objtype = NIHAR_TYPE_6;
dirtype = 0;
} else {
objtype = on[i].objtype;
dirtype = on[i].dirtype;
}
break;
}
}
return dirtype;
}
int main()
{
int ret = fun1(L"/nihar/dir1");
printf("Out : %d", ret);
return 0;
}
Analysis given below:
===================================================================
The compiler has optimized memory usage and tried to store "/nihar/dir1" in a smaller area (~12 bytes), on the assumption that the string never changes during its lifetime. Accessing the string directly causes no problem, because the compiler knows what it has done.
This is a compiler bug.
The bug shows up when we do:
LPCWSTR mask = on[i].name;
and then pass mask + 1 to wcslen:
lm = (int)wcslen(mask + 1);
Since sizeof(wchar_t) = 4,
the pointer (mask + 1) has moved 4 bytes ahead in the constant character array, and wcslen() is asked to calculate the length from there.
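To make the pointer arithmetic concrete, here is a minimal sketch that is not part of the attached test program. It assumes only standard C; step_bytes and tail_len are hypothetical helper names introduced for illustration. Adding 1 to a const wchar_t* should advance by sizeof(wchar_t) bytes (4 in this report), and the length measured from there should be exactly one less than the full length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes between s and s + 1. */
size_t step_bytes(const wchar_t *s)
{
    return (size_t)((const char *)(s + 1) - (const char *)s);
}

/* Hypothetical helper: length with the first character skipped,
   as in the report's wcslen(mask + 1). Assumes s is not empty. */
size_t tail_len(const wchar_t *s)
{
    return wcslen(s + 1);
}

#ifdef DEMO_MAIN
int main(void)
{
    const wchar_t *mask = L"/nihar/dir1";
    printf("sizeof(wchar_t) = %zu, step = %zu bytes\n",
           sizeof(wchar_t), step_bytes(mask));
    printf("wcslen(mask) = %zu, wcslen(mask + 1) = %zu\n",
           wcslen(mask), tail_len(mask));
    /* Skipping one character must shorten the string by exactly one. */
    assert(tail_len(mask) + 1 == wcslen(mask));
    return 0;
}
#endif
```

Building with -DDEMO_MAIN gives a standalone program. If the assertion fails only at -O2, that would point at the same wcslen(mask + 1) pattern as the test program.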
In the assembly code of the test program (which I wrote to simulate our code), the value compared in the general-purpose register (the length returned from wcslen) is wrong: it is always < 4, because the pointer moved 4 bytes ahead in the character string (all due to optimization).
This is a clang15 memory optimization bug; it also exists in clang17.
In the attached image of the test program, you can clearly see the two different results (1 for -O2 and 0 for -O0) when it is built and run with clang.
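One way to narrow this down might be to compare wcslen against a hand-written count for each table entry. The sketch below is an assumption-laden diagnostic, not part of the attached program: ref_wcslen and tail_len_agrees are names made up here, and the volatile reads are only expected, in practice, to stop the compiler from folding the loop. Counting by hand, the tail lengths (name + 1) of the five entries are 15, 20, 20, 25 and 10, and lt is 11 for the input "/nihar/dir1", so only the last entry should pass the lm > lt check. That is consistent with the 0 printed at -O0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical reference: counts wide characters by hand. The volatile
   reads are meant to keep the compiler from folding the loop away. */
size_t ref_wcslen(const wchar_t *s)
{
    const volatile wchar_t *p = s;
    size_t n = 0;
    while (p[n] != L'\0')
        n++;
    return n;
}

/* Returns 1 when wcslen(s + 1) matches the manual count.
   Assumes s is not empty. */
int tail_len_agrees(const wchar_t *s)
{
    return wcslen(s + 1) == ref_wcslen(s + 1);
}

#ifdef DEMO_MAIN
int main(void)
{
    /* Same strings as the on[] table in the test program. */
    static const wchar_t *const names[] = {
        L"/nihar/test/dir1",
        L"/nihar/test/dir1/dir3",
        L"/nihar/test/dir1/dir3",
        L"/nihar/test/dir5/dir4/dir1",
        L"/nihar/dir1",
    };
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        printf("entry %zu: wcslen = %zu, manual = %zu\n", i,
               wcslen(names[i] + 1), ref_wcslen(names[i] + 1));
        assert(tail_len_agrees(names[i]));
    }
    return 0;
}
#endif
```

Built with -DDEMO_MAIN at both -O0 and -O2, a mismatch between the two printed columns at -O2 would show which entry gets a wrong length.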
I have attached the test program, the assembly code, and the results.
[clangBug_programe.txt](https://github.com/llvm/llvm-project/files/12416473/clangBug_programe.txt)
![clang_memory_optimization_bug](https://github.com/llvm/llvm-project/assets/13232343/7d3ba7a6-867c-4809-8b42-ab3f66d860b5)
===================================================
</pre>