<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/64922>64922</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            clang 15/17: at -O2, memory optimization of a constant wide string causes wcslen to return an incorrect length.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          NiharNihar
      </td>
    </tr>
</table>

<pre>
    
Consider a static const object that holds a constant wide string, and a wchar_t* pointer (wstring_ptr) that points to the start of that string. If we compute the length with wcslen(wstring_ptr + 1), the result differs between optimization levels: clang -O2 returns a wrong value, whereas -O0 returns the correct value.
  
If the test program below is compiled with:
  clang15 -O2: due to memory optimization, wcslen returns a wrong wide string length [less than 4 instead of the actual length],
whereas with
  clang15 -O0: the correct wide string length is returned.

With both clang15 and clang17, -O2 and -O0 produce different string lengths.


#include <stdio.h>
#include <wchar.h>

typedef const wchar_t*  LPCWSTR;

typedef enum nihar_types_t {
    NIHAR_TYPE_1 = 1,
    NIHAR_TYPE_2 = 2,
    NIHAR_TYPE_3 = 3,
    NIHAR_TYPE_4 = 4,
    NIHAR_TYPE_5 = 5,
    NIHAR_TYPE_6 = 6,
    NIHAR_TYPE_7 = 7,
    NIHAR_TYPE_8 = 8,
    NIHAR_TYPE_9 = 9,
    NIHAR_TYPE_10 = 10,
    NIHAR_TYPE_11 = 11,
    NIHAR_TYPE_12 = 12,
    NIHARE_TYPE_INVALID = 0,
} nihar_types_t;

typedef struct g_name_t {
    nihar_types_t objtype;
    int dirtype;
    LPCWSTR name;
} o97_name_t;


/* The typedef name o97_name_t is used so the table is valid in both C and C++. */
static const o97_name_t on[] = {
    {NIHAR_TYPE_1, 2, L"/nihar/test/dir1"},
    {NIHAR_TYPE_2, 4, L"/nihar/test/dir1/dir3"},
    {NIHAR_TYPE_3, 6, L"/nihar/test/dir1/dir3"},
    {NIHAR_TYPE_4, 5, L"/nihar/test/dir5/dir4/dir1"},
    {NIHAR_TYPE_5, 1, L"/nihar/dir1"}
};



int fun1(LPCWSTR ke)
{
    int     dirtype;
    int     errcode;

    LPCWSTR tmp = ke;
    int lt;

    // word97
    int objtype = 3;
    dirtype = 1;
    lt = (int)wcslen(tmp);

    for (int i = 0; i < sizeof(on) / sizeof(on[0]); i++) {
        int matched = 0;
        LPCWSTR mask = on[i].name;
        int lm;

        lm = (int)wcslen(mask + 1);

        if (lm > lt) {
            continue;
        }

        if (wcsncmp(mask + 1, tmp + lt - lm, lm) == 0) {
            matched = 1;
        }

        if (matched) {
            if (wcscmp(ke, L"/nihar/dir1") == 0) {
                objtype = NIHAR_TYPE_6;
                dirtype = 0;
            } else {
                objtype = on[i].objtype;
                dirtype = on[i].dirtype;
            }
            break;
        }

    }
    return dirtype;
}

int main()
{
    int ret =  fun1(L"/nihar/dir1");
    printf("Out : %d", ret);
    return 0;
}

Analysis given below:
===================================================================
The compiler has optimized memory usage and tried to store "/nihar/dir1" in a smaller area [~12 bytes], on the assumption that the string never changes during its lifetime. Accessing the string directly causes no problem, because the compiler knows what it has done.
This is a compiler bug.

The bug shows up when we do
LPCWSTR mask = on[i].name;
and then pass mask + 1 to wcslen:
lm = (int)wcslen(mask + 1);

Since sizeof(wchar_t) = 4,
the pointer (mask + 1) has moved 4 bytes ahead in the constant character array, and wcslen() is asked to calculate the length from there.
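
The pointer arithmetic described above can be checked in isolation. The following is only a minimal sketch, not part of the attached test program; the helper names byte_step and len_after_first are hypothetical and introduced here for illustration. It shows that p + 1 advances by sizeof(wchar_t) bytes (4 on the platform described in this report) and that a correct wcslen(p + 1) is exactly one less than wcslen(p).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes between p and p + 1.
   For a wchar_t pointer this is sizeof(wchar_t) by definition. */
static size_t byte_step(const wchar_t *p)
{
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Hypothetical helper: length reported by wcslen once the first
   wide character is skipped, as done with mask + 1 in fun1(). */
static size_t len_after_first(const wchar_t *p)
{
    return wcslen(p + 1);
}
```

Calling len_after_first(L"/nihar/dir1") from a small main() at both -O0 and -O2 should give the same value; any difference between the two builds would point at the optimization rather than at the pointer arithmetic.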

If we look at the assembly code of the test program [which I wrote to simulate our code], the value compared in the general-purpose register is wrong: the length returned from wcslen is always < 4, because the pointer moved 4 bytes ahead in a character string [all due to optimization].

This is a clang15 memory optimization bug; the same bug also exists in clang17.

In the image of the test program, you can clearly see two different results [1 for -O2 and 0 for -O0] when it is built with clang and run.
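
To make it easier to see why 0 is the expected result, the matching loop can be reduced to the sketch below. It is an illustration under assumptions, not the attached program: the array paths and the helpers tail_length and first_suffix_match are hypothetical names, and only the five strings are copied from the on[] table. For the key "/nihar/dir1" (length 11), the tail lengths without the leading '/' are 15, 20, 20, 25 and 10, so only the last entry passes the lm > lt check and then matches, which is the branch that sets dirtype to 0 in fun1().

```c
#include <stddef.h>
#include <wchar.h>

/* Same five path strings as the on[] table in the test program. */
static const wchar_t *const paths[] = {
    L"/nihar/test/dir1",
    L"/nihar/test/dir1/dir3",
    L"/nihar/test/dir1/dir3",
    L"/nihar/test/dir5/dir4/dir1",
    L"/nihar/dir1",
};

/* Hypothetical helper: length of entry i without its leading '/'. */
static size_t tail_length(size_t i)
{
    return wcslen(paths[i] + 1);
}

/* Hypothetical helper: index of the first entry whose tail is a
   suffix of key, or -1 if no entry matches. */
static int first_suffix_match(const wchar_t *key)
{
    size_t lt = wcslen(key);

    for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
        size_t lm = tail_length(i);

        if (lm > lt)
            continue;               /* tail longer than key: cannot match */
        if (wcsncmp(paths[i] + 1, key + lt - lm, lm) == 0)
            return (int)i;
    }
    return -1;
}
```

Printing tail_length(i) for each entry in builds made with -O0 and -O2 would show directly whether the lengths themselves differ between the two optimization levels.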

I have attached the test program, the assembly code, and the result.
[clangBug_programe.txt](https://github.com/llvm/llvm-project/files/12416473/clangBug_programe.txt)

![clang_memory_optimization_bug](https://github.com/llvm/llvm-project/assets/13232343/7d3ba7a6-867c-4809-8b42-ab3f66d860b5)

===================================================



</pre>