<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/69294>69294</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Pointer Dereference Optimization Bug in Clang-18 on ARM64 Depending on Data Patterns at Different Optimization Levels
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          gyuminb
      </td>
    </tr>
</table>

<pre>
### **Description:**

When the provided PoC is compiled for ARM64 with Clang-18, a store through a dereferenced pointer appears to be optimized away. The program's output changes with the optimization level, and which levels are affected depends on both the data pattern being stored and the structure of the adjacent **`printf`** calls. For some data patterns, the issue is observed at optimization levels **`O1`** through **`O3`**. When the two identical **`printf`** calls before and after the problematic line are replaced with two distinct ones, the issue appears only at **`O3`**. This suggests that the optimization is influenced not just by the data pattern but also by the presence and structure of the adjacent print calls.

### **Environment:**

- **Compiler**: Clang-18
- **Target Architecture**: ARM64
- **Optimization Level**: The issue is noticeable at **`O1`**, **`O2`**, and **`O3`**, depending on the data pattern used. For patterns like **`0x123456789abcdeff`**, the issue can be observed from **`O1`** to **`O3`**, but for patterns like **`0x1234567fffffffff`**, it appears only at **`O3`**.
- **OS**: Ubuntu 22.04.2

### **PoC:**

```c
#include <stdio.h>
#include <stdint.h>

struct StructA {
   uint32_t val1;
   const int8_t  val2;
   uint64_t  val3;
   uint16_t val4;
};

union UnionB {
   uint32_t  u_val1;
   struct StructA  s_val;
   uint32_t  u_val2;
   int32_t   u_val3;
   int32_t u_val4;
   uint64_t  u_val5;
};

static union UnionB main_union = {1UL};
static uint32_t *ptr_val1 = &main_union.s_val.val1;
static uint32_t **double_ptr = &ptr_val1;

static uint32_t ***triple_ptr = &double_ptr;

int main() {
    printf("Before main_union.u_val5: %lx\n", main_union.u_val5);

    uint32_t **local_double_ptr = &ptr_val1;
    uint64_t local_val = 0x123456789abcdeffLL;
    uint64_t *local_ptr = &main_union.u_val5;

    (*local_ptr) = local_val;
    (triple_ptr = &local_double_ptr);
    (***triple_ptr) = 0UL;

    printf("After main_union.u_val5: %lx\n", main_union.u_val5);

    return 0;
}

```

### **Expected Behavior:**

The value of **`main_union.u_val5`** should be consistent across different optimization levels after the pointer dereference operation.

### **Observed Behavior:**

The value of **`main_union.u_val5`** changes depending on the optimization level, data patterns, and the structure of adjacent **`printf`** calls.

### **Analysis:**

The optimization seems to overlook the **`(***triple_ptr) = 0UL;`** operation. The discrepancy in output, depending on the structure of the **`printf`** calls and the data pattern, indicates a misoptimization during compilation. Notably, when the structure of the **`printf`** statements is changed or a data pattern with repeating digits is used, the issue appears only at the **`O3`** optimization level. The bug is therefore sensitive to both the data pattern and the surrounding code structure.

### **Steps to Reproduce:**

1. Compile the PoC code using Clang-18 on ARM64 with various optimization levels (**`O1`**, **`O2`**, and **`O3`**).
2. Execute the compiled binary.
3. Observe the inconsistent behavior dependent on optimization level, data patterns, and **`printf`** structure.

### **Evidence:**

The following output showcases the behavior for various optimization levels:

```

O0 Output:
main_union.u_val5: 1
main_union.u_val5: 1234567800000000

O1 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

O2 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

O3 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff

```

What's intriguing is that when we replace two identical **`printf`** calls before and after the problematic line with two distinct **`printf`** calls, such as:

```c
printf("Before main_union.u_val5: %lx\n", main_union.u_val5);
```

and

```c
printf("After main_union.u_val5: %lx\n", main_union.u_val5);
```

the issue only manifests at **`O3`** optimization level.

### **Conclusion:**

Across optimization levels **`O1`** to **`O3`**, there is clear evidence of a bug that likely results from an incorrect compiler optimization. The specific scenarios in which it emerges, especially when the **`printf`** structure or the data pattern is altered, underline how unpredictable the issue is. It needs attention to ensure consistent and correct behavior across all optimization levels.
</pre>