<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/69294>69294</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Pointer Dereference Optimization Bug in Clang-18 on ARM64 Depending on Data Patterns at Different Optimization Levels
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
gyuminb
</td>
</tr>
</table>
<pre>
### **Description:**
When the PoC below is compiled for ARM64 with Clang-18, a store through a chain of pointers appears to be optimized away. The observed result depends on the optimization level, on the data pattern being stored, and on the form of the adjacent **`printf`** calls. For some data patterns, the issue is observed at every level from **`O1`** to **`O3`**. When the two identical **`printf`** calls before and after the problematic line are replaced with two distinct ones, the issue appears only at **`O3`**. This suggests that the optimization is influenced not just by the data pattern but also by the presence and structure of the adjacent print calls.
### **Environment:**
- **Compiler**: Clang-18
- **Target Architecture**: ARM64
- **Optimization Level**: This issue is noticeable at **`O1`**, **`O2`**, and **`O3`** depending on the data patterns used. For patterns like **`0x123456789abcdeff`**, the issue can be observed from **`O1`** to **`O3`**, but for patterns like **`0x1234567fffffffff`**, it appears only at **`O3`**.
- **OS**: Ubuntu 22.04.2
### **PoC:**
```c
#include <stdio.h>
#include <stdint.h>
struct StructA {
    uint32_t val1;
    const int8_t val2;
    uint64_t val3;
    uint16_t val4;
};
union UnionB {
    uint32_t u_val1;
    struct StructA s_val;
    uint32_t u_val2;
    int32_t u_val3;
    int32_t u_val4;
    uint64_t u_val5;
};
static union UnionB main_union = {1UL};
static uint32_t *ptr_val1 = &main_union.s_val.val1;
static uint32_t **double_ptr = &ptr_val1;
static uint32_t ***triple_ptr = &double_ptr;
int main() {
    printf("Before main_union.u_val5: %lx\n", main_union.u_val5);
    uint32_t **local_double_ptr = &ptr_val1;
    uint64_t local_val = 0x123456789abcdeffLL;
    uint64_t *local_ptr = &main_union.u_val5;
    /* Store the full 64-bit pattern into the union. */
    (*local_ptr) = local_val;
    (triple_ptr = &local_double_ptr);
    /* Problematic line: 32-bit store to s_val.val1, which overlaps u_val5. */
    (***triple_ptr) = 0UL;
    printf("After main_union.u_val5: %lx\n", main_union.u_val5);
    return 0;
}
```
### **Expected Behavior:**
The value of **`main_union.u_val5`** should be consistent across different optimization levels after the pointer dereference operation.
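For reference, the expected value can be computed independently of the pointer chain. The following is a minimal sketch, assuming a little-endian target (as on typical ARM64 Linux systems). `clear_first_word` is a hypothetical helper that is not part of the PoC; it uses `memcpy` to overwrite the first four bytes of a 64-bit value with zero, which is what the 32-bit store through `***triple_ptr` is expected to do to `u_val5`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the PoC): zero the first 4 bytes in memory
 * of a 64-bit value, mimicking the 32-bit store that overlaps u_val5.
 * Which half of the value those bytes hold depends on endianness. */
uint64_t clear_first_word(uint64_t value) {
    const uint32_t zero = 0;
    memcpy(&value, &zero, sizeof zero); /* overwrite bytes 0..3 only */
    return value;
}
```

On a little-endian target, `clear_first_word(0x123456789abcdeffULL)` yields `0x1234567800000000`, which matches the `O0` output listed under Evidence.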
### **Observed Behavior:**
The value of **`main_union.u_val5`** changes depending on the optimization level, data patterns, and the structure of adjacent **`printf`** calls.
### **Analysis:**
The optimizer seems to overlook the **`(***triple_ptr) = 0UL;`** store. The discrepancy in output, which depends on the structure of the **`printf`** calls and on the data pattern, indicates a misoptimization during compilation. Notably, when the structure of the **`printf`** statements is changed or a data pattern with repeating digits is used, the issue appears only at the **`O3`** optimization level. The bug is therefore sensitive to both the data pattern and the surrounding code structure.
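The expectation that this store must change `u_val5` rests on the union layout: `s_val.val1` and `u_val5` both start at offset 0 of `main_union`. A minimal sketch that checks this with `offsetof` follows; `val1_offset` and `val1_overlaps_u_val5` are hypothetical helpers introduced for illustration, and the type definitions are copied from the PoC.

```c
#include <stddef.h>
#include <stdint.h>

/* Types copied from the PoC. */
struct StructA {
    uint32_t val1;
    const int8_t val2;
    uint64_t val3;
    uint16_t val4;
};
union UnionB {
    uint32_t u_val1;
    struct StructA s_val;
    uint32_t u_val2;
    int32_t u_val3;
    int32_t u_val4;
    uint64_t u_val5;
};

/* Hypothetical helper: byte offset of s_val.val1 from the start of the union. */
size_t val1_offset(void) {
    return offsetof(union UnionB, s_val) + offsetof(struct StructA, val1);
}

/* Hypothetical helper: 1 if val1 lies entirely inside the bytes of u_val5,
 * which itself starts at offset 0 like every union member. */
int val1_overlaps_u_val5(void) {
    return val1_offset() + sizeof(uint32_t) <= sizeof(uint64_t);
}
```

Here `val1_offset()` is 0, so a 4-byte store to `val1` lands on the first four bytes of `u_val5`.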
### **Steps to Reproduce:**
1. Compile the PoC code using Clang-18 on ARM64 with various optimization levels (**`O1`**, **`O2`**, and **`O3`**).
2. Execute the compiled binary.
3. Observe the inconsistent behavior dependent on optimization level, data patterns, and **`printf`** structure.
### **Evidence:**
The following output showcases the behavior for various optimization levels:
```
O0 Output:
main_union.u_val5: 1
main_union.u_val5: 1234567800000000
O1 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
O2 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
O3 Output:
main_union.u_val5: 1
main_union.u_val5: 123456789abcdeff
```
What's intriguing is that when we replace two identical **`printf`** calls before and after the problematic line with two distinct **`printf`** calls, such as:
```c
printf("Before main_union.u_val5: %lx\n", main_union.u_val5);
```
and
```c
printf("After main_union.u_val5: %lx\n", main_union.u_val5);
```
the issue only manifests at **`O3`** optimization level.
### **Conclusion:**
Across optimization levels **`O1`** to **`O3`**, there is clear evidence of a bug that likely results from an incorrect compiler optimization. The specific conditions under which it emerges, in particular changes to the **`printf`** structure or the data pattern, make the behavior hard to predict. This bug needs attention to ensure consistent and correct behavior across all optimization levels.
</pre>