<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/55864>55864</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Non-UTF8 output with `-fsanitize=undefined`
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          tgeorg-ethz
      </td>
    </tr>
</table>

<pre>
    For a university project in which we test the reliability of various sanitizers in clang, we came across a case in which the _UndefinedBehaviorSanitizer_ produces non-UTF8 output. Specifically, when we compile a particular C source file with `-fsanitize=undefined` and then run it, the output produced by the sanitizer contains non-UTF8 characters.

Screenshot of the error in action:

![screenshot](https://user-images.githubusercontent.com/55834264/171940991-536f9213-3284-4248-b43c-ebbcc3cb2b33.png)

Newest version we found to be affected:

```
Ubuntu clang version 15.0.0-++20220530052901+b7d2b160c3ba-1~exp1~20220530172952.268
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```


<details>

<summary>Older versions also affected</summary>

```
clang version 10.0.0-4ubuntu1 
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```
and
```
clang version 13.0.0 (Fedora 13.0.0-3.fc35)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```
and
```
clang version 14.0.0 (Fedora 14.0.0-1.fc36)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```



</details>


The C source code used:

```c
#include <stdint.h>
uint64_t a = 4073709551615;
static uint64_t b[2][2][9] = {&a, &a};
int16_t c;
uint8_t d;
uint64_t e;
int32_t f = 5;
int16_t g(int16_t, uint8_t *, uint32_t, int8_t, int64_t);
uint16_t i() { int32_t j = g(0, d, j, e, 0); }
int16_t g(int16_t k, uint8_t *l, uint32_t m, int8_t n, int64_t o) {
  for (;; c = c + 1)
    b[k][f][c] = &a;
}
void main() { i(); }
```

Compiled with the command `clang -fsanitize=undefined test.c`


We also found that simply changing the value of the global variable `a` from

```c
uint64_t a = 4073709551615;
```
to
```c
uint64_t a = 1;
```
causes the non-UTF8 output to no longer show up:

![screenshot](https://user-images.githubusercontent.com/55834264/171941059-c3122192-e6ce-4471-a778-0d8fec5cef5a.png)
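
To confirm the problem independently of how a terminal renders the bytes, the captured sanitizer output can be checked for well-formed UTF-8. The following is only a minimal sketch: `is_valid_utf8` is a hypothetical helper written for illustration and is not part of clang or compiler-rt. It assumes the output has already been read into a byte buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of clang or compiler-rt):
 * returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len) {
  size_t i = 0;
  while (i < len) {
    unsigned char b = buf[i];
    size_t n;     /* number of continuation bytes expected */
    uint32_t cp;  /* code point decoded so far */
    uint32_t min; /* smallest code point allowed for this sequence length */

    if (b < 0x80) { /* plain ASCII byte */
      i++;
      continue;
    } else if ((b & 0xE0) == 0xC0) {
      n = 1; cp = b & 0x1F; min = 0x80;
    } else if ((b & 0xF0) == 0xE0) {
      n = 2; cp = b & 0x0F; min = 0x800;
    } else if ((b & 0xF8) == 0xF0) {
      n = 3; cp = b & 0x07; min = 0x10000;
    } else {
      return 0; /* stray continuation byte or invalid lead byte */
    }

    if (len - i <= n)
      return 0; /* sequence is cut off at the end of the buffer */

    for (size_t k = 1; k <= n; k++) {
      unsigned char c = buf[i + k];
      if ((c & 0xC0) != 0x80)
        return 0; /* expected a continuation byte of the form 10xxxxxx */
      cp = (cp << 6) | (c & 0x3F);
    }

    if (cp < min)
      return 0; /* overlong encoding */
    if (cp > 0x10FFFF)
      return 0; /* beyond the Unicode range */
    if (cp >= 0xD800 && cp <= 0xDFFF)
      return 0; /* UTF-16 surrogate, not allowed in UTF-8 */

    i += n + 1;
  }
  return 1;
}
```

For example, the program's stderr could be redirected to a file (`./a.out 2> ubsan.log`) and the file's contents passed to this helper; a return value of 0 would indicate that the output contains bytes that are not valid UTF-8.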
</pre>