<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/138526>138526</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[clang] Warning for comparing `char8_t` to `char32_t`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Eisenwave
</td>
</tr>
</table>
<pre>
Consider the following code:
```cpp
bool contains_oe(std::u8string_view str) {
    for (char8_t c : str)
        if (c == U'ö') // comparison always fails
            return true;
    return false;
}
```
If `str` is a correctly encoded UTF-8 string, the comparison always fails: `ö` is U+00F6, which UTF-8 encodes as the two code units `0xC3 0xB6`, and no valid UTF-8 code unit can have the value `0xF6`. Comparing `charN_t` values with different `N` is virtually always a bug, or at best something that could just as well have been written with a literal of the matching type. Such comparisons only give meaningful results for U+007F and below, and even then it's unclear why you wouldn't use the proper type.
I've floated the idea of deprecating this behavior in the C++ standard in a number of places, and it was received positively. StackOverflow users also suggested getting rid of it here: https://stackoverflow.com/q/79604433/5740428
**In the meantime, it would be useful to have a warning when `charN_t` is converted to a different Unicode character type.** This warning should be triggered for any implicit conversion, not just as part of a comparison, because the same bug can be produced like this:
```cpp
bool contains_char(std::u8string_view str, char8_t c);
// ...
contains_char(str, U'ö'); // U'ö' is implicitly converted to char8_t
```
</pre>