<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/97741>97741</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            -Winvalid-token-paste fails to catch UCNs which are invalid preprocessor tokens
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          jeffgarrett
      </td>
    </tr>
</table>

<pre>
    Consider ([godbolt link](https://godbolt.org/z/c1sGTM6WK)):

```cpp
#define X \\

#define U(x) x ## u0000
#define Y(x) U(x)
#define Z(x) #x
#define W(x) Z(x)

const char str[] = W(Y(X));
```

clang preprocesses it to `const char str[] = "\u0000";`, but the UCN `\u0000` is invalid as a preprocessing token per [\[lex.pptoken/2\]](https://eel.is/c++draft/lex.pptoken#2). Accepting it silently is still conforming, because pasting two tokens into an invalid preprocessing token is undefined behavior per [\[cpp.concat/3\]](https://eel.is/c++draft/cpp.concat#3), but `-Winvalid-token-paste` does not diagnose it.

(At least that's how I interpret it... The former allows it in the grammar production, and declares it ill-formed if the production would be matched. Is a character matching this production "valid" as a preprocessing token, as it is used in the latter? I think it could also be read that this is a valid preprocessing token and can thus be produced fleetingly by pasting but would be ill-formed if written directly.)

Implementations diverge here. The godbolt link shows that gcc rejects the pasted result with the same error it gives when the UCN is written directly in the source.
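
For readers less familiar with the idiom, the reproducer relies on two levels of macro indirection so that the paste is performed before the result is stringized. Here is a minimal sketch of the same pattern with a well-defined paste. The macro names `PASTE`, `STRINGIZE`, and `EXPAND_THEN_STRINGIZE` are hypothetical and chosen for illustration; they are not part of the report.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper macros for illustration only.
// PASTE joins two tokens; the result must itself be a single valid
// preprocessing token, otherwise the behavior is undefined.
#define PASTE(a, b) a ## b
// STRINGIZE turns its argument into a string literal exactly as spelled,
// without macro-expanding it first.
#define STRINGIZE(x) #x
// The extra level lets the argument be fully macro-expanded before it
// reaches the # operator (this is the role of W and Y in the report).
#define EXPAND_THEN_STRINGIZE(x) STRINGIZE(x)

// Well-defined paste: `foo` ## `42` forms the identifier `foo42`.
const char pasted[] = EXPAND_THEN_STRINGIZE(PASTE(foo, 42));

// Without the extra level the macro call itself is stringized.
const char unexpanded[] = STRINGIZE(PASTE(foo, 42));
```

Here `pasted` holds `"foo42"` and `unexpanded` holds `"PASTE(foo, 42)"`. In the reproducer the same mechanism pastes `\` and `u0000`, which is the case where the compilers disagree.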
</pre>