<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/64161">64161</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Assertion in lexer for Unicode bidirectional text support
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
r0ller
</td>
</tr>
</table>
<pre>
When LLVM and Clang are built with the Debug build type, the front end crashes with an assertion failure when compiling this one-liner with clang:
int A\N{LEFT-TO-RIGHT OVERRIDE};
I attached the stack dump (crash.txt), but for some reason the preprocessed sources don't get generated.
I also debugged the issue and looked at LiteralSupport.cpp, which contains expandUCNs(), the function where the assertion fails. The same file has a function called ProcessNamedUCNEscape(), which seems to handle named escapes like the one that triggers the crash. After some checks, that function calls nameToCodepointStrict(), whereas the code just before the failing assert calls nameToCodepointLooseMatching(), which returns nullopt. That gave me the idea to call nameToCodepointStrict() first and to call nameToCodepointLooseMatching() only if the strict lookup returns nothing. The order could be reversed, but judging by the names, trying the stricter function first seemed logical. This seems to solve the problem; at least the code point returned (202D) matches what is listed in the [Unicode names list](https://www.unicode.org/Public/15.0.0/ucd/NamesList.txt).
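A minimal sketch of the call order described above, not the actual change to expandUCNs(). Everything here is hypothetical: the Lookup struct, the LookupFn alias, lookupStrictThenLoose() and the two stubs stand in for nameToCodepointStrict() and nameToCodepointLooseMatching(), whose real signatures are not reproduced. The loose stub always fails only to mimic the nullopt seen in the debugger.

```cpp
// Hypothetical result type for this sketch. The real lookups appear to
// return an optional value (the loose one returned nullopt in the debugger).
struct Lookup {
  bool Found;
  char32_t CodePoint;
};

// Signature shared by both stand-in lookups in this sketch.
using LookupFn = Lookup (*)(const char *Name);

// Try the strict lookup first; call the loose lookup only when the strict
// one finds nothing.
Lookup lookupStrictThenLoose(const char *Name, LookupFn Strict,
                             LookupFn Loose) {
  Lookup R = Strict(Name);
  if (R.Found)
    return R;
  return Loose(Name);
}

// Exact, character-by-character comparison of two C strings.
bool sameName(const char *A, const char *B) {
  while (*A != '\0') {
    if (*A != *B)
      return false;
    ++A;
    ++B;
  }
  return *B == '\0';
}

// Stub strict lookup: knows a single name, spelled exactly.
Lookup stubStrict(const char *Name) {
  if (sameName(Name, "LEFT-TO-RIGHT OVERRIDE"))
    return {true, 0x202D};
  return {false, 0};
}

// Stub loose lookup: always fails, mimicking the observed nullopt.
Lookup stubLoose(const char *) { return {false, 0}; }
```

With these stubs, lookupStrictThenLoose("LEFT-TO-RIGHT OVERRIDE", stubStrict, stubLoose) reports the name as found with code point 202D, while an unknown name still falls through to the loose lookup and comes back as not found.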
Does this solution make sense? Shall I go ahead and submit a patch?
[crash.txt](https://github.com/llvm/llvm-project/files/12185078/crash.txt)
</pre>