[PATCH] D125059: [Lex] Don't assert when decoding invalid UCNs.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 5 16:45:01 PDT 2022


sammccall created this revision.
sammccall added a reviewer: hokein.
Herald added a project: All.
sammccall requested review of this revision.
Herald added projects: clang, clang-tools-extra.
Herald added a subscriber: cfe-commits.

Currently, if a lexically valid UCN encodes an invalid codepoint, we
diagnose that and then hit an assertion while trying to decode it.

Since nothing prevents us from reaching this state, remove the assertion
and skip the invalid codepoint instead. expandUCNs("X\UAAAAAAAAY") will
now produce "XY".
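
To illustrate the "skip on failure" behavior outside of clang, here is a
minimal standalone sketch. encodeUTF8 is a hypothetical stand-in for
llvm::ConvertCodePointToUTF8, not the LLVM implementation; it is assumed to
reject surrogates and values above U+10FFFF. appendCodePoint mirrors the
shape of the patched helper, using std::string in place of SmallVectorImpl.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for llvm::ConvertCodePointToUTF8 (an assumption, not
// the real implementation): writes the UTF-8 encoding of CP at Out, advances
// Out, and returns false without writing when CP is a surrogate or lies
// above U+10FFFF.
bool encodeUTF8(uint32_t CP, char *&Out) {
  if (CP > 0x10FFFF || (CP >= 0xD800 && CP <= 0xDFFF))
    return false;
  if (CP < 0x80) {
    *Out++ = static_cast<char>(CP);
  } else if (CP < 0x800) {
    *Out++ = static_cast<char>(0xC0 | (CP >> 6));
    *Out++ = static_cast<char>(0x80 | (CP & 0x3F));
  } else if (CP < 0x10000) {
    *Out++ = static_cast<char>(0xE0 | (CP >> 12));
    *Out++ = static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    *Out++ = static_cast<char>(0x80 | (CP & 0x3F));
  } else {
    *Out++ = static_cast<char>(0xF0 | (CP >> 18));
    *Out++ = static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
    *Out++ = static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    *Out++ = static_cast<char>(0x80 | (CP & 0x3F));
  }
  return true;
}

// Same shape as the patched helper: append the encoding only when the
// conversion succeeds, and silently drop the codepoint otherwise.
void appendCodePoint(uint32_t CP, std::string &Str) {
  char Buf[4];
  char *Ptr = Buf;
  if (encodeUTF8(CP, Ptr))
    Str.append(Buf, Ptr);
}
```

With this sketch, appending 0xAAAAAAAA between "X" and "Y" leaves "XY",
matching the expandUCNs example above, while a valid codepoint such as
U+0F00 is appended as its three-byte UTF-8 encoding.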


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125059

Files:
  clang-tools-extra/pseudo/test/crash/bad-ucn.c
  clang/lib/Lex/LiteralSupport.cpp
  clang/test/Lexer/unicode.c


Index: clang/test/Lexer/unicode.c
===================================================================
--- clang/test/Lexer/unicode.c
+++ clang/test/Lexer/unicode.c
@@ -28,6 +28,9 @@
 
         int _;
 
+extern int X\UAAAAAAAA; // expected-error {{not allowed in an identifier}}
+int Y = '\UAAAAAAAA'; // expected-error {{invalid universal character}}
+
 #ifdef __cplusplus
 
 extern int ༀ;
Index: clang/lib/Lex/LiteralSupport.cpp
===================================================================
--- clang/lib/Lex/LiteralSupport.cpp
+++ clang/lib/Lex/LiteralSupport.cpp
@@ -320,10 +320,8 @@
                             llvm::SmallVectorImpl<char> &Str) {
   char ResultBuf[4];
   char *ResultPtr = ResultBuf;
-  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
-  (void)Res;
-  assert(Res && "Unexpected conversion failure");
-  Str.append(ResultBuf, ResultPtr);
+  if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))
+    Str.append(ResultBuf, ResultPtr);
 }
 
 void clang::expandUCNs(SmallVectorImpl<char> &Buf, StringRef Input) {
Index: clang-tools-extra/pseudo/test/crash/bad-ucn.c
===================================================================
--- /dev/null
+++ clang-tools-extra/pseudo/test/crash/bad-ucn.c
@@ -0,0 +1,4 @@
+// This UCN doesn't encode a valid codepoint.
+// We used to assert while trying to expand UCNs in the token.
+// RUN: clang-pseudo -source=%s
+A\UAAAAAAAA


