[clang] 8175509 - [Lex] Don't assert when decoding invalid UCNs.

Sam McCall via cfe-commits cfe-commits at lists.llvm.org
Thu May 5 23:52:01 PDT 2022


Author: Sam McCall
Date: 2022-05-06T08:51:42+02:00
New Revision: 817550919e78ba9bb8336685fe1f40e4f650b2e4

URL: https://github.com/llvm/llvm-project/commit/817550919e78ba9bb8336685fe1f40e4f650b2e4
DIFF: https://github.com/llvm/llvm-project/commit/817550919e78ba9bb8336685fe1f40e4f650b2e4.diff

LOG: [Lex] Don't assert when decoding invalid UCNs.

Currently, if a lexically valid UCN encodes an invalid codepoint, we
diagnose it and then hit an assertion while trying to decode it.

Since nothing prevents us from reaching this state, remove the
assertion. An invalid codepoint is now silently dropped from the output,
so expandUCNs("X\UAAAAAAAAY") will produce "XY".
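The new behaviour can be illustrated with a small standalone sketch. It does not use LLVM: encodeUTF8 below is a hypothetical stand-in for llvm::ConvertCodePointToUTF8, assumed to return false and write nothing for values outside the Unicode range or in the surrogate range. appendCodePointSketch and buildDemo are likewise illustrative names, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for llvm::ConvertCodePointToUTF8 (an assumption, not
// the LLVM implementation): writes the UTF-8 bytes of CodePoint at Out and
// advances Out. Returns false, writing nothing, for an invalid codepoint.
inline bool encodeUTF8(std::uint32_t CodePoint, char *&Out) {
  if (CodePoint > 0x10FFFF || (CodePoint >= 0xD800 && CodePoint <= 0xDFFF))
    return false;
  if (CodePoint < 0x80) {
    *Out++ = static_cast<char>(CodePoint);
  } else if (CodePoint < 0x800) {
    *Out++ = static_cast<char>(0xC0 | (CodePoint >> 6));
    *Out++ = static_cast<char>(0x80 | (CodePoint & 0x3F));
  } else if (CodePoint < 0x10000) {
    *Out++ = static_cast<char>(0xE0 | (CodePoint >> 12));
    *Out++ = static_cast<char>(0x80 | ((CodePoint >> 6) & 0x3F));
    *Out++ = static_cast<char>(0x80 | (CodePoint & 0x3F));
  } else {
    *Out++ = static_cast<char>(0xF0 | (CodePoint >> 18));
    *Out++ = static_cast<char>(0x80 | ((CodePoint >> 12) & 0x3F));
    *Out++ = static_cast<char>(0x80 | ((CodePoint >> 6) & 0x3F));
    *Out++ = static_cast<char>(0x80 | (CodePoint & 0x3F));
  }
  return true;
}

// Mirrors the shape of the patched appendCodePoint: append the encoded bytes
// only if the conversion succeeded; otherwise leave Str untouched.
inline void appendCodePointSketch(std::uint32_t CodePoint, std::string &Str) {
  char ResultBuf[4];
  char *ResultPtr = ResultBuf;
  if (encodeUTF8(CodePoint, ResultPtr))
    Str.append(ResultBuf, ResultPtr);
}

// Builds "X", then the codepoint 0xAAAAAAAA, then "Y", the way the
// expandUCNs example in the log message would.
inline std::string buildDemo() {
  std::string S = "X";
  appendCodePointSketch(0xAAAAAAAAu, S); // invalid: nothing is appended
  S += 'Y';
  return S;
}
```

Under these assumptions, buildDemo() yields "XY", while a valid codepoint such as U+00A9 is still appended as its two-byte UTF-8 encoding. Dropping the codepoint is safe here because the invalid UCN has already been diagnosed before decoding.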

Differential Revision: https://reviews.llvm.org/D125059

Added: 
    

Modified: 
    clang/lib/Lex/LiteralSupport.cpp
    clang/test/Lexer/unicode.c

Removed: 
    


################################################################################
diff --git a/clang/lib/Lex/LiteralSupport.cpp b/clang/lib/Lex/LiteralSupport.cpp
index 6e6fd361ebf94..9a30a41c851d7 100644
--- a/clang/lib/Lex/LiteralSupport.cpp
+++ b/clang/lib/Lex/LiteralSupport.cpp
@@ -320,10 +320,8 @@ static void appendCodePoint(unsigned Codepoint,
                             llvm::SmallVectorImpl<char> &Str) {
   char ResultBuf[4];
   char *ResultPtr = ResultBuf;
-  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
-  (void)Res;
-  assert(Res && "Unexpected conversion failure");
-  Str.append(ResultBuf, ResultPtr);
+  if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))
+    Str.append(ResultBuf, ResultPtr);
 }
 
 void clang::expandUCNs(SmallVectorImpl<char> &Buf, StringRef Input) {

diff --git a/clang/test/Lexer/unicode.c b/clang/test/Lexer/unicode.c
index f67b55415f960..b0cc28cfb915a 100644
--- a/clang/test/Lexer/unicode.c
+++ b/clang/test/Lexer/unicode.c
@@ -28,6 +28,9 @@ CHECK : The preprocessor should not complain about Unicode characters like ©.
 
         int _;
 
+extern int X\UAAAAAAAA; // expected-error {{not allowed in an identifier}}
+int Y = '\UAAAAAAAA'; // expected-error {{invalid universal character}}
+
 #ifdef __cplusplus
 
 extern int ༀ;


        

