[PATCH] D125059: [Lex] Don't assert when decoding invalid UCNs.
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 5 16:45:01 PDT 2022
sammccall created this revision.
sammccall added a reviewer: hokein.
Herald added a project: All.
sammccall requested review of this revision.
Herald added projects: clang, clang-tools-extra.
Herald added a subscriber: cfe-commits.
Currently, if a lexically valid UCN encodes an invalid codepoint, we
diagnose it and then hit an assertion while trying to decode it.
Since nothing prevents us from reaching this state, remove the
assertion and skip the codepoint instead. With this change,
expandUCNs("X\UAAAAAAAAY") will produce "XY".
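
To illustrate the intended behavior outside of clang, here is a minimal
standalone sketch. It does not use LLVM: encodeUTF8 is a hypothetical
stand-in for llvm::ConvertCodePointToUTF8, assumed here to reject surrogates
and values above U+10FFFF, and appendCodePoint works on std::string rather
than SmallVectorImpl<char>.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for llvm::ConvertCodePointToUTF8 (an assumption,
// not the LLVM implementation): returns false, writing nothing, for
// surrogates and for values above U+10FFFF.
bool encodeUTF8(std::uint32_t CP, std::string &Out) {
  if (CP > 0x10FFFF || (CP >= 0xD800 && CP <= 0xDFFF))
    return false;
  if (CP < 0x80) {
    Out += static_cast<char>(CP);
  } else if (CP < 0x800) {
    Out += static_cast<char>(0xC0 | (CP >> 6));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else if (CP < 0x10000) {
    Out += static_cast<char>(0xE0 | (CP >> 12));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (CP >> 18));
    Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  }
  return true;
}

// Mirrors the shape of the patched helper: an invalid codepoint is
// silently dropped instead of tripping an assertion.
void appendCodePoint(std::uint32_t CP, std::string &Out) {
  encodeUTF8(CP, Out);
}
```

In this sketch, appending 0xAAAAAAAA between "X" and "Y" leaves "XY", which
matches the expandUCNs result described above. The diagnostic itself is
emitted elsewhere, so dropping the codepoint here does not hide the error.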
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D125059
Files:
clang-tools-extra/pseudo/test/crash/bad-ucn.c
clang/lib/Lex/LiteralSupport.cpp
clang/test/Lexer/unicode.c
Index: clang/test/Lexer/unicode.c
===================================================================
--- clang/test/Lexer/unicode.c
+++ clang/test/Lexer/unicode.c
@@ -28,6 +28,9 @@
 int _;
+extern int X\UAAAAAAAA; // expected-error {{not allowed in an identifier}}
+int Y = '\UAAAAAAAA'; // expected-error {{invalid universal character}}
+
 #ifdef __cplusplus
 extern int ༀ;
Index: clang/lib/Lex/LiteralSupport.cpp
===================================================================
--- clang/lib/Lex/LiteralSupport.cpp
+++ clang/lib/Lex/LiteralSupport.cpp
@@ -320,10 +320,8 @@
                             llvm::SmallVectorImpl<char> &Str) {
   char ResultBuf[4];
   char *ResultPtr = ResultBuf;
-  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
-  (void)Res;
-  assert(Res && "Unexpected conversion failure");
-  Str.append(ResultBuf, ResultPtr);
+  if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))
+    Str.append(ResultBuf, ResultPtr);
 }
 void clang::expandUCNs(SmallVectorImpl<char> &Buf, StringRef Input) {
Index: clang-tools-extra/pseudo/test/crash/bad-ucn.c
===================================================================
--- /dev/null
+++ clang-tools-extra/pseudo/test/crash/bad-ucn.c
@@ -0,0 +1,4 @@
+// This UCN doesn't encode a valid codepoint.
+// We used to assert while trying to expand UCNs in the token.
+// RUN: clang-pseudo -source=%s
+A\UAAAAAAAA