[clang] Fix a crash with empty escape sequences when lexing (PR #102339)

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Wed Aug 7 10:38:32 PDT 2024


https://github.com/AaronBallman created https://github.com/llvm/llvm-project/pull/102339

The utilities we use for lexing string and character literals can be run in a mode where we pass a null pointer for the diagnostics engine. This mode is used by the format string checkers, for example. However, two places failed to account for a null diagnostics engine pointer: the handling of the empty delimited escape sequences `\o{}` and `\x{}`.

This patch adds a null pointer check in both places and sets `HadError` whether or not a diagnostics engine is present, so the error is still recorded when diagnostics are disabled.

Fixes #102218
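
As a rough illustration of the pattern described above, here is a minimal, self-contained sketch. It is not Clang's code: `DiagSink` and `parseDelimitedHex` are hypothetical names invented for this example, and the control flow is simplified (for instance, it returns early on an empty escape, which is not meant to mirror `ProcessCharEscape`). The point it shows is that the error flag is set unconditionally, while the diagnostic is emitted only when the sink pointer is non-null.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostics engine; NOT Clang's DiagnosticsEngine.
struct DiagSink {
  std::vector<std::string> Messages;
  void report(const std::string &Msg) { Messages.push_back(Msg); }
};

// Hypothetical helper: parses the text that follows "\x" in a delimited hex
// escape, e.g. "{41}". Diags may be null, meaning "lex silently". HadError is
// set on every failure path regardless of whether Diags is null.
unsigned parseDelimitedHex(const char *Begin, const char *End, DiagSink *Diags,
                           bool &HadError) {
  assert(Begin <= End && "invalid character range");
  unsigned Result = 0;

  if (Begin == End || *Begin != '{') {
    HadError = true;
    if (Diags) // guard every use of the possibly-null sink
      Diags->report("expected '{'");
    return Result;
  }
  ++Begin;

  if (Begin != End && *Begin == '}') {
    HadError = true; // record the error even when diagnostics are disabled
    if (Diags)
      Diags->report("delimited escape sequence cannot be empty");
    return Result;
  }

  for (; Begin != End && *Begin != '}'; ++Begin) {
    char C = *Begin;
    unsigned Digit;
    if (C >= '0' && C <= '9')
      Digit = static_cast<unsigned>(C - '0');
    else if (C >= 'a' && C <= 'f')
      Digit = static_cast<unsigned>(C - 'a') + 10;
    else if (C >= 'A' && C <= 'F')
      Digit = static_cast<unsigned>(C - 'A') + 10;
    else {
      HadError = true;
      if (Diags)
        Diags->report("invalid hex digit");
      return Result;
    }
    Result = Result * 16 + Digit;
  }

  if (Begin == End) { // ran off the end without seeing '}'
    HadError = true;
    if (Diags)
      Diags->report("missing '}'");
  }
  return Result;
}
```

Calling `parseDelimitedHex` on the two-character input `{}` with a null `Diags` sets `HadError` and does not dereference the pointer; passing a real `DiagSink` additionally records one message.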

From 6bce9b55346cda0eee5bdb1f713c6042d89628e0 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Wed, 7 Aug 2024 13:35:17 -0400
Subject: [PATCH] Fix a crash with empty escape sequences when lexing

The utilities we use for lexing string and character literals can be
run in a mode where we pass a null pointer for the diagnostics engine.
This mode is used by the format string checkers, for example. However,
there were two places that failed to account for a null diagnostic
engine pointer: `\o{}` and `\x{}`.

This patch adds a check for a null pointer and correctly handles
fallback behavior.

Fixes #102218
---
 clang/docs/ReleaseNotes.rst               |  2 ++
 clang/lib/Lex/LiteralSupport.cpp          | 14 ++++++++------
 clang/test/Lexer/char-escapes-delimited.c | 13 +++++++++++++
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 978b4ac8ea2e37..565812e9b503f5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -170,6 +170,8 @@ Bug Fixes in This Version
   be used in C++.
 - Fixed a failed assertion when checking required literal types in C context. (#GH101304).
 - Fixed a crash when trying to transform a dependent address space type. Fixes #GH101685.
+- Fixed a crash when diagnosing format strings and encountering an empty
+  delimited escape sequence (e.g., ``"\o{}"``). #GH102218
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/Lex/LiteralSupport.cpp b/clang/lib/Lex/LiteralSupport.cpp
index 9d2720af5dbd92..225a6c2d15baaa 100644
--- a/clang/lib/Lex/LiteralSupport.cpp
+++ b/clang/lib/Lex/LiteralSupport.cpp
@@ -190,9 +190,10 @@ static unsigned ProcessCharEscape(const char *ThisTokBegin,
       Delimited = true;
       ThisTokBuf++;
       if (*ThisTokBuf == '}') {
-        Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
-             diag::err_delimited_escape_empty);
-        return ResultChar;
+        HadError = true;
+        if (Diags)
+          Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
+               diag::err_delimited_escape_empty);
       }
     } else if (ThisTokBuf == ThisTokEnd || !isHexDigit(*ThisTokBuf)) {
       if (Diags)
@@ -283,9 +284,10 @@ static unsigned ProcessCharEscape(const char *ThisTokBegin,
     Delimited = true;
     ++ThisTokBuf;
     if (*ThisTokBuf == '}') {
-      Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
-           diag::err_delimited_escape_empty);
-      return ResultChar;
+      HadError = true;
+      if (Diags)
+        Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
+             diag::err_delimited_escape_empty);
     }
 
     while (ThisTokBuf != ThisTokEnd) {
diff --git a/clang/test/Lexer/char-escapes-delimited.c b/clang/test/Lexer/char-escapes-delimited.c
index 5327ef700b0e25..7a8986bc5f8679 100644
--- a/clang/test/Lexer/char-escapes-delimited.c
+++ b/clang/test/Lexer/char-escapes-delimited.c
@@ -123,3 +123,16 @@ void separators(void) {
 static_assert('\N??<DOLLAR SIGN??>' == '$'); // expected-warning 2{{trigraph converted}} \
                                              // ext-warning {{extension}} cxx23-warning {{C++23}}
 #endif
+
+void GH102218(void) {
+  // The format specifier checking code runs the lexer with diagnostics
+  // disabled. This used to crash Clang for malformed \o and \x because the
+  // lexer missed a null pointer check for the diagnostics engine in that case.
+  extern int printf(const char *, ...);
+  printf("\o{}"); // expected-error {{delimited escape sequence cannot be empty}}
+  printf("\x{}"); // expected-error {{delimited escape sequence cannot be empty}}
+
+  // These cases always worked but are here for completeness.
+  printf("\u{}"); // expected-error {{delimited escape sequence cannot be empty}}
+  printf("\N{}"); // expected-error {{delimited escape sequence cannot be empty}}
+}


