[clang] 979d0ee - [clang] fix out of bounds access in an empty string when lexing a _Pragma with missing string token

Alex Lorenz via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 2 11:16:16 PST 2022


Author: Alex Lorenz
Date: 2022-02-02T11:16:11-08:00
New Revision: 979d0ee8ab30a175220af3b39a6df7d56de9d2c8

URL: https://github.com/llvm/llvm-project/commit/979d0ee8ab30a175220af3b39a6df7d56de9d2c8
DIFF: https://github.com/llvm/llvm-project/commit/979d0ee8ab30a175220af3b39a6df7d56de9d2c8.diff

LOG: [clang] fix out of bounds access in an empty string when lexing a _Pragma with missing string token

The lexer can crash with an out-of-bounds string access when it lexes a _Pragma whose string
token refers to an invalid buffer, e.g. when the module header file that supplied the macro
expansion for that token has been deleted from the file system.

Differential Revision: https://reviews.llvm.org/D116052
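
For readers without a clang checkout, the guard added in Handle_Pragma can be sketched in isolation. This is a minimal illustration, not the clang implementation: getSpellingSketch and destringizeSketch are hypothetical names, a null pointer stands in for the invalid buffer, and the destringizing step is reduced to stripping the quotes of a plain "..." literal.

```cpp
#include <string>

// Hypothetical stand-in for a spelling lookup that can fail. On failure it
// returns an empty string and sets *Invalid, loosely mirroring the
// getSpelling(StrTok, &Invalid) call in the patch.
std::string getSpellingSketch(const char *Buffer, bool *Invalid) {
  if (!Buffer) {
    if (Invalid)
      *Invalid = true;
    return std::string();
  }
  return std::string(Buffer);
}

// Hypothetical, simplified destringizer: strips the surrounding quotes of a
// plain "..." literal into Out. Returns false instead of indexing into a
// spelling that could not be retrieved or is too short to be a literal.
bool destringizeSketch(const char *Buffer, std::string &Out) {
  bool Invalid = false;
  std::string StrVal = getSpellingSketch(Buffer, &Invalid);
  if (Invalid)
    return false; // the patch reports err__Pragma_malformed at this point
  if (StrVal.size() < 2 || StrVal.front() != '"' || StrVal.back() != '"')
    return false;
  Out = StrVal.substr(1, StrVal.size() - 2);
  return true;
}
```

The point of the sketch is only the ordering: the validity flag is checked before any character of the spelling is inspected, so an empty result from a failed lookup never reaches the indexing code.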

Added: 
    clang/test/Preprocessor/pragma-missing-string-token.c

Modified: 
    clang/lib/Frontend/PrintPreprocessedOutput.cpp
    clang/lib/Lex/Pragma.cpp

Removed: 
    


################################################################################
diff  --git a/clang/lib/Frontend/PrintPreprocessedOutput.cpp b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
index 1d0022bda474c..e2fc862849ad1 100644
--- a/clang/lib/Frontend/PrintPreprocessedOutput.cpp
+++ b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
@@ -189,7 +189,8 @@ class PrintPPOutputPPCallbacks : public PPCallbacks {
   bool MoveToLine(const Token &Tok, bool RequireStartOfLine) {
     PresumedLoc PLoc = SM.getPresumedLoc(Tok.getLocation());
     unsigned TargetLine = PLoc.isValid() ? PLoc.getLine() : CurLine;
-    bool IsFirstInFile = Tok.isAtStartOfLine() && PLoc.getLine() == 1;
+    bool IsFirstInFile =
+        Tok.isAtStartOfLine() && PLoc.isValid() && PLoc.getLine() == 1;
     return MoveToLine(TargetLine, RequireStartOfLine) || IsFirstInFile;
   }
 

diff  --git a/clang/lib/Lex/Pragma.cpp b/clang/lib/Lex/Pragma.cpp
index eb7e7cbc47140..eb370e8a0ecd6 100644
--- a/clang/lib/Lex/Pragma.cpp
+++ b/clang/lib/Lex/Pragma.cpp
@@ -263,7 +263,12 @@ void Preprocessor::Handle_Pragma(Token &Tok) {
   }
 
   SourceLocation RParenLoc = Tok.getLocation();
-  std::string StrVal = getSpelling(StrTok);
+  bool Invalid = false;
+  std::string StrVal = getSpelling(StrTok, &Invalid);
+  if (Invalid) {
+    Diag(PragmaLoc, diag::err__Pragma_malformed);
+    return;
+  }
 
   // The _Pragma is lexically sound.  Destringize according to C11 6.10.9.1:
   // "The string literal is destringized by deleting any encoding prefix,

diff  --git a/clang/test/Preprocessor/pragma-missing-string-token.c b/clang/test/Preprocessor/pragma-missing-string-token.c
new file mode 100644
index 0000000000000..5f40b2f4fdb97
--- /dev/null
+++ b/clang/test/Preprocessor/pragma-missing-string-token.c
@@ -0,0 +1,27 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -emit-module -x c -fmodules -I %t/Inputs -fmodule-name=aa %t/Inputs/module.modulemap -o %t/aa.pcm
+// RUN: rm %t/Inputs/b.h
+// RUN: not %clang_cc1 -E -fmodules -I %t/Inputs -fmodule-file=%t/aa.pcm %s -o - -fallow-pcm-with-compiler-errors 2>&1 | FileCheck %s
+
+//--- Inputs/module.modulemap
+module aa {
+    header "a.h"
+    header "b.h"
+}
+
+//--- Inputs/a.h
+#define TEST(x) x
+
+//--- Inputs/b.h
+#define SUB "mypragma"
+
+//--- test.c
+#include "a.h"
+
+_Pragma(SUB);
+int a = TEST(SUB);
+
+// CHECK: int a
+// CHECK: 1 error generated


        

