[llvm] [FIX] Fix undefined-behaviour in regex engine. (PR #73071)

via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 30 19:38:02 PST 2023


https://github.com/tanmaysachan updated https://github.com/llvm/llvm-project/pull/73071

>From c983585f2e2c6369d9e2e818df36284024be4bd7 Mon Sep 17 00:00:00 2001
From: tanmaysachan <tnmysachan at gmail.com>
Date: Wed, 22 Nov 2023 08:09:08 +0530
Subject: [PATCH] Fix undefined-behaviour in regex engine.

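- Running the regex engine on a StringRef whose data pointer is null (for example, a default-constructed StringRef) triggers the "Applying non-zero offset to null pointer" UB.
- The bug was found by the "mlir-text-parser-fuzzer" module.
- This patch adds a check in Regex::match that replaces a null string with an empty string, and adds a corresponding test.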
---
 llvm/lib/Support/Regex.cpp           | 4 ++++
 llvm/unittests/Support/RegexTest.cpp | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/llvm/lib/Support/Regex.cpp b/llvm/lib/Support/Regex.cpp
index 8fa71a749cc8e10..5eedf95c48e3784 100644
--- a/llvm/lib/Support/Regex.cpp
+++ b/llvm/lib/Support/Regex.cpp
@@ -92,6 +92,10 @@ bool Regex::match(StringRef String, SmallVectorImpl<StringRef> *Matches,
 
   unsigned nmatch = Matches ? preg->re_nsub+1 : 0;
 
+  // Update null string to empty string.
+  if (String.data() == nullptr)
+    String = "";
+
   // pmatch needs to have at least one element.
   SmallVector<llvm_regmatch_t, 8> pm;
   pm.resize(nmatch > 0 ? nmatch : 1);
diff --git a/llvm/unittests/Support/RegexTest.cpp b/llvm/unittests/Support/RegexTest.cpp
index e3c721b466c6ccd..09f674bb209c079 100644
--- a/llvm/unittests/Support/RegexTest.cpp
+++ b/llvm/unittests/Support/RegexTest.cpp
@@ -225,3 +225,10 @@ TEST_F(RegexTest, OssFuzz3727Regression) {
 }
 
 }
+
+TEST_F(RegexTest, NullStringInput) {
+  Regex r("^$");
+  // A default-constructed StringRef has a null data pointer.
+  StringRef String;
+  EXPECT_TRUE(r.match(String));
+}
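
The guard added to Regex::match can be illustrated outside LLVM. The sketch below is only an approximation: it uses std::string_view as a stand-in for llvm::StringRef (an assumption, so that it compiles without LLVM headers), and normalizeNullInput and nullInputIsNormalized are hypothetical names that do not exist in the tree.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not part of LLVM): mirrors the guard the patch adds
// to Regex::match, with std::string_view standing in for llvm::StringRef.
inline std::string_view normalizeNullInput(std::string_view S) {
  // A default-constructed view has data() == nullptr and size() == 0.
  // Rebinding it to "" gives later pointer arithmetic a valid base address.
  if (S.data() == nullptr)
    S = "";
  return S;
}

// Returns true when a null view is turned into a non-null, empty view.
inline bool nullInputIsNormalized() {
  std::string_view Null; // data() == nullptr
  std::string_view Fixed = normalizeNullInput(Null);
  return Null.data() == nullptr && Fixed.data() != nullptr && Fixed.empty();
}
```

Non-null inputs pass through unchanged, so a guard of this shape only affects the null-data case that the NullStringInput test exercises.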


