[llvm] ae0c232 - [Regex] Check two chars in step back optimization (NFC)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 18 06:45:43 PST 2023


Author: Nikita Popov
Date: 2023-01-18T15:45:34+01:00
New Revision: ae0c232e46c8d8b4fc3d0e007094db216d7a82f6

URL: https://github.com/llvm/llvm-project/commit/ae0c232e46c8d8b4fc3d0e007094db216d7a82f6
DIFF: https://github.com/llvm/llvm-project/commit/ae0c232e46c8d8b4fc3d0e007094db216d7a82f6.diff

LOG: [Regex] Check two chars in step back optimization (NFC)

When stepping back to a position where the following match starts
with a fixed character, also check whether the next fixed character
matches, if there is one.

For our tests the next fixed character is often " ", which occurs
pretty frequently, so checking a second character is worthwhile in
practice.

This drops FileCheck runtime for the vloxseg.c test from 25s to
17s for me.
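
As a rough illustration of the idea (a stand-alone sketch, not the
regengine.inc code): the function name step_back_sketch and the
parameters have_next and next_ch are hypothetical stand-ins for the
lookup into g->strip that the real step_back performs.

```c
/*
 * Hypothetical sketch of the two-character check; names and parameters
 * are illustrative and do not exist in regengine.inc.
 *
 * Scans backwards from `res` towards `start` for a position holding `ch`.
 * If `have_next` is nonzero, a position is accepted only when the byte
 * after it equals `next_ch`, or when there is no byte after it before
 * `stop`. If nothing is accepted, `start` is returned.
 */
static const char *
step_back_sketch(const char *start, const char *stop, const char *res,
                 char ch, int have_next, char next_ch)
{
	for (; res != start; --res) {
		if (*res == ch) {
			const char *next = res + 1;
			/* Accept if there is no second fixed character to check,
			 * no input left to check it against, or it matches. */
			if (!have_next || next >= stop || *next == next_ch)
				break;
		}
	}
	return res;
}
```

With the input "x = y =z", stepping back from the final 'z' for '='
followed by ' ' skips the '=' at offset 6 (it is followed by 'z') and
stops at offset 2. Without the second-character check the scan would
stop at offset 6, and the caller would have to attempt a match there
that cannot succeed.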

Added: 
    

Modified: 
    llvm/lib/Support/regengine.inc

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Support/regengine.inc b/llvm/lib/Support/regengine.inc
index b32392a86120..681b7145e442 100644
--- a/llvm/lib/Support/regengine.inc
+++ b/llvm/lib/Support/regengine.inc
@@ -317,8 +317,14 @@ step_back(struct re_guts *g, const char *start, const char *stop, sopno startst,
 	/* Find the character that starts the following match. */
 	char ch = OPND(g->strip[startst]);
 	for (; res != start; --res) {
-		if (*res == ch)
-			break;
+		if (*res == ch) {
+			/* Try to check the next fixed character as well. */
+			sopno nextst = startst + 1;
+			const char *next = res + 1;
+			if (nextst >= stopst || OP(g->strip[nextst]) != OCHAR || next >= stop ||
+					*next == (char)OPND(g->strip[nextst]))
+				break;
+    }
 	}
 	return res;
 }


        

