[llvm] f654493 - Make check-llvm run 50% faster on macOS, 18% faster on Windows.

Nico Weber via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 09:57:55 PST 2020


Author: Nico Weber
Date: 2020-01-06T12:57:42-05:00
New Revision: f6544934b94932f1d2231238046f83ba8b083040

URL: https://github.com/llvm/llvm-project/commit/f6544934b94932f1d2231238046f83ba8b083040
DIFF: https://github.com/llvm/llvm-project/commit/f6544934b94932f1d2231238046f83ba8b083040.diff

LOG: Make check-llvm run 50% faster on macOS, 18% faster on Windows.

While looking at cycle time graphs of some of my bots, I noticed
that 327894859cc made check-llvm noticeably slower on macOS and
Windows.

As it turns out, the 5 substitutions added in that change were
enough to cause lit to thrash the build-in cache in re.compile()
(re.sub() is implemented as re.compile().sub()), and apparently
applySubstitutions() is on the cricital path and slow when all
regexes need to compile all the time.

(See `_MAXCACHE = 512` in cpython/Lib/re.py)

Supporting full regexes for lit substitutions seems a bit like
overkill, but for now add a simple unbounded cache to recover
the lost performance.

No intended behavior change.

Added: 
    

Modified: 
    llvm/utils/lit/lit/TestRunner.py

Removed: 
    


################################################################################
diff  --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
index d572af273de4..96411c98ee33 100644
--- a/llvm/utils/lit/lit/TestRunner.py
+++ b/llvm/utils/lit/lit/TestRunner.py
@@ -1112,11 +1112,16 @@ def regex_escape(s):
         s = s.replace('&', '\&')
         return s
     substitutions.extend([
-            ('%{/s:regex_replacement}', regex_escape(sourcepath.replace('\\', '/'))),
-            ('%{/S:regex_replacement}', regex_escape(sourcedir.replace('\\', '/'))),
-            ('%{/p:regex_replacement}', regex_escape(sourcedir.replace('\\', '/'))),
-            ('%{/t:regex_replacement}', regex_escape(tmpBase.replace('\\', '/')) + '.tmp'),
-            ('%{/T:regex_replacement}', regex_escape(tmpDir.replace('\\', '/'))),
+            ('%{/s:regex_replacement}',
+             regex_escape(sourcepath.replace('\\', '/'))),
+            ('%{/S:regex_replacement}',
+             regex_escape(sourcedir.replace('\\', '/'))),
+            ('%{/p:regex_replacement}',
+             regex_escape(sourcedir.replace('\\', '/'))),
+            ('%{/t:regex_replacement}',
+             regex_escape(tmpBase.replace('\\', '/')) + '.tmp'),
+            ('%{/T:regex_replacement}',
+             regex_escape(tmpDir.replace('\\', '/'))),
             ])
 
     # "%:[STpst]" are normalized paths without colons and without a leading
@@ -1130,6 +1135,18 @@ def regex_escape(s):
             ])
     return substitutions
 
+def _memoize(f):
+    cache = {}  # Intentionally unbounded, see applySubstitutions()
+    def memoized(x):
+        if x not in cache:
+            cache[x] = f(x)
+        return cache[x]
+    return memoized
+
+ at _memoize
+def _caching_re_compile(r):
+    return re.compile(r)
+
 def applySubstitutions(script, substitutions):
     """Apply substitutions to the script.  Allow full regular expression syntax.
     Replace each matching occurrence of regular expression pattern a with
@@ -1139,7 +1156,14 @@ def processLine(ln):
         for a,b in substitutions:
             if kIsWindows:
                 b = b.replace("\\","\\\\")
-            ln = re.sub(a, b, ln)
+            # re.compile() has a built-in LRU cache with 512 entries. In some
+            # test suites lit ends up thrashing that cache, which made e.g.
+            # check-llvm run 50% slower.  Use an explicit, unbounded cache
+            # to prevent that from happening.  Since lit is fairly
+            # short-lived, since the set of substitutions is fairly small, and
+            # since thrashing has such bad consequences, not bounding the cache
+            # seems reasonable.
+            ln = _caching_re_compile(a).sub(b, ln)
 
         # Strip the trailing newline and any extra whitespace.
         return ln.strip()


        


More information about the llvm-commits mailing list