[PATCH] D72273: Make check-llvm run 50% faster on macOS, 18% faster on Windows.

Nico Weber via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 08:00:16 PST 2020


thakis created this revision.
thakis added a reviewer: rnk.
Herald added a subscriber: delcypher.
Herald added a project: LLVM.
thakis marked an inline comment as done.
thakis added inline comments.


================
Comment at: llvm/utils/lit/lit/TestRunner.py:1124
+            ('%{/T:regex_replacement}',
+             regex_escape(tmpDir.replace('\\', '/'))),
             ])
----------------
(This is just a reflowing, for PEP-8 complicance.)


While looking at cycle time graphs of some of my bots, I noticed
that 327894859cc <https://reviews.llvm.org/rG327894859cc41c1730807f8a179aa880203262f5> made check-llvm noticeably slower on macOS and
Windows.

As it turns out, the 5 substitutions added in that change were
enough to cause lit to thrash the build-in cache in re.compile()
(re.sub() is implemented as re.compile().sub()), and apparently
applySubstitutions() is on the cricital path and slow when all
regexes need to compile all the time.

(See `_MAXCACHE = 512` in cpython/Lib/re.py)

Supporting full regexes for lit substitutions seems a bit like
overkill, but for now add a simple unbounded cache to recover
the lost performance.

No intended behavior change.


https://reviews.llvm.org/D72273

Files:
  llvm/utils/lit/lit/TestRunner.py


Index: llvm/utils/lit/lit/TestRunner.py
===================================================================
--- llvm/utils/lit/lit/TestRunner.py
+++ llvm/utils/lit/lit/TestRunner.py
@@ -1112,11 +1112,16 @@
         s = s.replace('&', '\&')
         return s
     substitutions.extend([
-            ('%{/s:regex_replacement}', regex_escape(sourcepath.replace('\\', '/'))),
-            ('%{/S:regex_replacement}', regex_escape(sourcedir.replace('\\', '/'))),
-            ('%{/p:regex_replacement}', regex_escape(sourcedir.replace('\\', '/'))),
-            ('%{/t:regex_replacement}', regex_escape(tmpBase.replace('\\', '/')) + '.tmp'),
-            ('%{/T:regex_replacement}', regex_escape(tmpDir.replace('\\', '/'))),
+            ('%{/s:regex_replacement}',
+             regex_escape(sourcepath.replace('\\', '/'))),
+            ('%{/S:regex_replacement}',
+             regex_escape(sourcedir.replace('\\', '/'))),
+            ('%{/p:regex_replacement}',
+             regex_escape(sourcedir.replace('\\', '/'))),
+            ('%{/t:regex_replacement}',
+             regex_escape(tmpBase.replace('\\', '/')) + '.tmp'),
+            ('%{/T:regex_replacement}',
+             regex_escape(tmpDir.replace('\\', '/'))),
             ])
 
     # "%:[STpst]" are normalized paths without colons and without a leading
@@ -1130,6 +1135,18 @@
             ])
     return substitutions
 
+def _memoize(f):
+    cache = {}  # Intentionally unbounded, see applySubstitutions()
+    def memoized(x):
+        if x not in cache:
+            cache[x] = f(x)
+        return cache[x]
+    return memoized
+
+ at _memoize
+def _caching_re_compile(r):
+    return re.compile(r)
+
 def applySubstitutions(script, substitutions):
     """Apply substitutions to the script.  Allow full regular expression syntax.
     Replace each matching occurrence of regular expression pattern a with
@@ -1139,7 +1156,14 @@
         for a,b in substitutions:
             if kIsWindows:
                 b = b.replace("\\","\\\\")
-            ln = re.sub(a, b, ln)
+            # re.compile() has a built-in LRU cache with 512 entries. In some
+            # test suites lit ends up thrashing that cache, which made e.g.
+            # check-llvm run 50% slower.  Use an explicit, unbounded cache
+            # to prevent that from happening.  Since lit is fairly
+            # short-lived, since the set of substitutions is fairly small, and
+            # since thrashing has such bad consequences, not bounding the cache
+            # seems reasonable.
+            ln = _caching_re_compile(a).sub(b, ln)
 
         # Strip the trailing newline and any extra whitespace.
         return ln.strip()


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D72273.236366.patch
Type: text/x-patch
Size: 2679 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200106/c261a0c2/attachment.bin>


More information about the llvm-commits mailing list