[Lldb-commits] [lldb] a8abb69 - [lldb] Parallelize fetching symbol files in crashlog.py
Jonas Devlieghere via lldb-commits
lldb-commits at lists.llvm.org
Fri May 13 12:25:45 PDT 2022
Author: Jonas Devlieghere
Date: 2022-05-13T12:25:41-07:00
New Revision: a8abb695859ad4e7fe695b9ee238a2b0cd00af7c
URL: https://github.com/llvm/llvm-project/commit/a8abb695859ad4e7fe695b9ee238a2b0cd00af7c
DIFF: https://github.com/llvm/llvm-project/commit/a8abb695859ad4e7fe695b9ee238a2b0cd00af7c.diff
LOG: [lldb] Parallelize fetching symbol files in crashlog.py
When using dsymForUUID, most of the time spent symbolicating a crashlog
with crashlog.py goes to waiting for dsymForUUID to complete. Currently, we're
calling dsymForUUID sequentially when iterating over the modules. We can
drastically cut down this time by calling dsymForUUID in parallel. This
patch uses Python's ThreadPoolExecutor (introduced in Python 3.2) to
parallelize this IO-bound operation.
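
As a rough illustration of the pattern (not the code from this patch), the sketch below submits one blocking lookup per UUID to a ThreadPoolExecutor and collects the results as they complete. The names fetch_symbols and fetch_all, the simulated delay, and the returned path are hypothetical stand-ins; the real script shells out to dsymForUUID and parses its plist output. A shared lock keeps progress messages from different worker threads from interleaving.

```python
import concurrent.futures
import threading
import time

# Shared lock so that progress messages from worker threads do not interleave.
print_lock = threading.RLock()


def fetch_symbols(uuid_str, delay=0.05):
    """Hypothetical stand-in for a blocking, IO-bound dsymForUUID lookup."""
    with print_lock:
        print('Getting symbols for %s...' % uuid_str)
    time.sleep(delay)  # Simulates waiting on the external tool.
    # The path below is made up for illustration only.
    return uuid_str, '/hypothetical/%s.dSYM' % uuid_str


def fetch_all(uuids):
    """Run one lookup per UUID in a thread pool and gather the results."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(fetch_symbols, u) for u in uuids]
        # as_completed yields futures in completion order, not submission order.
        for future in concurrent.futures.as_completed(futures):
            uuid_str, path = future.result()
            results[uuid_str] = path
    return results


if __name__ == '__main__':
    print(fetch_all(['AAAA', 'BBBB', 'CCCC']))
```

Threads suit this case because the workers spend their time blocked on an external process rather than executing Python code, so the global interpreter lock is not the bottleneck.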
The performance improvement is hard to benchmark, because even with an
empty local cache, consecutive calls to dsymForUUID for the same UUID
complete faster. With warm caches, I'm seeing a ~30% performance
improvement (~90s -> ~60s). I suspect the gains will be much bigger for
a cold cache.
dsymForUUID supports batching up multiple UUIDs. I considered going that
route, but that would require more intrusive changes. It would require
hoisting the logic out of locate_module_and_debug_symbols, which we
explicitly document [1] as a feature of Symbolication.py to locate
symbol files.
[1] https://lldb.llvm.org/use/symbolication.html
Differential revision: https://reviews.llvm.org/D125107
Added:
Modified:
lldb/examples/python/crashlog.py
Removed:
################################################################################
diff --git a/lldb/examples/python/crashlog.py b/lldb/examples/python/crashlog.py
index bc34bf75f8b19..0bcdcd0a9fe68 100755
--- a/lldb/examples/python/crashlog.py
+++ b/lldb/examples/python/crashlog.py
@@ -26,8 +26,8 @@
# PYTHONPATH=/path/to/LLDB.framework/Resources/Python ./crashlog.py ~/Library/Logs/DiagnosticReports/a.crash
#----------------------------------------------------------------------
-from __future__ import print_function
import cmd
+import concurrent.futures
import contextlib
import datetime
import glob
@@ -41,9 +41,13 @@
import string
import subprocess
import sys
+import threading
import time
import uuid
+
+print_lock = threading.RLock()
+
try:
# First try for LLDB in case PYTHONPATH is already correctly setup.
import lldb
@@ -269,7 +273,8 @@ def locate_module_and_debug_symbols(self):
self.resolved = True
uuid_str = self.get_normalized_uuid_string()
if self.show_symbol_progress():
- print('Getting symbols for %s %s...\n' % (uuid_str, self.path), end=' ')
+ with print_lock:
+ print('Getting symbols for %s %s...' % (uuid_str, self.path))
if os.path.exists(self.dsymForUUIDBinary):
dsym_for_uuid_command = '%s %s' % (
self.dsymForUUIDBinary, uuid_str)
@@ -278,7 +283,8 @@ def locate_module_and_debug_symbols(self):
try:
plist_root = read_plist(s)
except:
- print(("Got exception: ", sys.exc_info()[1], " handling dsymForUUID output: \n", s))
+ with print_lock:
+ print(("Got exception: ", sys.exc_info()[1], " handling dsymForUUID output: \n", s))
raise
if plist_root:
plist = plist_root[uuid_str]
@@ -306,7 +312,8 @@ def locate_module_and_debug_symbols(self):
if not os.path.exists(dwarf_dir):
# Not a dSYM bundle, probably an Xcode archive.
continue
- print('falling back to binary inside "%s"' % dsym)
+ with print_lock:
+ print('falling back to binary inside "%s"' % dsym)
self.symfile = dsym
for filename in os.listdir(dwarf_dir):
self.path = os.path.join(dwarf_dir, filename)
@@ -319,7 +326,8 @@ def locate_module_and_debug_symbols(self):
pass
if (self.resolved_path and os.path.exists(self.resolved_path)) or (
self.path and os.path.exists(self.path)):
- print('Resolved symbols for %s %s...\n' % (uuid_str, self.path), end=' ')
+ with print_lock:
+ print('Resolved symbols for %s %s...' % (uuid_str, self.path))
return True
else:
self.unavailable = True
@@ -978,9 +986,16 @@ def SymbolicateCrashLog(crash_log, options):
else:
print('error: can\'t find image for identifier "%s"' % ident)
- for image in images_to_load:
- if image not in loaded_images:
- err = image.add_module(target)
+ futures = []
+ with concurrent.futures.ThreadPoolExecutor() as executor:
+ def add_module(image, target):
+ return image, image.add_module(target)
+
+ for image in images_to_load:
+ futures.append(executor.submit(add_module, image=image, target=target))
+
+ for future in concurrent.futures.as_completed(futures):
+ image, err = future.result()
if err:
print(err)
else: