[Lldb-commits] [lldb] [lldb] Reduce the frequency of DWARF index progress reporting (PR #118953)

Fri Dec 6 02:49:13 PST 2024

https://github.com/labath created https://github.com/llvm/llvm-project/pull/118953

Indexing a single DWARF unit is a relatively fast operation, particularly if it's a type unit, which can be very small. Reporting progress takes a mutex (and allocates memory, etc.), which creates a lot of contention and slows down indexing noticeably.

This patch makes us report progress only once per 10 milliseconds (on average), which speeds up indexing by up to 55%. It achieves this by checking the time after indexing every unit. This creates the possibility that a particularly large unit could cause us to stop reporting progress for a while (even for units that have already been indexed), but I don't think this is likely to happen, because:
- Even the largest units don't take that long to index. The largest unit in lldb (4MB of .debug_info) was indexed in "only" 200ms.
- The time is being checked and reported by all worker threads, which means that in order to stall, we'd have to be very unfortunate and pick up an extremely large compile unit on all indexing threads simultaneously.

Even if that does happen, the only negative consequence is some jitteriness in a progress bar, which is why I prefer this over alternative implementations which e.g. involve reporting progress from a dedicated thread.
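
For readers who want to try the throttling scheme outside of lldb, here is a minimal standalone sketch of the same idea. It is an illustration under assumptions, not the code from this patch: `ProgressSink` and `IndexWithThrottledProgress` are hypothetical names, the sink is a plain mutex-protected counter standing in for lldb's progress object, and `std::thread` replaces the thread pool task group.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a progress object whose updates take a mutex.
struct ProgressSink {
  std::mutex mutex;
  size_t total = 0; // Sum of all reported increments.
  size_t calls = 0; // Number of times Increment was called.

  void Increment(size_t amount) {
    std::lock_guard<std::mutex> lock(mutex);
    total += amount;
    ++calls;
  }
};

// Calls index_unit(i) for every i in [0, num_units) from num_threads workers.
// Completed units are counted in a shared atomic and flushed to the sink only
// when a worker's own report deadline has passed.
void IndexWithThrottledProgress(
    size_t num_units, size_t num_threads,
    const std::function<void(size_t)> &index_unit, ProgressSink &progress,
    std::chrono::milliseconds interval = std::chrono::milliseconds(10)) {
  assert(num_threads > 0);
  std::atomic<size_t> next_unit_idx{0};
  std::atomic<size_t> units_indexed{0};

  auto worker = [&](size_t worker_id) {
    // Stagger the first deadline per worker so that, taken together, the
    // workers produce roughly one report per interval.
    std::chrono::steady_clock::time_point next_report =
        std::chrono::steady_clock::now() +
        interval * static_cast<int>(1 + worker_id);
    size_t unit_idx;
    while ((unit_idx = next_unit_idx.fetch_add(
                1, std::memory_order_relaxed)) < num_units) {
      index_unit(unit_idx);

      units_indexed.fetch_add(1, std::memory_order_acq_rel);
      auto now = std::chrono::steady_clock::now();
      if (now >= next_report) {
        // Flush everything counted so far, including other workers' units.
        progress.Increment(
            units_indexed.exchange(0, std::memory_order_acq_rel));
        next_report = now + interval * static_cast<int>(num_threads);
      }
    }
  };

  std::vector<std::thread> threads;
  for (size_t i = 0; i < num_threads; ++i)
    threads.emplace_back(worker, i);
  for (std::thread &t : threads)
    t.join();

  // Report whatever was counted after the last timed flush.
  progress.Increment(units_indexed.load(std::memory_order_acquire));
}
```

The clock is read only after a unit finishes, so a worker stuck in a large unit cannot report. Any other worker that reaches its deadline flushes the shared counter on its behalf, which is why a stall needs all workers to be busy at once.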

From 27c248a5f28f57ec3b0d2e5e191a88330f158e17 Mon Sep 17 00:00:00 2001
From: Pavel Labath <pavel at labath.sk>
Date: Thu, 5 Dec 2024 13:07:13 +0100
Subject: [PATCH] [lldb] Reduce the frequency of DWARF index progress reporting

Indexing a single DWARF unit is a relatively fast operation,
particularly if it's a type unit, which can be very small. Reporting
progress takes a mutex (and allocates memory, etc.), which creates a lot
of contention and slows down indexing noticeably.

This patch makes us report progress only once per 10
milliseconds (on average), which speeds up indexing by up to 55%. It
achieves this by checking the time after indexing every unit.
This creates the possibility that a particularly large unit could cause
us to stop reporting progress for a while (even for units that have
already been indexed), but I don't think this is likely to happen,
because:
- Even the largest units don't take that long to index. The largest unit
  in lldb (4MB of .debug_info) was indexed in "only" 200ms.
- The time is being checked and reported by all worker threads, which
  means that in order to stall, we'd have to be very unfortunate and
  pick up an extremely large compile unit on all indexing threads
  simultaneously.

Even if that does happen, the only negative consequence is some
jitteriness in a progress bar, which is why I prefer this over
alternative implementations which e.g. involve reporting progress from a
dedicated thread.
---
 .../SymbolFile/DWARF/ManualDWARFIndex.cpp     | 31 ++++++++++++++-----
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
index 5b325e30bef430..a3e595d0194eb9 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
@@ -24,6 +24,7 @@
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/ThreadPool.h"
 #include <atomic>
+#include <chrono>
 #include <optional>
 
 using namespace lldb_private;
@@ -91,14 +92,27 @@ void ManualDWARFIndex::Index() {
   // are available. This is significantly faster than submiting a new task for
   // each unit.
   auto for_each_unit = [&](auto &&fn) {
-    std::atomic<size_t> next_cu_idx = 0;
-    auto wrapper = [&fn, &next_cu_idx, &units_to_index,
-                    &progress](size_t worker_id) {
-      size_t cu_idx;
-      while ((cu_idx = next_cu_idx.fetch_add(1, std::memory_order_relaxed)) <
-             units_to_index.size()) {
-        fn(worker_id, cu_idx, units_to_index[cu_idx]);
-        progress.Increment();
+    std::atomic<size_t> next_unit_idx = 0;
+    std::atomic<size_t> units_indexed = 0;
+    auto wrapper = [&fn, &next_unit_idx, &units_indexed, &units_to_index,
+                    &progress, num_threads](size_t worker_id) {
+      constexpr auto progress_interval = std::chrono::milliseconds(10);
+
+      // Stagger the reports for different threads so we get a steady stream of
+      // one report per ~10ms.
+      auto next_report = std::chrono::steady_clock::now() +
+                         progress_interval * (1 + worker_id);
+      size_t unit_idx;
+      while ((unit_idx = next_unit_idx.fetch_add(
+                  1, std::memory_order_relaxed)) < units_to_index.size()) {
+        fn(worker_id, unit_idx, units_to_index[unit_idx]);
+
+        units_indexed.fetch_add(1, std::memory_order_acq_rel);
+        if (auto now = std::chrono::steady_clock::now(); now >= next_report) {
+          progress.Increment(
+              units_indexed.exchange(0, std::memory_order_acq_rel));
+          next_report = now + num_threads * progress_interval;
+        }
       }
     };
 
@@ -106,6 +120,7 @@ void ManualDWARFIndex::Index() {
       task_group.async(wrapper, i);
 
     task_group.wait();
+    progress.Increment(units_indexed.load(std::memory_order_acquire));
   };
 
   // Extract dies for all DWARFs unit in parallel.  Figure out which units