[lld] [lld][MachO]Multi-threaded i/o. Twice as fast linking a large project. (PR #147134)

Mon Sep 8 09:13:37 PDT 2025

================
@@ -282,11 +284,117 @@ static void saveThinArchiveToRepro(ArchiveFile const *file) {
           ": Archive::children failed: " + toString(std::move(e)));
 }
 
-static InputFile *addFile(StringRef path, LoadType loadType,
-                          bool isLazy = false, bool isExplicit = true,
-                          bool isBundleLoader = false,
-                          bool isForceHidden = false) {
-  std::optional<MemoryBufferRef> buffer = readFile(path);
+struct DeferredFile {
+  StringRef path;
+  bool isLazy;
+  MemoryBufferRef buffer;
+};
+using DeferredFiles = std::vector<DeferredFile>;
+
+class SerialBackgroundQueue {
+  std::deque<std::function<void()>> queue;
+  std::thread *running;
+  std::mutex mutex;
+
+public:
+  void queueWork(std::function<void()> work) {
+    mutex.lock();
+    if (running && queue.empty()) {
+      mutex.unlock();
+      running->join();
+      mutex.lock();
+      delete running;
+      running = nullptr;
+    }
+
+    if (work) {
+      queue.emplace_back(std::move(work));
+      if (!running)
+        running = new std::thread([&]() {
+          while (true) {
+            mutex.lock();
+            if (queue.empty()) {
+              mutex.unlock();
+              break;
+            }
+            auto work = std::move(queue.front());
+            mutex.unlock();
+            work();
+            mutex.lock();
+            queue.pop_front();
+            mutex.unlock();
+          }
+        });
+    }
+    mutex.unlock();
+  }
+};
+
+// Most input files have been mapped but not yet paged in.
+// This code forces the page-ins on multiple threads so
+// the process is not stalled waiting on disk buffer i/o.
+void multiThreadedPageInBackground(DeferredFiles &deferred) {
----------------
ellishg wrote:

I just tested `madvise(..., MADV_WILLNEED)` on macOS and found the performance to be about the same, but I believe it makes the code's intent much clearer.

I was thinking there could be a new function, modeled on the existing `MemoryBuffer::dontNeedIfMmap()` and perhaps called `MemoryBuffer::prefetch()`, that calls `madvise()` or `PrefetchVirtualMemory` for you.

https://llvm.org/doxygen/classllvm_1_1MemoryBuffer.html#a84540ead6f0846d050a11c007f892f00
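
To make the suggestion concrete, here is a rough sketch of what the POSIX half of such a helper might look like. The name `prefetchRange` is hypothetical and is not an existing LLVM API; the sketch assumes a POSIX system with `madvise()` and leaves the Windows `PrefetchVirtualMemory` path out entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not part of LLVM): hint to the kernel that the bytes
// in [addr, addr + len) will be read soon. Returns true if the hint was
// accepted or there was nothing to do.
bool prefetchRange(const void *addr, std::size_t len) {
  if (!addr || len == 0)
    return true; // Nothing to prefetch.

  // madvise() wants a page-aligned start address, so round down to the
  // containing page and grow the length by the same amount.
  const auto pageSize = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
  const auto start = reinterpret_cast<std::uintptr_t>(addr);
  const std::uintptr_t alignedStart = start & ~(pageSize - 1);
  const std::size_t alignedLen = len + (start - alignedStart);

  // MADV_WILLNEED is only advisory: the kernel may start read-ahead for the
  // range, but it is free to ignore the hint.
  return madvise(reinterpret_cast<void *>(alignedStart), alignedLen,
                 MADV_WILLNEED) == 0;
}
```

A `MemoryBuffer::prefetch()` along these lines would presumably pass the buffer's start pointer and size, and only do so when the buffer is actually backed by a memory mapping, the same way `dontNeedIfMmap()` is a no-op otherwise.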

https://github.com/llvm/llvm-project/pull/147134