[lld] [lld][MachO]Multi-threaded i/o. Twice as fast linking a large project. (PR #147134)
John Holdsworth via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 15 08:40:02 PDT 2025
================
@@ -282,11 +284,84 @@ static void saveThinArchiveToRepro(ArchiveFile const *file) {
": Archive::children failed: " + toString(std::move(e)));
}
-static InputFile *addFile(StringRef path, LoadType loadType,
- bool isLazy = false, bool isExplicit = true,
- bool isBundleLoader = false,
- bool isForceHidden = false) {
- std::optional<MemoryBufferRef> buffer = readFile(path);
+class DeferredFile {
+public:
+ StringRef path;
+ bool isLazy;
+ MemoryBufferRef buffer;
+};
+using DeferredFiles = std::vector<DeferredFile>;
+
+// Most input files have been mapped but not yet paged in.
+// This code forces the page-ins on multiple threads so
+// the process is not stalled waiting on disk buffer i/o.
+void multiThreadedPageInBackground(const DeferredFiles &deferred) {
+ static size_t pageSize = Process::getPageSizeEstimate(), totalBytes;
+ static std::mutex mutex;
+ size_t index = 0;
+
+ parallelFor(0, config->readThreads, [&](size_t I) {
+ while (true) {
+ mutex.lock();
+ if (index >= deferred.size()) {
+ mutex.unlock();
+ return;
+ }
+ const StringRef &buff = deferred[index].buffer.getBuffer();
+ totalBytes += buff.size();
+ index += 1;
+ mutex.unlock();
+
+ volatile int t = 0; // Reference each page to load it into memory.
+ for (const char *page = buff.data(), *end = page + buff.size();
+ page < end; page += pageSize)
+ t += *page;
+ }
+ });
+
+ if (getenv("LLD_MULTI_THREAD_PAGE"))
+ llvm::dbgs() << "multiThreadedPageIn " << totalBytes << "/"
+ << deferred.size() << "\n";
+}
+
+static void
+multiThreadedPageIn(const DeferredFiles &deferred = DeferredFiles()) {
+ static std::thread *running;
+ static std::mutex mutex;
+ static std::deque<DeferredFiles *> queue;
+
+ mutex.lock();
+ if (running && (queue.empty() || deferred.empty())) {
+ running->join();
+ delete running;
+ running = nullptr;
+ }
+
+ if (!deferred.empty()) {
+ queue.emplace_back(new DeferredFiles(deferred));
+ if (!running)
+ running = new std::thread([&]() {
+ mutex.lock();
+ while (!queue.empty()) {
+ DeferredFiles *deferred = queue.front();
+ mutex.unlock();
+ multiThreadedPageInBackground(*deferred);
+ delete deferred;
+ mutex.lock();
+ queue.pop_front();
----------------
johnno1962 wrote:
I've reverted to a while(true) loop as it is less obscure. My previous loop left the pointer to the deferred vector in the queue while it was being processed, which avoided unnecessarily reaping and restarting a thread. I could have just moved the `delete deferred` to after the queue had been popped so as not to leave an invalid pointer in the queue, as that was a valid criticism. But none of this matters now. It's a surprisingly subtle function to flesh out. My final version was:
```C++
      running = new std::thread([&]() {
        mutex.lock();
        while (!queue.empty()) {
          DeferredFiles *deferred = queue.front();
          mutex.unlock();
          multiThreadedPageInBackground(*deferred);
          mutex.lock();
          queue.pop_front();
          delete deferred;
        }
        mutex.unlock();
      });
```
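
For comparison, here is a minimal standalone sketch of the same single-worker queue pattern, not the code in the PR. It assumes a single producer thread, and the names `Batch`, `BatchWorker`, `enqueue`, `wait` and `active` are hypothetical. It pops each batch before processing it and holds it in a `std::unique_ptr`, so the queue never contains a pointer to freed memory, and it uses an explicit `active` flag (rather than a non-empty queue) to tell the producer that the worker is still alive and need not be restarted.

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for DeferredFiles: one batch of work items.
using Batch = std::vector<int>;

// Sketch only: one background thread drains a queue of batches.
// enqueue() and wait() are assumed to be called from one producer thread.
class BatchWorker {
public:
  explicit BatchWorker(std::function<void(const Batch &)> fn)
      : handler(std::move(fn)) {}
  ~BatchWorker() { wait(); }

  // Queue a batch; start a worker only if none is currently draining.
  void enqueue(Batch batch) {
    std::unique_lock<std::mutex> lock(mutex);
    queue.push_back(std::make_unique<Batch>(std::move(batch)));
    if (active)
      return; // The live worker will pick this batch up.
    // A previous worker has cleared `active` under the lock and is exiting
    // without touching the mutex again, so joining here cannot deadlock.
    if (thread.joinable())
      thread.join();
    active = true;
    thread = std::thread([this] { run(); });
  }

  // Block until everything queued so far has been processed.
  void wait() {
    if (thread.joinable())
      thread.join();
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex);
    while (!queue.empty()) {
      // Take ownership and pop before unlocking, so the queue never
      // holds a pointer to a batch that is being processed or freed.
      std::unique_ptr<Batch> batch = std::move(queue.front());
      queue.pop_front();
      lock.unlock();
      handler(*batch); // Slow work happens outside the lock.
      lock.lock();
    }
    active = false; // Cleared under the lock, just before exiting.
  }

  std::mutex mutex;
  std::deque<std::unique_ptr<Batch>> queue;
  std::thread thread;
  bool active = false;
  std::function<void(const Batch &)> handler;
};
```

The trade-off against the version above is one extra flag in exchange for not needing a manual `delete` or any ordering between `pop_front()` and the deallocation.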
https://github.com/llvm/llvm-project/pull/147134