[PATCH] D113153: [lld-macho] Cache readFile results
Keith Smiley via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 3 17:17:10 PDT 2021
keith created this revision.
keith added reviewers: int3, smeenai.
Herald added a reviewer: gkm.
Herald added a project: lld-macho.
Herald added a reviewer: lld-macho.
keith requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
In one of our links lld was reading 760k files, but the unique number of
files was only 1500. This takes that link from 30 seconds to 8.
This seems like a heavy hammer, especially since some things don't need
to be cached, like the filelist arguments and the passed static
archives (the latter is already cached as a one off), but it seems ld64
does something similar here to short circuit these duplicate reads:
https://github.com/keith/ld64/blob/82e429e186488529111b0ef86af33a3b1b9438c7/src/ld/InputFiles.cpp#L644-L665
Of the types of files being read for our iOS app, the biggest problem
was constantly re-reading small tbd files:
% wc -l /tmp/read.txt
761414 /tmp/read.txt
% cat /tmp/read.txt | sort -u | wc -l
1503
% cat /tmp/read.txt | grep "\.a$" | wc -l
43721
% cat /tmp/read.txt | grep "\.tbd$" | wc -l
717656
We could likely hoist this logic up to not cache at this level, but it
would be a more invasive change to make sure all callers that needed it
cached the results.
I could see this being an issue with OOMs, and I'm not a linker expert so
maybe there's another way we should solve this problem? Feedback welcome!
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D113153
Files:
lld/MachO/InputFiles.cpp
Index: lld/MachO/InputFiles.cpp
===================================================================
--- lld/MachO/InputFiles.cpp
+++ lld/MachO/InputFiles.cpp
@@ -173,8 +173,14 @@
return true;
}
+static DenseMap<CachedHashStringRef, MemoryBufferRef> resolvedReads;
// Open a given file path and return it as a memory-mapped file.
Optional<MemoryBufferRef> macho::readFile(StringRef path) {
+ CachedHashStringRef key(path);
+ auto entry = resolvedReads.find(key);
+ if (entry != resolvedReads.end())
+ return entry->second;
+
ErrorOr<std::unique_ptr<MemoryBuffer>> mbOrErr = MemoryBuffer::getFile(path);
if (std::error_code ec = mbOrErr.getError()) {
error("cannot open " + path + ": " + ec.message());
@@ -192,7 +198,7 @@
read32be(&hdr->magic) != FAT_MAGIC) {
if (tar)
tar->append(relativeToRoot(path), mbref.getBuffer());
- return mbref;
+ return resolvedReads[key] = mbref;
}
// Object files and archive files may be fat files, which contain multiple
@@ -217,7 +223,8 @@
error(path + ": slice extends beyond end of file");
if (tar)
tar->append(relativeToRoot(path), mbref.getBuffer());
- return MemoryBufferRef(StringRef(buf + offset, size), path.copy(bAlloc));
+ return resolvedReads[key] = MemoryBufferRef(StringRef(buf + offset, size),
+ path.copy(bAlloc));
}
error("unable to find matching architecture in " + path);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D113153.384617.patch
Type: text/x-patch
Size: 1460 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211104/39bc2dc8/attachment.bin>
More information about the llvm-commits
mailing list