[lld] [lld] check cache before real_path in loadDylib (PR #140791)
Richard Howell via llvm-commits
llvm-commits at lists.llvm.org
Tue May 20 12:56:07 PDT 2025
https://github.com/rmaz created https://github.com/llvm/llvm-project/pull/140791
In #137649 symlink resolution was added when loading dylibs. This introduced a performance regression when linking with a large number of inputs with LC_LOAD_DYLIB commands due to the syscall overhead of realpath.
Refactor the change to be closer to the original:
- first check if the given path is in the cache
- if not, resolve it and check again
- update cache entries of both paths to point to the same dylib
This mitigates the regression as we do not incur the realpath cost for every loadDylib call, only once per unique path.
>From 1a50d046e4ff0e5e9210cabec6d00bd309af960b Mon Sep 17 00:00:00 2001
From: Richard Howell <rhow at meta.com>
Date: Tue, 20 May 2025 12:39:09 -0700
Subject: [PATCH] [lld] check cache before real_path in loadDylib
In #137649 symlink resolution was added when loading dylibs. This
introduced a performance regression when linking with a large
number of inputs with LC_LOAD_DYLIB commands due to the syscall
overhead of realpath.
Refactor the change to be closer to the original:
- first check if the given path is in the cache
- if not, resolve it and check again
- update cache entries of both paths to point to the same dylib
This mitigates the regression as we do not incur the realpath cost
for every loadDylib call, only once per unique path.
---
lld/MachO/DriverUtils.cpp | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/lld/MachO/DriverUtils.cpp b/lld/MachO/DriverUtils.cpp
index cf874018fa34b..14d60eb4cfa81 100644
--- a/lld/MachO/DriverUtils.cpp
+++ b/lld/MachO/DriverUtils.cpp
@@ -229,12 +229,7 @@ static DenseMap<CachedHashStringRef, DylibFile *> loadedDylibs;
DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
bool isBundleLoader, bool explicitlyLinked) {
- // Frameworks can be found from different symlink paths, so resolve
- // symlinks before looking up in the dylib cache.
- SmallString<128> realPath;
- std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
- CachedHashStringRef path(!err ? uniqueSaver().save(StringRef(realPath))
- : mbref.getBufferIdentifier());
+ CachedHashStringRef path(mbref.getBufferIdentifier());
DylibFile *&file = loadedDylibs[path];
if (file) {
if (explicitlyLinked)
@@ -242,6 +237,23 @@ DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
return file;
}
+ // Frameworks can be found from different symlink paths, so resolve
+ // symlinks and look up in the dylib cache.
+ DylibFile *&realfile = file;
+ SmallString<128> realPath;
+ std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
+ if (!err) {
+ CachedHashStringRef resolvedPath(uniqueSaver().save(StringRef(realPath)));
+ realfile = loadedDylibs[resolvedPath];
+ if (realfile) {
+ if (explicitlyLinked)
+ realfile->setExplicitlyLinked();
+
+ file = realfile;
+ return realfile;
+ }
+ }
+
DylibFile *newFile;
file_magic magic = identify_magic(mbref.getBuffer());
if (magic == file_magic::tapi_file) {
@@ -253,6 +265,7 @@ DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
}
file =
make<DylibFile>(**result, umbrella, isBundleLoader, explicitlyLinked);
+ realfile = file;
// parseReexports() can recursively call loadDylib(). That's fine since
// we wrote the DylibFile we just loaded to the loadDylib cache via the
@@ -268,6 +281,7 @@ DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
magic == file_magic::macho_executable ||
magic == file_magic::macho_bundle);
file = make<DylibFile>(mbref, umbrella, isBundleLoader, explicitlyLinked);
+ realfile = file;
// parseLoadCommands() can also recursively call loadDylib(). See comment
// in previous block for why this means we must copy `file` here.
More information about the llvm-commits
mailing list