[llvm] [llvm-nm] Improve performance while faking symbols from function starts (PR #162755)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 9 17:34:23 PDT 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-llvm-binary-utilities
Author: Daniel Rodríguez Troitiño (drodriguez)
<details>
<summary>Changes</summary>
By default, `nm` looks into `LC_FUNCTION_STARTS` for binaries that have the flag `MH_NLIST_OUTOFSYNC_WITH_DYLDINFO` set, unless the `--no-dyldinfo` flag is passed.
The implementation that looked up those function starts in the symbol list was a nested loop that rescanned the whole symbol list for each `LC_FUNCTION_STARTS` entry. For binaries with a couple million function starts and hundreds of thousands of symbols, the nested loop runs for hours even on powerful machines and does not seem to finish.
Instead of the nested loop, trade memory for time: add the addresses of all the symbols to a set, which can then be checked very quickly for each `LC_FUNCTION_STARTS` entry. What took hours and did not seem to finish now takes less than 10 seconds.
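As a rough illustration of the approach (not the patch itself), here is a standalone sketch that uses `std::unordered_set` in place of the LLVM set type so that it compiles without LLVM headers. The `Symbol` struct and the `missingFunctionStarts` helper are hypothetical names introduced only for this example. With N symbols and M function starts, the nested loop performs up to N × M comparisons, while the set-based version does N insertions plus M lookups, each expected constant time.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm-nm's symbol record; only the address matters.
struct Symbol {
  uint64_t Address;
};

// Returns the function-start addresses that have no symbol at the same address.
// FoundFns holds offsets relative to BaseSegmentAddress, as in the patch.
std::vector<uint64_t>
missingFunctionStarts(const std::vector<Symbol> &SymbolList,
                      const std::vector<uint64_t> &FoundFns,
                      uint64_t BaseSegmentAddress) {
  // Build the address set once: one pass over the symbol list.
  std::unordered_set<uint64_t> SymbolAddresses;
  SymbolAddresses.reserve(SymbolList.size());
  for (const Symbol &S : SymbolList)
    SymbolAddresses.insert(S.Address);

  // One hash lookup per function start instead of a full rescan.
  std::vector<uint64_t> Missing;
  for (uint64_t Offset : FoundFns) {
    uint64_t Address = Offset + BaseSegmentAddress;
    if (SymbolAddresses.count(Address) == 0)
      Missing.push_back(Address);
  }
  return Missing;
}
```

In the real tool, the addresses reported as missing are the ones for which a fake nlist entry is created.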
Fixes #<!-- -->93944
---
Full diff: https://github.com/llvm/llvm-project/pull/162755.diff
1 Files Affected:
- (modified) llvm/tools/llvm-nm/llvm-nm.cpp (+5-5)
``````````diff
diff --git a/llvm/tools/llvm-nm/llvm-nm.cpp b/llvm/tools/llvm-nm/llvm-nm.cpp
index ff07fbbaa5351..1a0d045d8daa3 100644
--- a/llvm/tools/llvm-nm/llvm-nm.cpp
+++ b/llvm/tools/llvm-nm/llvm-nm.cpp
@@ -16,6 +16,7 @@
//===----------------------------------------------------------------------===//
#include "llvm/ADT/StringSwitch.h"
+#include "llvm/ADT/SmallSet.h"
#include "llvm/BinaryFormat/COFF.h"
#include "llvm/BinaryFormat/MachO.h"
#include "llvm/BinaryFormat/XCOFF.h"
@@ -1615,12 +1616,11 @@ static void dumpSymbolsFromDLInfoMachO(MachOObjectFile &MachO,
}
// See if these addresses are already in the symbol table.
unsigned FunctionStartsAdded = 0;
+ SmallSet<uint64_t, 32> SymbolAddresses;
+ for (unsigned J = 0; J < SymbolList.size(); ++J)
+ SymbolAddresses.insert(SymbolList[J].Address);
for (uint64_t f = 0; f < FoundFns.size(); f++) {
- bool found = false;
- for (unsigned J = 0; J < SymbolList.size() && !found; ++J) {
- if (SymbolList[J].Address == FoundFns[f] + BaseSegmentAddress)
- found = true;
- }
+ bool found = SymbolAddresses.contains(FoundFns[f] + BaseSegmentAddress);
// See this address is not already in the symbol table fake up an
// nlist for it.
if (!found) {
``````````
</details>
https://github.com/llvm/llvm-project/pull/162755