[lld] [ELF] Fix unnecessary inclusion of unreferenced provide symbols (PR #84512)

via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 18 02:32:11 PDT 2024


================
@@ -2659,6 +2659,58 @@ static void postParseObjectFile(ELFFileBase *file) {
   }
 }
 
+// Returns true if the provide symbol should be added to the link.
+bool shouldAddProvideSym(StringRef symName) {
+  Symbol *b = symtab.find(symName);
+  return b && !b->isDefined() && !b->isCommon();
+}
+
+// Add symbols referred by the provide symbol to the symbol table.
+// This function must only be called for provide symbols that should be added
+// to the link.
+static void
+addProvideSymReferences(StringRef provideSym,
+                        llvm::StringSet<> &addedRefsFromProvideSym) {
+
+  if (addedRefsFromProvideSym.count(provideSym))
+    return;
+  assert(shouldAddProvideSym(provideSym) &&
+         "This function must only be called for provide symbols that should be "
+         "added to the link.");
+  addedRefsFromProvideSym.insert(provideSym);
+  for (StringRef name : script->provideMap[provideSym].keys()) {
+    Symbol *sym = addUnusedUndefined(name);
+    sym->isUsedInRegularObj = true;
+    sym->referenced = true;
+    script->referencedSymbols.push_back(name);
+    if (script->provideMap.count(name) && shouldAddProvideSym(name) &&
+        !addedRefsFromProvideSym.count(name))
+      addProvideSymReferences(name, addedRefsFromProvideSym);
+  }
+}
+
+// Add symbols that are referenced in the linker script.
+// Symbols referenced in a PROVIDE command are only added to the symbol table if
+// the PROVIDE command actually provides the symbol.
+static void addScriptReferencedSymbols() {
+  // Some symbols (such as __ehdr_start) are defined lazily only when there
+  // are undefined symbols for them, so we add these to trigger that logic.
+  for (StringRef name : script->referencedSymbols) {
+    Symbol *sym = addUnusedUndefined(name);
+    sym->isUsedInRegularObj = true;
+    sym->referenced = true;
+  }
+
+  // Keeps track of references from which PROVIDE symbols have been added to the
+  // symbol table.
+  llvm::StringSet<> addedRefsFromProvideSym;
+  for (StringRef provideSym : script->provideMap.keys()) {
----------------
partaror wrote:

Thank you for the comment. I had not considered non-determinism before. I have now changed the code to use `MapVector<StringRef, SmallVector<StringRef, 0>>`, which iterates in insertion order.

I want to add one thing: a set is more expensive than a vector, but it can help avoid computations that cost even more. For example, we currently store the symbols referenced in the linker script in `SmallVector<llvm::StringRef, 0> LinkerScript::referencedSymbols;`. Each symbol in this list needs some processing, such as being added to the symbol table. If a symbol `S` is referenced 10 times in the linker script, it appears in `referencedSymbols` 10 times and is therefore processed 10 times. In such a case, would it be better to use a set instead of a vector?
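
To make the trade-off concrete, here is a minimal sketch in standard C++ of an insertion-ordered set, so that each referenced symbol is recorded once while iteration order stays deterministic. `OrderedNameSet` and `collectReferences` are hypothetical names used only for illustration and are not part of lld; within LLVM, `llvm::SetVector` offers similar behavior.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set (similar in spirit to
// llvm::SetVector). Not lld code.
class OrderedNameSet {
public:
  // Returns true if `name` was newly inserted, false if already present.
  bool insert(std::string_view name) {
    if (!seen.emplace(name).second)
      return false;
    order.emplace_back(name);
    return true;
  }

  // Unique names in first-seen order, independent of hash values.
  const std::vector<std::string> &names() const { return order; }

private:
  std::unordered_set<std::string> seen; // membership test only
  std::vector<std::string> order;       // deterministic iteration order
};

// Hypothetical helper: collapses repeated references into unique entries, so
// any per-symbol work done afterwards runs once per distinct symbol.
OrderedNameSet collectReferences(const std::vector<std::string_view> &refs) {
  OrderedNameSet unique;
  for (std::string_view name : refs) {
    assert(!name.empty() && "symbol names are assumed to be non-empty");
    unique.insert(name);
  }
  return unique;
}
```

With this shape, a script that mentions `S` ten times yields a single entry for `S`, at the cost of one hash lookup per reference. Whether that beats processing duplicates depends on how expensive the per-symbol work is and how often symbols actually repeat.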

https://github.com/llvm/llvm-project/pull/84512

