[lld] [lld/COFF] Fix -start-lib / -end-lib more after reviews.llvm.org/D116434 (PR #124294)

Nico Weber via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 24 07:38:30 PST 2025


https://github.com/nico created https://github.com/llvm/llvm-project/pull/124294

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all undefineds (ELF and MachO do this, see #67445 and things linked from there), it's possible that:

1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.

For normal archives, `ArchiveFile::addMember()` has a `seen` check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this code change.

>From c3bc43cffa70869ebdebda8f8eeefc96475b7262 Mon Sep 17 00:00:00 2001
From: Nico Weber <thakis at chromium.org>
Date: Fri, 24 Jan 2025 10:30:45 -0500
Subject: [PATCH] [lld/COFF] Fix -start-lib / -end-lib more after
 reviews.llvm.org/D116434

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before
all undefineds (ELF and MachO do this, see #67445 and things linked
from there), it's possible that:

1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be
   added
3. b.obj contains an undefined that's in a part of a.obj that's not
   yet in the symbol table, causing a recursive load of a.obj,
   which adds the symbols in there twice, leading to duplicate symbol
   errors.

For normal archives, `ArchiveFile::addMember()` has a `seen` check
to prevent this. For start-lib lazy objects, we can just check if
the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports
do.)

Includes a test that caused duplicate symbol diagnostics before this
code change.
---
 lld/COFF/SymbolTable.cpp   |  2 ++
 lld/test/COFF/start-lib.ll | 69 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+)

diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 32ea4a5b2e1fc3..4ade347875fde3 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -56,6 +56,8 @@ static void forceLazy(Symbol *s) {
   }
   case Symbol::Kind::LazyObjectKind: {
     InputFile *file = cast<LazyObject>(s)->file;
+    if (!file->lazy)
+      return;
     file->lazy = false;
     file->symtab.ctx.driver.addFile(file);
     break;
diff --git a/lld/test/COFF/start-lib.ll b/lld/test/COFF/start-lib.ll
index a46147f21ccbb3..134cdc2a6e1df6 100644
--- a/lld/test/COFF/start-lib.ll
+++ b/lld/test/COFF/start-lib.ll
@@ -173,3 +173,72 @@ target triple = "x86_64-pc-windows-msvc"
 define void @baz() {
   ret void
 }
+
+
+; Check cycles between symbols in two /start-lib files.
+; If the links succeed and does not emit duplicate symbol diagnostics,
+; that's enough.
+
+; RUN: llc -filetype=obj %t.dir/main3.ll -o %t-main3.obj
+; RUN: llc -filetype=obj %t.dir/cycle1.ll -o %t-cycle1.obj
+; RUN: llc -filetype=obj %t.dir/cycle2.ll -o %t-cycle2.obj
+; RUN: opt -thinlto-bc %t.dir/main3.ll -o %t-main3.bc
+; RUN: opt -thinlto-bc %t.dir/cycle1.ll -o %t-cycle1.bc
+; RUN: opt -thinlto-bc %t.dir/cycle2.ll -o %t-cycle2.bc
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj %t-cycle1.obj %t-cycle2.obj
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib %t-main3.obj
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc %t-cycle1.bc %t-cycle2.bc
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib %t-main3.bc
+
+#--- main3.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo1()
+
+define void @main() {
+  call void () @foo1()
+  ret void
+}
+
+#--- cycle1.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @bar()
+
+define void @foo1() {
+  ; cycle1.ll pulls in cycle2.ll for bar(), and cycle2.ll then pulls in
+  ; cycle1.ll again for foo2().
+  call void () @bar()
+  ret void
+}
+
+define void @foo2() {
+  ret void
+}
+
+
+#--- cycle2.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo2()
+
+define void @bar() {
+  call void () @foo2()
+  ret void
+}



More information about the llvm-commits mailing list