[lld] [LLD] [COFF] Error out if new LTO objects are pulled in after the main LTO compilation (PR #71337)

via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 5 14:41:21 PST 2023


llvmbot wrote:


@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lto

Author: Martin Storsjö (mstorsjo)

<details>
<summary>Changes</summary>

Normally, this shouldn't happen. It can happen in exceptional circumstances, if the compiled output of a bitcode object file references symbols that weren't listed as undefined in the bitcode object file itself.

This can at least happen in the following cases:
- A custom SEH personality is set via asm()
- Compiler-generated calls to builtin helper functions, such as __chkstk, or __rt_sdiv on ARM

Both of these produce undefined references to symbols after compiling to a regular object file, that aren't visible on the level of the IR object file.
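
As a rough illustration of the second case, here is a minimal sketch, assuming a thumbv7 Windows target like the one used in the `lto-late-arm.ll` test in this PR. The function name `divide` is hypothetical. The source and its IR contain only an `sdiv` instruction; the reference to `__rt_sdiv` appears only when the backend lowers it to machine code, so the bitcode file's symbol table never lists it as undefined.

```cpp
// Hypothetical example function; not part of the PR.
// When built for a thumbv7 Windows target, the backend is expected to lower
// this division to a call to the __rt_sdiv helper (the case exercised by the
// lto-late-arm.ll test). Targets with a hardware divide instruction may emit
// no helper call at all.
int divide(int numerator, int denominator) {
  // In LLVM IR this is just: %div = sdiv i32 %numerator, %denominator
  return numerator / denominator;
}
```

On other targets the same function compiles to a plain divide instruction, which is why the situation depends on the target.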

This is only an issue if the referenced symbols are provided as LTO objects themselves; loading regular object files after the LTO compilation works fine.

Custom SEH personalities are rare, but one CRT startup file in mingw-w64 does this. The referenced personality function is usually provided via an import library, but for WinStore targets, a local dummy reimplementation in C is used, which can be an LTO object.

Generated calls to builtins are very common, but the builtins aren't usually provided as LTO objects (compiler-rt's builtins explicitly pass -fno-lto when building), and many of the builtins are provided as raw .S assembly files, which don't get built as LTO objects anyway, even if built with -flto.

If this unusual but possible situation is hit, error out cleanly with a clear message rather than crashing.
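
The guard can be sketched outside of lld as a standalone toy model. This is only an illustration of the flag-based check, not lld's actual code: `ToySymbolTable`, `ToyInputFile`, `FileKind`, and `runLateLoadScenario` are hypothetical names, and the real linker reports the problem through its own `error()` function rather than collecting strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for the linker's input file kinds (hypothetical names).
enum class FileKind { Object, Bitcode };

struct ToyInputFile {
  std::string name;
  FileKind kind;
};

class ToySymbolTable {
public:
  void addFile(const ToyInputFile &file) {
    if (file.kind == FileKind::Bitcode) {
      // A bitcode file arriving after LTO has run cannot be compiled any
      // more, so record an error instead of continuing silently.
      if (ltoCompilationDone)
        errors.push_back("LTO object file " + file.name +
                         " linked in after doing LTO compilation.");
      bitcodeFiles.push_back(file);
    } else {
      objectFiles.push_back(file);
    }
  }

  void compileBitcodeFiles() {
    // Set the flag before the early return, so the guard also fires when
    // there were no bitcode files at the time LTO ran.
    ltoCompilationDone = true;
    if (bitcodeFiles.empty())
      return;
    // A real linker would run the LTO backend here.
  }

  const std::vector<std::string> &getErrors() const { return errors; }

private:
  std::vector<ToyInputFile> objectFiles;
  std::vector<ToyInputFile> bitcodeFiles;
  std::vector<std::string> errors;
  bool ltoCompilationDone = false;
};

// Replays the scenario from the description: a regular object loaded late is
// fine, a bitcode object loaded late is reported.
std::vector<std::string> runLateLoadScenario() {
  ToySymbolTable symtab;
  symtab.addFile({"main.obj", FileKind::Bitcode});
  symtab.compileBitcodeFiles();
  symtab.addFile({"chkstk.obj", FileKind::Object});
  assert(symtab.getErrors().empty());
  symtab.addFile({"sdiv.lib(sdiv.obj)", FileKind::Bitcode});
  return symtab.getErrors();
}
```

In this sketch, as in the diff, the late file is still appended to the list after the error is recorded; the link fails because an error was reported, not because the file was skipped.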

---
Full diff: https://github.com/llvm/llvm-project/pull/71337.diff


4 Files Affected:

- (modified) lld/COFF/SymbolTable.cpp (+5) 
- (modified) lld/COFF/SymbolTable.h (+1) 
- (added) lld/test/COFF/lto-late-arm.ll (+39) 
- (added) lld/test/COFF/lto-late-personality.ll (+48) 


``````````diff
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index d9673e159769fbd..15c76461a84a92c 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -61,6 +61,10 @@ void SymbolTable::addFile(InputFile *file) {
     if (auto *f = dyn_cast<ObjFile>(file)) {
       ctx.objFileInstances.push_back(f);
     } else if (auto *f = dyn_cast<BitcodeFile>(file)) {
+      if (ltoCompilationDone) {
+        error("LTO object file " + toString(file) + " linked in after "
+              "doing LTO compilation.");
+      }
       ctx.bitcodeFileInstances.push_back(f);
     } else if (auto *f = dyn_cast<ImportFile>(file)) {
       ctx.importFileInstances.push_back(f);
@@ -876,6 +880,7 @@ Symbol *SymbolTable::addUndefined(StringRef name) {
 }
 
 void SymbolTable::compileBitcodeFiles() {
+  ltoCompilationDone = true;
   if (ctx.bitcodeFileInstances.empty())
     return;
 
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 511e60d1e3a0873..fc623c2840d401b 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -133,6 +133,7 @@ class SymbolTable {
 
   llvm::DenseMap<llvm::CachedHashStringRef, Symbol *> symMap;
   std::unique_ptr<BitcodeCompiler> lto;
+  bool ltoCompilationDone = false;
 
   COFFLinkerContext &ctx;
 };
diff --git a/lld/test/COFF/lto-late-arm.ll b/lld/test/COFF/lto-late-arm.ll
new file mode 100644
index 000000000000000..0e2f148ef74c6c8
--- /dev/null
+++ b/lld/test/COFF/lto-late-arm.ll
@@ -0,0 +1,39 @@
+; REQUIRES: arm
+
+;; A bitcode file can generate undefined references to symbols that weren't
+;; listed as undefined on the bitcode file itself, when lowering produces
+;; calls to e.g. builtin helper functions. If these functions are provided
+;; as LTO bitcode, the linker would hit an unhandled state. (In practice,
+;; compiler-rt builtins are always compiled with -fno-lto, so this shouldn't
+;; happen.)
+
+; RUN: rm -rf %t.dir
+; RUN: split-file %s %t.dir
+; RUN: llvm-as %t.dir/main.ll -o %t.main.obj
+; RUN: llvm-as %t.dir/sdiv.ll -o %t.sdiv.obj
+; RUN: llvm-ar rcs %t.sdiv.lib %t.sdiv.obj
+
+; RUN: env LLD_IN_TEST=1 not lld-link /entry:entry %t.main.obj %t.sdiv.lib /out:%t.exe /subsystem:console 2>&1 | FileCheck %s
+
+; CHECK: error: LTO object file lto-late-arm.ll.tmp.sdiv.lib(lto-late-arm.ll.tmp.sdiv.obj) linked in after doing LTO compilation.
+
+;--- main.ll
+target datalayout = "e-m:w-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7-w64-windows-gnu"
+
+@num = dso_local global i32 100
+
+define dso_local arm_aapcs_vfpcc i32 @entry(i32 %param) {
+entry:
+  %0 = load i32, ptr @num
+  %div = sdiv i32 %0, %param
+  ret i32 %div
+}
+;--- sdiv.ll
+target datalayout = "e-m:w-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7-w64-windows-gnu"
+
+define dso_local arm_aapcs_vfpcc void @__rt_sdiv() {
+entry:
+  ret void
+}
diff --git a/lld/test/COFF/lto-late-personality.ll b/lld/test/COFF/lto-late-personality.ll
new file mode 100644
index 000000000000000..f768f65f4521e40
--- /dev/null
+++ b/lld/test/COFF/lto-late-personality.ll
@@ -0,0 +1,48 @@
+; REQUIRES: x86
+
+;; A bitcode file can generate undefined references to symbols that weren't
+;; listed as undefined on the bitcode file itself, if there's a reference to
+;; an unexpected personality routine via asm(). If the personality function
+;; is provided as LTO bitcode, the linker would hit an unhandled state.
+
+; RUN: rm -rf %t.dir
+; RUN: split-file %s %t.dir
+; RUN: llvm-as %t.dir/main.ll -o %t.main.obj
+; RUN: llvm-as %t.dir/other.ll -o %t.other.obj
+; RUN: llvm-as %t.dir/personality.ll -o %t.personality.obj
+; RUN: llvm-ar rcs %t.personality.lib %t.personality.obj
+
+; RUN: env LLD_IN_TEST=1 not lld-link /entry:entry %t.main.obj %t.other.obj %t.personality.lib /out:%t.exe /subsystem:console /opt:lldlto=0 /debug:symtab 2>&1 | FileCheck %s
+
+; CHECK: error: LTO object file lto-late-personality.ll.tmp.personality.lib(lto-late-personality.ll.tmp.personality.obj) linked in after doing LTO compilation.
+
+;--- main.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define i32 @entry() {
+entry:
+  tail call void @other()
+  tail call void asm sideeffect ".seh_handler __C_specific_handler, @except\0A", "~{dirflag},~{fpsr},~{flags}"()
+  ret i32 0
+}
+
+declare dso_local void @other()
+
+;--- other.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define dso_local void @other() {
+entry:
+  ret void
+}
+
+;--- personality.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define void @__C_specific_handler() {
+entry:
+  ret void
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/71337

