[clang] b2f7b5d - [CodeGen] Support bitcode input containing multiple modules
Fangrui Song via cfe-commits
cfe-commits at lists.llvm.org
Fri Jul 21 20:05:40 PDT 2023
Author: Fangrui Song
Date: 2023-07-21T20:05:35-07:00
New Revision: b2f7b5dbaefe4f2e3f8f279735ea3509a796693f
URL: https://github.com/llvm/llvm-project/commit/b2f7b5dbaefe4f2e3f8f279735ea3509a796693f
DIFF: https://github.com/llvm/llvm-project/commit/b2f7b5dbaefe4f2e3f8f279735ea3509a796693f.diff
LOG: [CodeGen] Support bitcode input containing multiple modules
When using -fsplit-lto-unit (explicitly specified or due to using
-fsanitize=cfi/-fwhole-program-vtables), the emitted LLVM IR contains a module
flag metadata `"EnableSplitLTOUnit"`. If a module contains both type metadata
and `"EnableSplitLTOUnit"`, `ThinLTOBitcodeWriter.cpp` will write two modules
into the bitcode file. Compiling the bitcode (not ThinLTO backend compilation)
will lead to an error due to `parseIR` requiring a single module.
```
% clang -flto=thin a.cc -c -o a.bc
% clang -c a.bc
% clang -fsplit-lto-unit -flto=thin a.cc -c -o a.bc
% clang -c a.bc
error: Expected a single module
1 error generated.
```
There are multiple ways to get just one module in the bitcode output:
passing `-Xclang -fno-lto-unit`, not using features like `-fsanitize=cfi`,
or using `-fsanitize=cfi` with `-fno-split-lto-unit`. I think whether a
bitcode input file contains 2 modules (an internal implementation strategy)
should not be a criterion for requiring an additional driver option when
the user seeks a non-LTO compile action.
Let's place the extra module (if present) into CodeGenOptions::LinkBitcodeFiles
(originally for -cc1 -mlink-bitcode-file). Linker::linkModules will link the two
modules together. This patch makes the following commands work:
```
clang -S -emit-llvm a.bc
clang -S a.bc
clang -c a.bc
```
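The "keep the first module, queue the rest for linking" step described above can be sketched outside of LLVM as follows. This is only an illustrative model, not the code from the patch: `ToyModule`, `ToyLinkEntry`, and `takeFirstQueueRest` are hypothetical names standing in for `llvm::Module`, an entry of the link-module list, and the loop added to `CodeGenAction::loadModule`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Module; only a name is modelled.
struct ToyModule {
  std::string Name;
};

// Hypothetical stand-in for an entry of the action's link-module list.
struct ToyLinkEntry {
  std::unique_ptr<ToyModule> Module;
  bool PropagateAttrs;
  bool Internalize;
};

// Returns the first parsed module and appends every later module to
// LinkModules, so that a later link step can merge them into the first.
// Returns nullptr when Parsed is empty.
std::unique_ptr<ToyModule>
takeFirstQueueRest(std::vector<std::unique_ptr<ToyModule>> Parsed,
                   std::vector<ToyLinkEntry> &LinkModules) {
  std::unique_ptr<ToyModule> First;
  for (auto &M : Parsed) {
    if (First)
      LinkModules.push_back({std::move(M), /*PropagateAttrs=*/false,
                             /*Internalize=*/false});
    else
      First = std::move(M);
  }
  return First;
}
```

In this model, a two-module input yields the first module as the main module and leaves exactly one entry in the link list, mirroring the -fsplit-lto-unit case where the only extra module is the regular LTO one.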
Reviewed By: ormris
Differential Revision: https://reviews.llvm.org/D154923
Added:
clang/test/CodeGen/split-lto-unit-input.cpp
Modified:
clang/lib/CodeGen/CodeGenAction.cpp
Removed:
################################################################################
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index 4879bcd6a42a52..a3b72381d73fc5 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -1112,21 +1112,21 @@ CodeGenAction::loadModule(MemoryBufferRef MBRef) {
CompilerInstance &CI = getCompilerInstance();
SourceManager &SM = CI.getSourceManager();
+ auto DiagErrors = [&](Error E) -> std::unique_ptr<llvm::Module> {
+ unsigned DiagID =
+ CI.getDiagnostics().getCustomDiagID(DiagnosticsEngine::Error, "%0");
+ handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {
+ CI.getDiagnostics().Report(DiagID) << EIB.message();
+ });
+ return {};
+ };
+
// For ThinLTO backend invocations, ensure that the context
// merges types based on ODR identifiers. We also need to read
// the correct module out of a multi-module bitcode file.
if (!CI.getCodeGenOpts().ThinLTOIndexFile.empty()) {
VMContext->enableDebugTypeODRUniquing();
- auto DiagErrors = [&](Error E) -> std::unique_ptr<llvm::Module> {
- unsigned DiagID =
- CI.getDiagnostics().getCustomDiagID(DiagnosticsEngine::Error, "%0");
- handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {
- CI.getDiagnostics().Report(DiagID) << EIB.message();
- });
- return {};
- };
-
Expected<std::vector<BitcodeModule>> BMsOrErr = getBitcodeModuleList(MBRef);
if (!BMsOrErr)
return DiagErrors(BMsOrErr.takeError());
@@ -1151,10 +1151,35 @@ CodeGenAction::loadModule(MemoryBufferRef MBRef) {
if (loadLinkModules(CI))
return nullptr;
+ // Handle textual IR and bitcode file with one single module.
llvm::SMDiagnostic Err;
if (std::unique_ptr<llvm::Module> M = parseIR(MBRef, Err, *VMContext))
return M;
+ // If MBRef is a bitcode with multiple modules (e.g., -fsplit-lto-unit
+ // output), place the extra modules (actually only one, a regular LTO module)
+ // into LinkModules as if we are using -mlink-bitcode-file.
+ Expected<std::vector<BitcodeModule>> BMsOrErr = getBitcodeModuleList(MBRef);
+ if (BMsOrErr && BMsOrErr->size()) {
+ std::unique_ptr<llvm::Module> FirstM;
+ for (auto &BM : *BMsOrErr) {
+ Expected<std::unique_ptr<llvm::Module>> MOrErr =
+ BM.parseModule(*VMContext);
+ if (!MOrErr)
+ return DiagErrors(MOrErr.takeError());
+ if (FirstM)
+ LinkModules.push_back({std::move(*MOrErr), /*PropagateAttrs=*/false,
+ /*Internalize=*/false, /*LinkFlags=*/{}});
+ else
+ FirstM = std::move(*MOrErr);
+ }
+ if (FirstM)
+ return FirstM;
+ }
+ // If BMsOrErr fails, consume the error and use the error message from
+ // parseIR.
+ consumeError(BMsOrErr.takeError());
+
// Translate from the diagnostic info to the SourceManager location if
// available.
// TODO: Unify this with ConvertBackendLocation()
diff --git a/clang/test/CodeGen/split-lto-unit-input.cpp b/clang/test/CodeGen/split-lto-unit-input.cpp
new file mode 100644
index 00000000000000..adfc9ac3e0f446
--- /dev/null
+++ b/clang/test/CodeGen/split-lto-unit-input.cpp
@@ -0,0 +1,32 @@
+// REQUIRES: x86-registered-target
+/// When the input is a -fsplit-lto-unit bitcode file, link the regular LTO file like -mlink-bitcode-file.
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm-bc -flto=thin -flto-unit -fsplit-lto-unit %s -o %t.bc
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-obj %t.bc -o %t.o
+// RUN: llvm-nm %t.o | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm %t.bc -o - | FileCheck %s --check-prefix=CHECK-IR
+
+// CHECK: V _ZTI1A
+// CHECK-NEXT: V _ZTI1B
+// CHECK-NEXT: V _ZTS1A
+// CHECK-NEXT: V _ZTS1B
+// CHECK-NEXT: V _ZTV1A
+// CHECK-NEXT: V _ZTV1B
+
+// CHECK-IR-DAG: _ZTS1B = linkonce_odr constant
+// CHECK-IR-DAG: _ZTS1A = linkonce_odr constant
+// CHECK-IR-DAG: _ZTV1B = linkonce_odr unnamed_addr constant
+// CHECK-IR-DAG: _ZTI1A = linkonce_odr constant
+// CHECK-IR-DAG: _ZTI1B = linkonce_odr constant
+// CHECK-IR-DAG: _ZTV1A = linkonce_odr unnamed_addr constant
+
+struct A {
+ virtual int c(int i) = 0;
+};
+
+struct B : A {
+ virtual int c(int i) { return i; }
+};
+
+int use() {
+ return (new B)->c(0);
+}