[clang] 87a4215 - [Clang] Always verify LLVM IR inputs (#134396)
via cfe-commits
cfe-commits at lists.llvm.org
Mon Apr 7 00:18:51 PDT 2025
Author: Nikita Popov
Date: 2025-04-07T09:18:47+02:00
New Revision: 87a4215ed154e867683b10c8d7fe1dbc79d81abb
URL: https://github.com/llvm/llvm-project/commit/87a4215ed154e867683b10c8d7fe1dbc79d81abb
DIFF: https://github.com/llvm/llvm-project/commit/87a4215ed154e867683b10c8d7fe1dbc79d81abb.diff
LOG: [Clang] Always verify LLVM IR inputs (#134396)
We get a lot of issues that basically boil down to "I passed malformed
LLVM IR to clang and it crashed". Clang does not perform IR verification
by default in (non-assertion-enabled) release builds, and that's
sensible for IR that Clang itself produces, which is expected to always
be valid. However, if people pass in their own handwritten IR, we should
report if it is malformed, instead of crashing. We should also report it
in a way that does not produce a crash trace and ask for a bug report,
as currently happens in assertions-enabled builds. This aligns the
behavior with how opt/llc work.
Added:
clang/test/CodeGen/invalid_llvm_ir.ll
Modified:
clang/include/clang/Basic/DiagnosticFrontendKinds.td
clang/lib/CodeGen/CodeGenAction.cpp
Removed:
################################################################################
diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 5f64b1cbfac87..6c72775197823 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -379,6 +379,8 @@ def err_ast_action_on_llvm_ir : Error<
   "cannot apply AST actions to LLVM IR file '%0'">,
   DefaultFatal;
+def err_invalid_llvm_ir : Error<"invalid LLVM IR input: %0">;
+
 def err_os_unsupport_riscv_fmv : Error<
   "function multiversioning is currently only supported on Linux">;
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index 4321efd49af36..1f5eb427b566f 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -39,6 +39,7 @@
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/LLVMRemarkStreamer.h"
 #include "llvm/IR/Module.h"
+#include "llvm/IR/Verifier.h"
 #include "llvm/IRReader/IRReader.h"
 #include "llvm/LTO/LTOBackend.h"
 #include "llvm/Linker/Linker.h"
@@ -1048,8 +1049,17 @@ CodeGenAction::loadModule(MemoryBufferRef MBRef) {
   // Handle textual IR and bitcode file with one single module.
   llvm::SMDiagnostic Err;
-  if (std::unique_ptr<llvm::Module> M = parseIR(MBRef, Err, *VMContext))
+  if (std::unique_ptr<llvm::Module> M = parseIR(MBRef, Err, *VMContext)) {
+    // For LLVM IR files, always verify the input and report the error in a way
+    // that does not ask people to report an issue for it.
+    std::string VerifierErr;
+    raw_string_ostream VerifierErrStream(VerifierErr);
+    if (llvm::verifyModule(*M, &VerifierErrStream)) {
+      CI.getDiagnostics().Report(diag::err_invalid_llvm_ir) << VerifierErr;
+      return {};
+    }
     return M;
+  }
   // If MBRef is a bitcode with multiple modules (e.g., -fsplit-lto-unit
   // output), place the extra modules (actually only one, a regular LTO module)
diff --git a/clang/test/CodeGen/invalid_llvm_ir.ll b/clang/test/CodeGen/invalid_llvm_ir.ll
new file mode 100644
index 0000000000000..97a6802bc105e
--- /dev/null
+++ b/clang/test/CodeGen/invalid_llvm_ir.ll
@@ -0,0 +1,12 @@
+; RUN: not %clang %s 2>&1 | FileCheck %s
+; RUN: llvm-as -disable-verify < %s > %t.bc
+; RUN: not %clang %t.bc 2>&1 | FileCheck %s
+
+; CHECK: error: invalid LLVM IR input: PHINode should have one entry for each predecessor of its parent basic block!
+; CHECK-NEXT: %phi = phi i32 [ 0, %entry ]
+
+define void @test() {
+entry:
+  %phi = phi i32 [ 0, %entry ]
+  ret void
+}
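
The test above relies on one verifier rule: a PHI node needs exactly one incoming entry per predecessor of its parent block. In @test, the entry block has no predecessors, yet %phi lists %entry as an incoming block, so verification fails. The following is a toy model of that rule in plain C++, not LLVM's Verifier code. ToyPhi, ToyBlock, and checkPhis are hypothetical names invented for this sketch, and the message text is made up rather than copied from LLVM.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for IR concepts. These are NOT LLVM classes.
struct ToyPhi {
  std::string Name;
  // One label per [ value, %label ] pair of the PHI.
  std::vector<std::string> IncomingBlocks;
};

struct ToyBlock {
  std::string Label;
  std::vector<ToyPhi> Phis;
  // Labels of the blocks this block's terminator can branch to.
  std::vector<std::string> Successors;
};

// Returns one message per PHI whose incoming blocks do not match the
// predecessors of its parent block. An empty result means the rule holds.
std::vector<std::string> checkPhis(const std::vector<ToyBlock> &Fn) {
  // Derive each block's predecessor list from the successor edges.
  std::map<std::string, std::vector<std::string>> Preds;
  for (const ToyBlock &BB : Fn)
    for (const std::string &Succ : BB.Successors)
      Preds[Succ].push_back(BB.Label);

  std::vector<std::string> Errors;
  for (const ToyBlock &BB : Fn) {
    // Blocks without incoming edges get an empty predecessor list.
    std::vector<std::string> Expected = Preds[BB.Label];
    std::sort(Expected.begin(), Expected.end());
    for (const ToyPhi &Phi : BB.Phis) {
      std::vector<std::string> Actual = Phi.IncomingBlocks;
      std::sort(Actual.begin(), Actual.end());
      // Compare as sorted lists so duplicate edges are counted too.
      if (Actual != Expected)
        Errors.push_back("PHI %" + Phi.Name + " in block %" + BB.Label +
                         " does not match the block's predecessors");
    }
  }
  return Errors;
}
```

Modeling @test as a single block labeled entry with no successors and one PHI whose only incoming block is entry makes checkPhis return one message, which mirrors why the new diagnostic fires for the test input.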