[clang] [clang-rel] Clone the llvm::Modules to avoid invalid memory access. (PR #89031)

Vassil Vassilev via cfe-commits cfe-commits at lists.llvm.org
Wed Apr 17 00:33:26 PDT 2024


https://github.com/vgvassilev created https://github.com/llvm/llvm-project/pull/89031

Clang's CodeGen is designed to work with a single llvm::Module. In many cases for convenience various CodeGen parts have a reference to the llvm::Module (TheModule or Module) which does not change when a new module is pushed. However, the execution engine wants to take ownership of the module which does not map well to CodeGen's design. To work this around we clone the module and pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for its current module and that would have an overall positive impact on CodeGen improving the encapsulation of various parts but that's not resilient to future regression.

This patch takes a more conservative approach and clones the llvm::Module before passing it to the Jit. That's also not bullet proof because we have to guarantee that CodeGen does not write on the blueprint. At that stage that seems more consistent to what clang-repl already does to map each partial translation unit to a new Module.

This change will fixes a long-standing invalid memory access reported by valgrind when we enable the TBAA optimization passes. It also unblock progress on llvm/llvm-project#84758.

>From 4805b4b11fc57bb7523b620c0a770d082e1dde2a Mon Sep 17 00:00:00 2001
From: Vassil Vassilev <v.g.vassilev at gmail.com>
Date: Wed, 17 Apr 2024 07:12:40 +0000
Subject: [PATCH] [clang-rel] Clone the llvm::Modules to avoid invalid memory
 access.

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and clones the llvm::Module before
passing it to the Jit. That's also not bullet proof because we have to guarantee
that CodeGen does not write on the blueprint. At that stage that seems more
consistent to what clang-repl already does to map each partial translation unit
to a new Module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm/llvm-project#84758.
---
 clang/lib/Interpreter/IncrementalExecutor.cpp | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Interpreter/IncrementalExecutor.cpp b/clang/lib/Interpreter/IncrementalExecutor.cpp
index 6f036107c14a9c..e87f43f077f379 100644
--- a/clang/lib/Interpreter/IncrementalExecutor.cpp
+++ b/clang/lib/Interpreter/IncrementalExecutor.cpp
@@ -28,6 +28,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/Support/ManagedStatic.h"
 #include "llvm/Support/TargetSelect.h"
+#include "llvm/Transforms/Utils/Cloning.h"
 
 // Force linking some of the runtimes that helps attaching to a debugger.
 LLVM_ATTRIBUTE_USED void linkComponents() {
@@ -73,7 +74,15 @@ llvm::Error IncrementalExecutor::addModule(PartialTranslationUnit &PTU) {
       Jit->getMainJITDylib().createResourceTracker();
   ResourceTrackers[&PTU] = RT;
 
-  return Jit->addIRModule(RT, {std::move(PTU.TheModule), TSCtx});
+  // Clang's CodeGen is designed to work with a single llvm::Module. In many
+  // cases for convenience various CodeGen parts have a reference to the
+  // llvm::Module (TheModule or Module) which does not change when a new module
+  // is pushed. However, the execution engine wants to take ownership of the
+  // module which does not map well to CodeGen's design. To work this around
+  // we clone the module and pass it down.
+  std::unique_ptr<llvm::Module> ModuleClone = llvm::CloneModule(*PTU.TheModule);
+
+  return Jit->addIRModule(RT, {std::move(ModuleClone), TSCtx});
 }
 
 llvm::Error IncrementalExecutor::removeModule(PartialTranslationUnit &PTU) {



More information about the cfe-commits mailing list