[Mlir-commits] [mlir] [mlir][bytecode] Fix SourceMgr lifetime extension in BytecodeReader (PR #185321)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Sun Mar 8 14:12:31 PDT 2026


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-mlir

Author: Denis Gurchenkov (dgurchenkov)

<details>
<summary>Changes</summary>

## Problem

`BytecodeReader::Impl` stored `bufferOwnerRef` as a **reference** to a `shared_ptr<SourceMgr>` rather than by value:

```cpp
// Before (bug):
const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
```

A reference to a `shared_ptr` does **not** increment the reference count, so `Impl` never participated in shared ownership of the `SourceMgr`. The comment above the field ("may be used to extend the lifetime") described the intent, but the reference semantics made it a no-op.

This causes a use-after-free during lazy loading: the caller's `shared_ptr` can go out of scope after constructing the `BytecodeReader`, dropping the refcount to zero and freeing the `SourceMgr` and its backing buffer, while `Impl` still holds a dangling reference to it. Any subsequent lazy materialization then accesses freed memory.

## Fix

Store `bufferOwnerRef` by value so `Impl` is a co-owner of the `SourceMgr`:

```cpp
// After (fix):
const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
```

The copy in `Impl`'s constructor increments the refcount, keeping the `SourceMgr` alive for the lifetime of the reader.

## Test

Added `Bytecode.SourceMgrLifetimeExtendedByReader` to `mlir/unittests/Bytecode/BytecodeTest.cpp`. The test constructs a `BytecodeReader` inside a scope that destroys the `SourceMgr` immediately after construction, then calls `readTopLevel` outside that scope. Without the fix this crashes with a segfault (UAF); with the fix all 5 bytecode unit tests pass.

## Note on `ParsedResourceEntry`

`ParsedResourceEntry` also stores `bufferOwnerRef` as a `const shared_ptr &`, but this is intentional: entries are stack-local objects that live only for a single loop iteration inside `parseResourceGroup`. The only escape — a blob deleter lambda in `parseAsBlob` — already captures the `shared_ptr` by value (copy), so lifetime extension works correctly there.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
Full diff: https://github.com/llvm/llvm-project/pull/185321.diff


3 Files Affected:

- (modified) mlir/lib/Bytecode/Reader/BytecodeReader.cpp (+1-1) 
- (modified) mlir/unittests/Bytecode/BytecodeTest.cpp (+34) 
- (modified) mlir/unittests/Bytecode/CMakeLists.txt (+1) 


``````````diff
diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index c974603bb9235..9708dec665046 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -1907,7 +1907,7 @@ class mlir::BytecodeReader::Impl {
 
   /// The optional owning source manager, which when present may be used to
   /// extend the lifetime of the input buffer.
-  const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
+  const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
 };
 
 LogicalResult BytecodeReader::Impl::read(
diff --git a/mlir/unittests/Bytecode/BytecodeTest.cpp b/mlir/unittests/Bytecode/BytecodeTest.cpp
index 148868801da33..25af104f843c7 100644
--- a/mlir/unittests/Bytecode/BytecodeTest.cpp
+++ b/mlir/unittests/Bytecode/BytecodeTest.cpp
@@ -19,7 +19,9 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Alignment.h"
 #include "llvm/Support/Endian.h"
+#include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/MemoryBufferRef.h"
+#include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/raw_ostream.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
@@ -231,6 +233,38 @@ TEST(Bytecode, OpWithoutProperties) {
               OperationEquivalence::computeHash(roundtripped));
 }
 
+TEST(Bytecode, SourceMgrLifetimeExtendedByReader) {
+  MLIRContext context;
+
+  // Create a trivial module and serialize it to bytecode.
+  OwningOpRef<ModuleOp> module =
+      ModuleOp::create(UnknownLoc::get(&context));
+  std::string bytecode;
+  llvm::raw_string_ostream os(bytecode);
+  ASSERT_TRUE(succeeded(writeBytecodeToFile(module.get(), os)));
+
+  // Build a BytecodeReader whose SourceMgr goes out of scope immediately after
+  // construction. The reader must keep it alive via shared ownership, otherwise
+  // readTopLevel will access freed memory (UAF).
+  ParserConfig config(&context);
+  std::shared_ptr<BytecodeReader> reader;
+  {
+    auto sourceMgr = std::make_shared<llvm::SourceMgr>();
+    auto buffer = llvm::MemoryBuffer::getMemBufferCopy(bytecode, "model");
+    sourceMgr->AddNewSourceBuffer(std::move(buffer), llvm::SMLoc());
+    llvm::MemoryBufferRef bufRef =
+        *sourceMgr->getMemoryBuffer(sourceMgr->getMainFileID());
+
+    reader = std::make_shared<BytecodeReader>(
+        bufRef, config, /*lazyLoad=*/true, sourceMgr);
+    // sourceMgr destroyed here — reader must have extended its lifetime.
+  }
+
+  Block block;
+  EXPECT_TRUE(succeeded(
+      reader->readTopLevel(&block, [](Operation *) { return true; })));
+}
+
 TEST(Bytecode, DeepCallSiteLoc) {
   MLIRContext context;
   ParserConfig config(&context);
diff --git a/mlir/unittests/Bytecode/CMakeLists.txt b/mlir/unittests/Bytecode/CMakeLists.txt
index 92bb5fe3c158d..d6b559a3aa908 100644
--- a/mlir/unittests/Bytecode/CMakeLists.txt
+++ b/mlir/unittests/Bytecode/CMakeLists.txt
@@ -5,5 +5,6 @@ mlir_target_link_libraries(MLIRBytecodeTests
   PRIVATE
   MLIRBytecodeReader
   MLIRBytecodeWriter
+  MLIRBytecodeOpInterface
   MLIRParser
 )

``````````

</details>


https://github.com/llvm/llvm-project/pull/185321


More information about the Mlir-commits mailing list