[Mlir-commits] [mlir] [mlir][bytecode] Fix SourceMgr lifetime extension in BytecodeReader (PR #185321)

Denis Gurchenkov llvmlistbot at llvm.org
Sun Mar 8 14:11:45 PDT 2026


https://github.com/dgurchenkov created https://github.com/llvm/llvm-project/pull/185321

## Problem

`BytecodeReader::Impl` stored `bufferOwnerRef` as a **reference** to a `shared_ptr<SourceMgr>` rather than by value:

```cpp
// Before (bug):
const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
```

A reference to a `shared_ptr` does **not** increment the reference count, so `Impl` never participated in shared ownership of the `SourceMgr`. The comment above the field ("may be used to extend the lifetime") described the intent, but the reference semantics made it a no-op.

This causes a use-after-free during lazy loading: the caller's `shared_ptr` can go out of scope after constructing the `BytecodeReader`, dropping the refcount to zero and freeing the `SourceMgr` and its backing buffer, while `Impl` still holds a dangling reference to it. Any subsequent lazy materialization then accesses freed memory.

## Fix

Store `bufferOwnerRef` by value so `Impl` is a co-owner of the `SourceMgr`:

```cpp
// After (fix):
const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
```

The copy in `Impl`'s constructor increments the refcount, keeping the `SourceMgr` alive for the lifetime of the reader.

## Test

Added `Bytecode.SourceMgrLifetimeExtendedByReader` to `mlir/unittests/Bytecode/BytecodeTest.cpp`. The test constructs a `BytecodeReader` inside a scope that destroys the `SourceMgr` immediately after construction, then calls `readTopLevel` outside that scope. Without the fix this crashes with a segfault (UAF); with the fix all 5 bytecode unit tests pass.

## Note on `ParsedResourceEntry`

`ParsedResourceEntry` also stores `bufferOwnerRef` as a `const shared_ptr &`, but this is intentional: entries are stack-local objects that live only for a single loop iteration inside `parseResourceGroup`. The only escape — a blob deleter lambda in `parseAsBlob` — already captures the `shared_ptr` by value (copy), so lifetime extension works correctly there.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

>From 3796b834dac5f1bb11dd50ec9badf88f64549e97 Mon Sep 17 00:00:00 2001
From: Denis Gurchenkov <denis at modular.com>
Date: Sun, 8 Mar 2026 14:35:02 -0400
Subject: [PATCH 1/2] [mlir][bytecode] Fix SourceMgr lifetime extension in
 BytecodeReader

BytecodeReader::Impl stored bufferOwnerRef as a const reference to a
shared_ptr rather than by value, so it never incremented the reference
count. This caused the SourceMgr (and its backing buffer) to be
destroyed at the end of the caller's scope while Impl still held a
dangling reference to it, leading to use-after-free during lazy loading.

Fix by storing bufferOwnerRef by value so Impl participates in shared
ownership and keeps the SourceMgr alive for its own lifetime.

Co-Authored-By: Claude Sonnet 4.6 <noreply at anthropic.com>
---
 mlir/lib/Bytecode/Reader/BytecodeReader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index c974603bb9235..9708dec665046 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -1907,7 +1907,7 @@ class mlir::BytecodeReader::Impl {
 
   /// The optional owning source manager, which when present may be used to
   /// extend the lifetime of the input buffer.
-  const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
+  const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
 };
 
 LogicalResult BytecodeReader::Impl::read(

>From ad81e15c7782ace7688a0b4392669fb76e62247f Mon Sep 17 00:00:00 2001
From: Denis Gurchenkov <denis at modular.com>
Date: Sun, 8 Mar 2026 17:10:55 -0400
Subject: [PATCH 2/2] [mlir][bytecode] Add test for SourceMgr lifetime
 extension in BytecodeReader

Add a unit test that verifies BytecodeReader keeps the SourceMgr alive
after the caller's shared_ptr goes out of scope, which is required for
correct lazy loading behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply at anthropic.com>
---
 mlir/unittests/Bytecode/BytecodeTest.cpp | 34 ++++++++++++++++++++++++
 mlir/unittests/Bytecode/CMakeLists.txt   |  1 +
 2 files changed, 35 insertions(+)

diff --git a/mlir/unittests/Bytecode/BytecodeTest.cpp b/mlir/unittests/Bytecode/BytecodeTest.cpp
index 148868801da33..25af104f843c7 100644
--- a/mlir/unittests/Bytecode/BytecodeTest.cpp
+++ b/mlir/unittests/Bytecode/BytecodeTest.cpp
@@ -19,7 +19,9 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Alignment.h"
 #include "llvm/Support/Endian.h"
+#include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/MemoryBufferRef.h"
+#include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/raw_ostream.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
@@ -231,6 +233,38 @@ TEST(Bytecode, OpWithoutProperties) {
               OperationEquivalence::computeHash(roundtripped));
 }
 
+TEST(Bytecode, SourceMgrLifetimeExtendedByReader) {
+  MLIRContext context;
+
+  // Create a trivial module and serialize it to bytecode.
+  OwningOpRef<ModuleOp> module =
+      ModuleOp::create(UnknownLoc::get(&context));
+  std::string bytecode;
+  llvm::raw_string_ostream os(bytecode);
+  ASSERT_TRUE(succeeded(writeBytecodeToFile(module.get(), os)));
+
+  // Build a BytecodeReader whose SourceMgr goes out of scope immediately after
+  // construction. The reader must keep it alive via shared ownership, otherwise
+  // readTopLevel will access freed memory (UAF).
+  ParserConfig config(&context);
+  std::shared_ptr<BytecodeReader> reader;
+  {
+    auto sourceMgr = std::make_shared<llvm::SourceMgr>();
+    auto buffer = llvm::MemoryBuffer::getMemBufferCopy(bytecode, "model");
+    sourceMgr->AddNewSourceBuffer(std::move(buffer), llvm::SMLoc());
+    llvm::MemoryBufferRef bufRef =
+        *sourceMgr->getMemoryBuffer(sourceMgr->getMainFileID());
+
+    reader = std::make_shared<BytecodeReader>(
+        bufRef, config, /*lazyLoad=*/true, sourceMgr);
+    // sourceMgr destroyed here — reader must have extended its lifetime.
+  }
+
+  Block block;
+  EXPECT_TRUE(succeeded(
+      reader->readTopLevel(&block, [](Operation *) { return true; })));
+}
+
 TEST(Bytecode, DeepCallSiteLoc) {
   MLIRContext context;
   ParserConfig config(&context);
diff --git a/mlir/unittests/Bytecode/CMakeLists.txt b/mlir/unittests/Bytecode/CMakeLists.txt
index 92bb5fe3c158d..d6b559a3aa908 100644
--- a/mlir/unittests/Bytecode/CMakeLists.txt
+++ b/mlir/unittests/Bytecode/CMakeLists.txt
@@ -5,5 +5,6 @@ mlir_target_link_libraries(MLIRBytecodeTests
   PRIVATE
   MLIRBytecodeReader
   MLIRBytecodeWriter
+  MLIRBytecodeOpInterface
   MLIRParser
 )



More information about the Mlir-commits mailing list