[Mlir-commits] [mlir] [mlir][bytecode] Fix SourceMgr lifetime extension in BytecodeReader (PR #185321)
llvmlistbot at llvm.org
llvmlistbot at llvm.org
Sun Mar 8 14:12:31 PDT 2026
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-mlir
Author: Denis Gurchenkov (dgurchenkov)
<details>
<summary>Changes</summary>
## Problem
`BytecodeReader::Impl` stored `bufferOwnerRef` as a **reference** to a `shared_ptr<SourceMgr>` rather than by value:
```cpp
// Before (bug):
const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
```
A reference to a `shared_ptr` does **not** increment the reference count, so `Impl` never participated in shared ownership of the `SourceMgr`. The comment above the field ("may be used to extend the lifetime") described the intent, but the reference semantics made it a no-op.
This causes a use-after-free during lazy loading: the caller's `shared_ptr` can go out of scope after constructing the `BytecodeReader`, dropping the refcount to zero and freeing the `SourceMgr` and its backing buffer, while `Impl` still holds a dangling reference to it. Any subsequent lazy materialization then accesses freed memory.
## Fix
Store `bufferOwnerRef` by value so `Impl` is a co-owner of the `SourceMgr`:
```cpp
// After (fix):
const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
```
The copy in `Impl`'s constructor increments the refcount, keeping the `SourceMgr` alive for the lifetime of the reader.
## Test
Added `Bytecode.SourceMgrLifetimeExtendedByReader` to `mlir/unittests/Bytecode/BytecodeTest.cpp`. The test constructs a `BytecodeReader` inside a scope that destroys the `SourceMgr` immediately after construction, then calls `readTopLevel` outside that scope. Without the fix this crashes with a segfault (UAF); with the fix all 5 bytecode unit tests pass.
## Note on `ParsedResourceEntry`
`ParsedResourceEntry` also stores `bufferOwnerRef` as a `const shared_ptr &`, but this is intentional: entries are stack-local objects that live only for a single loop iteration inside `parseResourceGroup`. The only escape — a blob deleter lambda in `parseAsBlob` — already captures the `shared_ptr` by value (copy), so lifetime extension works correctly there.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---
Full diff: https://github.com/llvm/llvm-project/pull/185321.diff
3 Files Affected:
- (modified) mlir/lib/Bytecode/Reader/BytecodeReader.cpp (+1-1)
- (modified) mlir/unittests/Bytecode/BytecodeTest.cpp (+34)
- (modified) mlir/unittests/Bytecode/CMakeLists.txt (+1)
``````````diff
diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index c974603bb9235..9708dec665046 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -1907,7 +1907,7 @@ class mlir::BytecodeReader::Impl {
/// The optional owning source manager, which when present may be used to
/// extend the lifetime of the input buffer.
- const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef;
+ const std::shared_ptr<llvm::SourceMgr> bufferOwnerRef;
};
LogicalResult BytecodeReader::Impl::read(
diff --git a/mlir/unittests/Bytecode/BytecodeTest.cpp b/mlir/unittests/Bytecode/BytecodeTest.cpp
index 148868801da33..25af104f843c7 100644
--- a/mlir/unittests/Bytecode/BytecodeTest.cpp
+++ b/mlir/unittests/Bytecode/BytecodeTest.cpp
@@ -19,7 +19,9 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Alignment.h"
#include "llvm/Support/Endian.h"
+#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/MemoryBufferRef.h"
+#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"
@@ -231,6 +233,38 @@ TEST(Bytecode, OpWithoutProperties) {
OperationEquivalence::computeHash(roundtripped));
}
+TEST(Bytecode, SourceMgrLifetimeExtendedByReader) {
+ MLIRContext context;
+
+ // Create a trivial module and serialize it to bytecode.
+ OwningOpRef<ModuleOp> module =
+ ModuleOp::create(UnknownLoc::get(&context));
+ std::string bytecode;
+ llvm::raw_string_ostream os(bytecode);
+ ASSERT_TRUE(succeeded(writeBytecodeToFile(module.get(), os)));
+
+ // Build a BytecodeReader whose SourceMgr goes out of scope immediately after
+ // construction. The reader must keep it alive via shared ownership, otherwise
+ // readTopLevel will access freed memory (UAF).
+ ParserConfig config(&context);
+ std::shared_ptr<BytecodeReader> reader;
+ {
+ auto sourceMgr = std::make_shared<llvm::SourceMgr>();
+ auto buffer = llvm::MemoryBuffer::getMemBufferCopy(bytecode, "model");
+ sourceMgr->AddNewSourceBuffer(std::move(buffer), llvm::SMLoc());
+ llvm::MemoryBufferRef bufRef =
+ *sourceMgr->getMemoryBuffer(sourceMgr->getMainFileID());
+
+ reader = std::make_shared<BytecodeReader>(
+ bufRef, config, /*lazyLoad=*/true, sourceMgr);
+ // sourceMgr destroyed here — reader must have extended its lifetime.
+ }
+
+ Block block;
+ EXPECT_TRUE(succeeded(
+ reader->readTopLevel(&block, [](Operation *) { return true; })));
+}
+
TEST(Bytecode, DeepCallSiteLoc) {
MLIRContext context;
ParserConfig config(&context);
diff --git a/mlir/unittests/Bytecode/CMakeLists.txt b/mlir/unittests/Bytecode/CMakeLists.txt
index 92bb5fe3c158d..d6b559a3aa908 100644
--- a/mlir/unittests/Bytecode/CMakeLists.txt
+++ b/mlir/unittests/Bytecode/CMakeLists.txt
@@ -5,5 +5,6 @@ mlir_target_link_libraries(MLIRBytecodeTests
PRIVATE
MLIRBytecodeReader
MLIRBytecodeWriter
+ MLIRBytecodeOpInterface
MLIRParser
)
``````````
</details>
https://github.com/llvm/llvm-project/pull/185321
More information about the Mlir-commits
mailing list