[PATCH] D22693: More strongly separate the PDB reading interfaces and PDB writing interfaces

Zachary Turner via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 22 11:51:34 PDT 2016


zturner created this revision.
zturner added reviewers: rnk, ruiu, amccarth.
zturner added a subscriber: llvm-commits.

This is a huge patch, so instead of asking you guys to look at it in detail, I'll summarize the highlights (I also have 7 other smaller patches leading up to this which are not in yet.  Point being that I tried as hard as I could to break it down into something smaller, but this is the best I could do).  Mostly looking for high level feedback.  Even this is long, but it's better than reading a 5,000 line patch :)

Problem: A separation started to occur between reading interfaces and writing interfaces.  E.g. `PDBFile`, `DbiStream`, `InfoStream`, etc were the reading interfaces, and `PDBFileBuilder`, `MsfBuilder`, `InfoStreamBuilder`, etc were the writing interfaces.  But due to the coupling of the reading interfaces and the underlying file buffer, the only way to write changes back out to disk was to do so through the reading interface.  So we had functions like `PDBFile::commit()`, and `DbiStream::commit()`, even though these were supposed to be the reading interfaces.  The way you would use the api is to create a `PdbFileBuilder`, fill out the fields, call `PdbFileBuilder::build()` on it and that would return you a `PDBFile`.  Then you would call `commit()` on it.

So the problem was to get these `commit()` methods over to the builders, so that the reading interfaces would be truly read only.

Solution: Until this patch, there was no clear distinction between interfaces which could read and interfaces which could write.  Everything could read and everything could write.  This causes a problem though, because your backing data is not always writable to begin with.  For example, when reading a PDB from disk and dumping it, we read it into a `MemoryBuffer`, which is immutable.  So we would have to use hacks like returning an `llvm::Error` if the backing store is not writable.  It's sort of analagous to if everything was a `MutableArrayRef<>`, but methods returned an error if the underlying memory was const.  Ideally you want the types to dictate what operations are possible.

With that in mind, I split all the stream based types into readable streams, writable streams, and read/write streams.  This is supported by the following fundamental types:

```
/// An abstract base class which defines an interface for reading data from an arbitrary backing store which may or may not be contiguous.
class ReadableStream

/// An abstract base class which defines an interface for writing data to an arbitrary backing store which may or may not be contiguous.
class WritableStream

/// A stream which is backed by a source which is both readable and writable.
/// This is mostly equivalent to what was formerly known as `StreamInterface`.
class ReadWriteStream : public ReadableStream, public WritableStream
```

These are the basis for everything.  From here there are concrete implementations that are backed by contiguous in-memory byte sequences (`ByteStream` and `MutableByteStream`) which are useful when you are building your own streams, or when you have a single buffer which contains code view type data, etc.

There are concrete implementations that are backed by discontiguous block streams in an Msf file (`MappedBlockStream` and `WritableMappedBlockStream`).

There are concrete implementations that are backed by disk files (`FileBufferByteStream`) that are useful when you want to commit to a disk.

But it's not always useful to just have an **entire** stream.  Sometimes you just want part of a stream.  For example, maybe starting at offset 364 in a big stream is a "substream".  We want some way to get a smaller view over a larger stream.  For this we have `ReadableStreamRef`, `WritableStreamRef`, and `ReadWriteStreamRef`.  `StreamRefs` are to Streams as `ArrayRefs` are to arrays.  They can be trivially copied and passed by value, and provide convenience methods for dropping items, slicing, etc.  But they are backed by `ReadableStream`, `WritableStream`, and `ReadWriteStream`, which again means they can refer to any kind of stream, whether it be a `ByteStream`, a `MappedBlockStream`, or any other kind of stream.

Finally, we have `StreamReader` and `StreamWriter`.  While the underlying stream only knows how to read and write byte sequences, `StreamReader` and `StreamWriter` know how to read *types*.  integers, objects, strings, etc.  They wrap `ReadableStreamRef` and `WritableStreamRef` respectively, and maintain an offset into the stream where the next read or write will occur at.

these abstractions largely mirror the abstractions of before this patch, with the exception that reading and writing responsibilities are enforced by the type system.

There is one breaking change to be aware of between this patch and the previous patch.  Previously, `MappedBlockStream` (which supported both reading and writing at the time) handled the case where you read across a block boundary by allocating from an internal pool.  And if you ever requested that offset again, it would return the value allocated from the pool.  Since it also supported writing, it had to handle the case where you overwrote some of the values that were previously cached from the pool.  Because otherwise, if someone were to try to read that value again, they would see it was cached and return the stale value from the cache.  So `MappedBlockStream` was smart enough to deal with this.

But now, `MappedBlockStream` is read-only and `WritableMappedBlockStream` is write-only, and just like if you make 2 `unique_ptrs` to the same data, bad things will happen.  The only use case I had found for this advanced cache coherency was in simplicity of writing tests, so it is now up to the user to make sure he/she coordinates reading and writing appropriately and explicitly clears the read cache after any write is performed on the same backing stream.

https://reviews.llvm.org/D22693

Files:
  include/llvm/DebugInfo/CodeView/CVRecord.h
  include/llvm/DebugInfo/CodeView/ModuleSubstream.h
  include/llvm/DebugInfo/CodeView/ModuleSubstreamVisitor.h
  include/llvm/DebugInfo/Msf/ByteStream.h
  include/llvm/DebugInfo/Msf/MappedBlockStream.h
  include/llvm/DebugInfo/Msf/MsfBuilder.h
  include/llvm/DebugInfo/Msf/MsfCommon.h
  include/llvm/DebugInfo/Msf/StreamArray.h
  include/llvm/DebugInfo/Msf/StreamInterface.h
  include/llvm/DebugInfo/Msf/StreamReader.h
  include/llvm/DebugInfo/Msf/StreamRef.h
  include/llvm/DebugInfo/Msf/StreamWriter.h
  include/llvm/DebugInfo/PDB/PDBTypes.h
  include/llvm/DebugInfo/PDB/Raw/DbiStream.h
  include/llvm/DebugInfo/PDB/Raw/DbiStreamBuilder.h
  include/llvm/DebugInfo/PDB/Raw/InfoStream.h
  include/llvm/DebugInfo/PDB/Raw/InfoStreamBuilder.h
  include/llvm/DebugInfo/PDB/Raw/ModInfo.h
  include/llvm/DebugInfo/PDB/Raw/ModStream.h
  include/llvm/DebugInfo/PDB/Raw/NameHashTable.h
  include/llvm/DebugInfo/PDB/Raw/NameMap.h
  include/llvm/DebugInfo/PDB/Raw/NameMapBuilder.h
  include/llvm/DebugInfo/PDB/Raw/PDBFile.h
  include/llvm/DebugInfo/PDB/Raw/PDBFileBuilder.h
  include/llvm/DebugInfo/PDB/Raw/RawTypes.h
  lib/DebugInfo/CodeView/ModuleSubstream.cpp
  lib/DebugInfo/CodeView/ModuleSubstreamVisitor.cpp
  lib/DebugInfo/CodeView/TypeDumper.cpp
  lib/DebugInfo/Msf/ByteStream.cpp
  lib/DebugInfo/Msf/CMakeLists.txt
  lib/DebugInfo/Msf/MappedBlockStream.cpp
  lib/DebugInfo/Msf/MsfBuilder.cpp
  lib/DebugInfo/Msf/StreamReader.cpp
  lib/DebugInfo/Msf/StreamWriter.cpp
  lib/DebugInfo/PDB/Raw/DbiStream.cpp
  lib/DebugInfo/PDB/Raw/DbiStreamBuilder.cpp
  lib/DebugInfo/PDB/Raw/InfoStream.cpp
  lib/DebugInfo/PDB/Raw/InfoStreamBuilder.cpp
  lib/DebugInfo/PDB/Raw/ModInfo.cpp
  lib/DebugInfo/PDB/Raw/ModStream.cpp
  lib/DebugInfo/PDB/Raw/NameMap.cpp
  lib/DebugInfo/PDB/Raw/NameMapBuilder.cpp
  lib/DebugInfo/PDB/Raw/PDBFile.cpp
  lib/DebugInfo/PDB/Raw/PDBFileBuilder.cpp
  lib/DebugInfo/PDB/Raw/RawSession.cpp
  lib/DebugInfo/PDB/Raw/TpiStream.cpp
  tools/llvm-pdbdump/LLVMOutputStyle.cpp
  tools/llvm-pdbdump/llvm-pdbdump.cpp
  tools/llvm-readobj/COFFDumper.cpp
  unittests/DebugInfo/PDB/MappedBlockStreamTest.cpp
  unittests/DebugInfo/PDB/MsfBuilderTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D22693.65108.patch
Type: text/x-patch
Size: 127691 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160722/3f6b7abc/attachment-0001.bin>


More information about the llvm-commits mailing list