[PATCH] D40518: [CodeView] Re-write TypeSerializer and TypeTableBuilder

Zachary Turner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 27 13:41:41 PST 2017


zturner created this revision.
Herald added subscribers: JDevlieghere, hiraditya, mgorny.

The motivation behind this patch is that future directions require us to be able to compute the hash value of records independently of actually using them for de-duplication.

In the current structure, `TypeSerializer` / `TypeTableBuilder` is a single entry point that takes an unserialized type record and then hashes and de-duplicates it. That is not flexible enough to allow this.

At the same time, the existing `TypeSerializer` is already extremely complex for this very reason -- it tries to be too many things.  In addition to serializing, hashing, and de-duplicating, it also supports splitting up field list records and adding continuations.  Cramming all of this functionality into one class makes it very complicated to work with and hard to maintain.

To solve all of these problems, I've re-written everything from scratch and split the functionality into separate pieces that can easily be reused.  The end result is that one class `TypeSerializer` is turned into 3 new classes `SimpleTypeSerializer`, `ContinuationRecordBuilder`, and `TypeTableBuilder`, each of which in isolation is simple and straightforward.

A quick summary of these new classes and their responsibilities:

- `SimpleTypeSerializer` : Turns a non-FieldList leaf type into a series of bytes.  Does not do any hashing.  Every time you call it, it will re-serialize and return bytes again.  The same instance can be re-used over and over to avoid re-allocations, and in exchange for this optimization the bytes returned by the serializer only live until the caller attempts to serialize a new record.
- `ContinuationRecordBuilder` : Turns a FieldList-like record into a series of fragments.  Does not do any hashing.  Like `SimpleTypeSerializer`, returns references to privately owned bytes, so the storage is invalidated as soon as the caller tries to re-use the instance.  Works equally well for `LF_FIELDLIST` as it does for `LF_METHODLIST`, solving a long-standing theoretical limitation of the previous implementation.
- `TypeTableBuilder` : Accepts sequences of bytes that the user has already serialized, and inserts them by de-duplicating with a hash table.  For the sake of convenience and efficiency, this class internally stores a `SimpleTypeSerializer` so that it can accept unserialized records.  The same is not true of `ContinuationRecordBuilder`.  The user is required to create their own instance of `ContinuationRecordBuilder`.
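
To make the ownership rules above concrete, here is a minimal toy model of the split between serializing and de-duplicating.  It is not the code in this patch: `ToySerializer`, `ToyTypeTable`, `insertBytes`, and the record layout are all hypothetical names and shapes invented for illustration.  The serializer reuses one scratch buffer, so the returned bytes are only valid until the next call, and the table copies whatever it decides to keep.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serializer that reuses a single buffer.
class ToySerializer {
  std::vector<uint8_t> Scratch; // privately owned, overwritten on every call

public:
  // The returned reference is invalidated by the next call to serialize().
  const std::vector<uint8_t> &serialize(uint16_t Kind, const std::string &Name) {
    Scratch.clear();
    Scratch.push_back(static_cast<uint8_t>(Kind & 0xFF)); // kind, low byte
    Scratch.push_back(static_cast<uint8_t>(Kind >> 8));   // kind, high byte
    Scratch.insert(Scratch.end(), Name.begin(), Name.end());
    Scratch.push_back(0); // null terminator
    return Scratch;
  }
};

// Hypothetical stand-in for a table that de-duplicates serialized records.
class ToyTypeTable {
  std::unordered_map<std::string, uint32_t> IndexByBytes;
  std::vector<std::vector<uint8_t>> Records;

public:
  // Returns the index of an identical existing record, or stores a copy.
  uint32_t insertBytes(const std::vector<uint8_t> &Bytes) {
    std::string Key(Bytes.begin(), Bytes.end());
    auto It = IndexByBytes.find(Key);
    if (It != IndexByBytes.end())
      return It->second;
    uint32_t NewIndex = static_cast<uint32_t>(Records.size());
    Records.push_back(Bytes); // copy: the caller's buffer may be reused
    IndexByBytes.emplace(std::move(Key), NewIndex);
    return NewIndex;
  }

  std::size_t size() const { return Records.size(); }
};
```

Because the table copies on insert, a caller can feed it the result of `serialize` directly and then immediately reuse the serializer for the next record.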

Because of the separation of responsibilities, the proposed algorithm for content hashing now becomes straightforward to implement.

  auto Hashes = readHashStream(Obj);
  auto Types = readTypeStream(Obj);
  // Insert each record using the hash that was read alongside it.
  for (size_t I = 0; I < Hashes.size(); ++I) {
    TTB.insert_as(Types[I], Hashes[I]);
  }
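
For readers who want to see the shape of such an insertion, the following is a self-contained toy sketch of a table that accepts a caller-supplied hash.  It is an assumption-laden illustration, not the interface of `TypeTableBuilder`: `ToyHashedTable`, `insertAs`, and `insertAll` are hypothetical names, and the byte comparison on a hash match is a choice made here to tolerate collisions rather than something the patch is known to do.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical table keyed by hashes that the caller computed elsewhere.
class ToyHashedTable {
  std::unordered_multimap<uint64_t, uint32_t> IndexByHash;
  std::vector<Bytes> Records;

public:
  // Inserts Record under Hash without hashing the bytes itself.
  uint32_t insertAs(const Bytes &Record, uint64_t Hash) {
    auto Range = IndexByHash.equal_range(Hash);
    for (auto It = Range.first; It != Range.second; ++It)
      if (Records[It->second] == Record) // same hash and same bytes
        return It->second;
    uint32_t NewIndex = static_cast<uint32_t>(Records.size());
    Records.push_back(Record);
    IndexByHash.emplace(Hash, NewIndex);
    return NewIndex;
  }

  std::size_t size() const { return Records.size(); }
};

// Walks parallel arrays of records and hashes, as in the loop above.
std::vector<uint32_t> insertAll(ToyHashedTable &Table,
                                const std::vector<Bytes> &Types,
                                const std::vector<uint64_t> &Hashes) {
  std::vector<uint32_t> Indices;
  for (std::size_t I = 0; I < Hashes.size() && I < Types.size(); ++I)
    Indices.push_back(Table.insertAs(Types[I], Hashes[I]));
  return Indices;
}
```

The point of the sketch is only that the hash becomes an input to insertion rather than something computed inside it, which is what lets hashing happen independently of de-duplication.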


https://reviews.llvm.org/D40518

Files:
  llvm/include/llvm/DebugInfo/CodeView/ContinuationRecordBuilder.h
  llvm/include/llvm/DebugInfo/CodeView/SimpleTypeSerializer.h
  llvm/include/llvm/DebugInfo/CodeView/TypeSerializer.h
  llvm/include/llvm/DebugInfo/CodeView/TypeTableBuilder.h
  llvm/include/llvm/ObjectYAML/CodeViewYAMLTypes.h
  llvm/include/llvm/Support/BinaryByteStream.h
  llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
  llvm/lib/DebugInfo/CodeView/CMakeLists.txt
  llvm/lib/DebugInfo/CodeView/ContinuationRecordBuilder.cpp
  llvm/lib/DebugInfo/CodeView/SimpleTypeSerializer.cpp
  llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
  llvm/lib/DebugInfo/CodeView/TypeSerializer.cpp
  llvm/lib/DebugInfo/CodeView/TypeStreamMerger.cpp
  llvm/lib/DebugInfo/CodeView/TypeTableBuilder.cpp
  llvm/lib/DebugInfo/CodeView/TypeTableCollection.cpp
  llvm/lib/ObjectYAML/CodeViewYAMLTypes.cpp
  llvm/test/DebugInfo/COFF/big-type.ll
  llvm/tools/llvm-pdbutil/PdbYaml.cpp
  llvm/tools/llvm-pdbutil/llvm-pdbutil.cpp
  llvm/unittests/DebugInfo/CodeView/RandomAccessVisitorTest.cpp
  llvm/unittests/DebugInfo/CodeView/TypeIndexDiscoveryTest.cpp
  llvm/unittests/Support/BinaryStreamTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D40518.124455.patch
Type: text/x-patch
Size: 80243 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171127/13afe31e/attachment.bin>

