[PATCH] D20654: [pdb] Try super hard to conserve memory in llvm-pdbdump

Wed May 25 17:07:25 PDT 2016

zturner created this revision.
zturner added reviewers: rnk, ruiu.
zturner added a subscriber: llvm-commits.

We map the entire PDB into the process. Then, when reading various streams, we read certain structures out of the stream and reorder them so that they are guaranteed to be contiguous, which essentially copies the bytes.

This is incredibly memory inefficient, as it essentially means that loading a PDB into memory will use roughly double the file size since almost all important data structures are copied.

To address this, I've introduced a number of steps:

1. Introduce a class called `StreamView`.  This is analogous to an `ArrayRef`.  It provides a limited window on top of a larger stream (which can be any type of stream, including another `StreamView`).  This is useful for constraining stream operations to specific substreams or fields, for example when one large stream is broken into multiple logical sections.

2. Introduce a set of 3 "stream data structures".  These are currently `StreamString`, `FixedStreamArray<T>`, and `VarStreamArray`.  These classes all share the same underlying purpose:  try to return references to values in the source byte stream if the needed data is contiguous, and if it is not, copy the values into temporary storage.  The first one wraps a string, the second one wraps an array of fixed-size records, and the third one wraps an array of variable-length records.  `VarStreamArray` will prove particularly useful, because in order to return a reference to the data in the source byte stream, the entire array need not be contiguous, only the single record being requested.

3. Update the `StreamReader` class to be able to read values of type `StreamString`, `FixedStreamArray<T>`, and `VarStreamArray`.
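
To make the idea behind items 1 and 2 concrete, here is a minimal standalone sketch of "reference if contiguous, copy otherwise" combined with a windowed view.  It is not the code in the patch: `BlockStream`, `ReadResult`, `tryGetContiguous`, `copyBytes`, `slice`, and `read` are hypothetical names invented for illustration, and this `StreamView` is a simplified stand-in that only wraps one concrete stream type rather than any stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PDB stream: the logical byte sequence is stored
// in fixed-size blocks that need not be adjacent in the mapped file.
class BlockStream {
public:
  BlockStream(const uint8_t *File, std::vector<size_t> Blocks,
              size_t BlockSize, size_t Length)
      : File(File), Blocks(std::move(Blocks)), BlockSize(BlockSize),
        Length(Length) {}

  size_t size() const { return Length; }

  // Returns a pointer straight into the mapped file if the requested range
  // lies in blocks that are adjacent in the file, otherwise nullptr.
  const uint8_t *tryGetContiguous(size_t Offset, size_t Size) const {
    assert(Size > 0 && Offset + Size <= Length);
    size_t First = Offset / BlockSize;
    size_t Last = (Offset + Size - 1) / BlockSize;
    for (size_t I = First; I < Last; ++I)
      if (Blocks[I + 1] != Blocks[I] + 1)
        return nullptr;
    return File + Blocks[First] * BlockSize + Offset % BlockSize;
  }

  // Fallback path: gather the bytes one at a time into caller storage.
  void copyBytes(size_t Offset, size_t Size, uint8_t *Dest) const {
    assert(Offset + Size <= Length);
    for (size_t I = 0; I < Size; ++I) {
      size_t Pos = Offset + I;
      Dest[I] = File[Blocks[Pos / BlockSize] * BlockSize + Pos % BlockSize];
    }
  }

private:
  const uint8_t *File;
  std::vector<size_t> Blocks;
  size_t BlockSize;
  size_t Length;
};

// Result of a read: either a reference into the file or a pointer into
// temporary storage, plus a flag saying which one it is.
struct ReadResult {
  const uint8_t *Data;
  bool Copied;
};

// Simplified window over a BlockStream (an assumption for this sketch; the
// description above allows a view over any stream type).
class StreamView {
public:
  StreamView(const BlockStream &S, size_t Offset, size_t Length)
      : Stream(&S), Offset(Offset), Length(Length) {
    assert(Offset + Length <= S.size());
  }

  // A view of a view is expressed by composing offsets.
  StreamView slice(size_t Off, size_t Len) const {
    assert(Off + Len <= Length);
    return StreamView(*Stream, Offset + Off, Len);
  }

  // Prefer a reference into the source bytes; copy into Scratch only when
  // the requested range is split across non-adjacent blocks.
  ReadResult read(size_t Off, size_t Size,
                  std::vector<uint8_t> &Scratch) const {
    assert(Off + Size <= Length);
    if (const uint8_t *P = Stream->tryGetContiguous(Offset + Off, Size))
      return {P, false};
    Scratch.resize(Size);
    Stream->copyBytes(Offset + Off, Size, Scratch.data());
    return {Scratch.data(), true};
  }

private:
  const BlockStream *Stream;
  size_t Offset;
  size_t Length;
};
```

In this sketch the decision is made per read, which mirrors the point about `VarStreamArray`: a single requested record can be returned by reference even when the array as a whole spans non-adjacent blocks.  A copied result stays valid only as long as the caller's scratch buffer does.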

I updated `DBIStream` to use these new classes in a number of places, but the memory savings are currently small because the new classes are not yet used on the type and symbol record streams, which comprise 95% of the file size.  I plan to do that in a subsequent patch; I just wanted to get the infrastructure in place first.

http://reviews.llvm.org/D20654

Files:
  include/llvm/DebugInfo/CodeView/StreamArray.h
  include/llvm/DebugInfo/CodeView/StreamReader.h
  include/llvm/DebugInfo/CodeView/StreamString.h
  include/llvm/DebugInfo/CodeView/StreamView.h
  include/llvm/DebugInfo/PDB/Raw/DbiStream.h
  include/llvm/DebugInfo/PDB/Raw/ModInfo.h
  lib/DebugInfo/CodeView/CMakeLists.txt
  lib/DebugInfo/CodeView/StreamArray.cpp
  lib/DebugInfo/CodeView/StreamReader.cpp
  lib/DebugInfo/CodeView/StreamString.cpp
  lib/DebugInfo/PDB/Raw/DbiStream.cpp
  tools/llvm-pdbdump/llvm-pdbdump.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D20654.58540.patch
Type: text/x-patch
Size: 21630 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160526/b1dc3dd1/attachment-0001.bin>