[llvm] r325386 - Fix emission of PDB string table.

Zachary Turner via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 16 12:46:04 PST 2018


Author: zturner
Date: Fri Feb 16 12:46:04 2018
New Revision: 325386

URL: http://llvm.org/viewvc/llvm-project?rev=325386&view=rev
Log:
Fix emission of PDB string table.

This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you could
take any LLD-generated PDB, run cvdump -stringtable on it, and it would
return no results.

My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it locates the table by looking in the "named stream map",
finding the string "/names", and using its value as the stream index. If
that lookup fails, cvdump cannot load the string table.
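
For illustration, the lookup on our side looks roughly like this (a sketch
using the NamedStreamMap::get API from the patch below; the function name
and the error handling here are hypothetical):

  #include "llvm/DebugInfo/PDB/Native/NamedStreamMap.h"
  #include "llvm/Support/Error.h"

  // Sketch: how a consumer resolves the string table stream via the named
  // stream map.  NamedStreamMap::get is the API that appears in the patch
  // below; findStringTableStream is an illustrative helper, not LLVM API.
  llvm::Expected<uint32_t>
  findStringTableStream(const llvm::pdb::NamedStreamMap &NSM) {
    uint32_t StreamNo = 0;
    if (!NSM.get("/names", StreamNo))
      // No "/names" entry: the string table cannot be located, which is the
      // failure mode cvdump was hitting on our PDBs.
      return llvm::make_error<llvm::StringError>(
          "no /names stream", llvm::inconvertibleErrorCode());
    return StreamNo;
  }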

To test this hypothesis, I looked at the named stream map in a
link.exe-generated PDB and emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!

This code has always been hacky and we knew there was something we
didn't understand. After all, there were comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing the bug was finally understanding what was
actually going on.

The way the named stream map works is that it uses a generic serializable
hash map that maps integers to other integers. In this case, the "key" is
the offset of a name into a string buffer, and the value is the stream
number. If you index into the buffer at the offset specified by a given
key, you find the name. The underlying cause of all these problems was
that we were using the identity function as the hash: if a string's offset
in the buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch: we have to compute
the hash as a uint32 and then truncate it to uint16.
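
To make that concrete, here is a minimal sketch of the hashing rule
(condensed from the NamedStreamMapTraits in the patch below; the helper
name hashNamedStream is made up for illustration, and hashStringV1 is the
existing PDB V1 string hash from DebugInfo/PDB/Native/Hash.h):

  #include "llvm/ADT/StringRef.h"
  #include "llvm/DebugInfo/PDB/Native/Hash.h"
  #include <cstdint>

  // The on-disk bucket key is the name's *offset* into the string buffer,
  // but the hash must be computed from the string found at that offset and
  // then truncated from uint32 to uint16 (the reference implementation's
  // HASH type is an unsigned short).
  static uint16_t hashNamedStream(llvm::StringRef Name) {
    return static_cast<uint16_t>(llvm::pdb::hashStringV1(Name));
  }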

Making this work is a little bit annoying, because we use the same hash
table class in other places as well, where using the identity function as
the hash is actually what's desired. I'm not totally happy with the
template goo I came up with, but it works in any case.

The reason we never found this bug through our own testing is that we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code, and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. The test runs through
every possible insertion-order permutation of these 7 strings, to verify
that the table really does work as expected.

Differential Revision: https://reviews.llvm.org/D43326

Modified:
    llvm/trunk/include/llvm/DebugInfo/PDB/Native/HashTable.h
    llvm/trunk/include/llvm/DebugInfo/PDB/Native/InfoStream.h
    llvm/trunk/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h
    llvm/trunk/lib/DebugInfo/PDB/Native/HashTable.cpp
    llvm/trunk/lib/DebugInfo/PDB/Native/InfoStream.cpp
    llvm/trunk/lib/DebugInfo/PDB/Native/InfoStreamBuilder.cpp
    llvm/trunk/lib/DebugInfo/PDB/Native/NamedStreamMap.cpp
    llvm/trunk/tools/llvm-pdbutil/Diff.cpp
    llvm/trunk/unittests/DebugInfo/PDB/HashTableTest.cpp

Modified: llvm/trunk/include/llvm/DebugInfo/PDB/Native/HashTable.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/PDB/Native/HashTable.h?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/include/llvm/DebugInfo/PDB/Native/HashTable.h (original)
+++ llvm/trunk/include/llvm/DebugInfo/PDB/Native/HashTable.h Fri Feb 16 12:46:04 2018
@@ -54,10 +54,67 @@ public:
 
   HashTableIterator begin() const;
   HashTableIterator end() const;
-  HashTableIterator find(uint32_t K);
 
+  /// Find the entry with the specified key value.
+  HashTableIterator find(uint32_t K) const;
+
+  /// Find the entry whose key has the specified hash value, using the specified
+  /// traits defining hash function and equality.
+  template <typename Traits, typename Key, typename Context>
+  HashTableIterator find_as(const Key &K, const Context &Ctx) const {
+    uint32_t H = Traits::hash(K, Ctx) % capacity();
+    uint32_t I = H;
+    Optional<uint32_t> FirstUnused;
+    do {
+      if (isPresent(I)) {
+        if (Traits::realKey(Buckets[I].first, Ctx) == K)
+          return HashTableIterator(*this, I, false);
+      } else {
+        if (!FirstUnused)
+          FirstUnused = I;
+        // Insertion occurs via linear probing from the slot hint, and will be
+        // inserted at the first empty / deleted location.  Therefore, if we are
+        // probing and find a location that is neither present nor deleted, then
+        // nothing must have EVER been inserted at this location, and thus it is
+        // not possible for a matching value to occur later.
+        if (!isDeleted(I))
+          break;
+      }
+      I = (I + 1) % capacity();
+    } while (I != H);
+
+    // The only way FirstUnused would not be set is if every single entry in the
+    // table were Present.  But this would violate the load factor constraints
+    // that we impose, so it should never happen.
+    assert(FirstUnused);
+    return HashTableIterator(*this, *FirstUnused, true);
+  }
+
+  /// Set the entry with the specified key to the specified value.
   void set(uint32_t K, uint32_t V);
+
+  /// Set the entry using a key type that the specified Traits can convert
+  /// from a real key to an internal key.
+  template <typename Traits, typename Key, typename Context>
+  bool set_as(const Key &K, uint32_t V, Context &Ctx) {
+    return set_as_internal<Traits, Key, Context>(K, V, None, Ctx);
+  }
+
   void remove(uint32_t K);
+
+  template <typename Traits, typename Key, typename Context>
+  void remove_as(const Key &K, Context &Ctx) {
+    auto Iter = find_as<Traits, Key, Context>(K, Ctx);
+    // It wasn't here to begin with, just exit.
+    if (Iter == end())
+      return;
+
+    assert(Present.test(Iter.index()));
+    assert(!Deleted.test(Iter.index()));
+    Deleted.set(Iter.index());
+    Present.reset(Iter.index());
+  }
+
   uint32_t get(uint32_t K);
 
 protected:
@@ -69,8 +126,62 @@ protected:
   mutable SparseBitVector<> Deleted;
 
 private:
+  /// Set the entry using a key type that the specified Traits can convert
+  /// from a real key to an internal key.
+  template <typename Traits, typename Key, typename Context>
+  bool set_as_internal(const Key &K, uint32_t V, Optional<uint32_t> InternalKey,
+                       Context &Ctx) {
+    auto Entry = find_as<Traits, Key, Context>(K, Ctx);
+    if (Entry != end()) {
+      assert(isPresent(Entry.index()));
+      assert(Traits::realKey(Buckets[Entry.index()].first, Ctx) == K);
+      // We're updating, no need to do anything special.
+      Buckets[Entry.index()].second = V;
+      return false;
+    }
+
+    auto &B = Buckets[Entry.index()];
+    assert(!isPresent(Entry.index()));
+    assert(Entry.isEnd());
+    B.first = InternalKey ? *InternalKey : Traits::lowerKey(K, Ctx);
+    B.second = V;
+    Present.set(Entry.index());
+    Deleted.reset(Entry.index());
+
+    grow<Traits, Key, Context>(Ctx);
+
+    assert((find_as<Traits, Key, Context>(K, Ctx)) != end());
+    return true;
+  }
+
   static uint32_t maxLoad(uint32_t capacity);
-  void grow();
+
+  template <typename Traits, typename Key, typename Context>
+  void grow(Context &Ctx) {
+    uint32_t S = size();
+    if (S < maxLoad(capacity()))
+      return;
+    assert(capacity() != UINT32_MAX && "Can't grow Hash table!");
+
+    uint32_t NewCapacity =
+        (capacity() <= INT32_MAX) ? capacity() * 2 : UINT32_MAX;
+
+    // Growing requires rebuilding the table and re-hashing every item.  Make a
+    // copy with a larger capacity, insert everything into the copy, then swap
+    // it in.
+    HashTable NewMap(NewCapacity);
+    for (auto I : Present) {
+      auto RealKey = Traits::realKey(Buckets[I].first, Ctx);
+      NewMap.set_as_internal<Traits, Key, Context>(RealKey, Buckets[I].second,
+                                                   Buckets[I].first, Ctx);
+    }
+
+    Buckets.swap(NewMap.Buckets);
+    std::swap(Present, NewMap.Present);
+    std::swap(Deleted, NewMap.Deleted);
+    assert(capacity() == NewCapacity);
+    assert(size() == S);
+  }
 
   static Error readSparseBitVector(BinaryStreamReader &Stream,
                                    SparseBitVector<> &V);

Modified: llvm/trunk/include/llvm/DebugInfo/PDB/Native/InfoStream.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/PDB/Native/InfoStream.h?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/include/llvm/DebugInfo/PDB/Native/InfoStream.h (original)
+++ llvm/trunk/include/llvm/DebugInfo/PDB/Native/InfoStream.h Fri Feb 16 12:46:04 2018
@@ -51,7 +51,7 @@ public:
   BinarySubstreamRef getNamedStreamsBuffer() const;
 
   uint32_t getNamedStreamIndex(llvm::StringRef Name) const;
-  iterator_range<StringMapConstIterator<uint32_t>> named_streams() const;
+  StringMap<uint32_t> named_streams() const;
 
 private:
   std::unique_ptr<msf::MappedBlockStream> Stream;

Modified: llvm/trunk/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h (original)
+++ llvm/trunk/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h Fri Feb 16 12:46:04 2018
@@ -28,29 +28,30 @@ namespace pdb {
 class NamedStreamMap {
   friend class NamedStreamMapBuilder;
 
-  struct FinalizationInfo {
-    uint32_t StringDataBytes = 0;
-    uint32_t SerializedLength = 0;
-  };
-
 public:
   NamedStreamMap();
 
   Error load(BinaryStreamReader &Stream);
   Error commit(BinaryStreamWriter &Writer) const;
-  uint32_t finalize();
+  uint32_t calculateSerializedLength() const;
 
   uint32_t size() const;
   bool get(StringRef Stream, uint32_t &StreamNo) const;
   void set(StringRef Stream, uint32_t StreamNo);
-  void remove(StringRef Stream);
-  const StringMap<uint32_t> &getStringMap() const { return Mapping; }
-  iterator_range<StringMapConstIterator<uint32_t>> entries() const;
+
+  uint32_t appendStringData(StringRef S);
+  StringRef getString(uint32_t Offset) const;
+  uint32_t hashString(uint32_t Offset) const;
+
+  StringMap<uint32_t> entries() const;
 
 private:
-  Optional<FinalizationInfo> FinalizedInfo;
-  HashTable FinalizedHashTable;
-  StringMap<uint32_t> Mapping;
+  /// Closed hash table from Offset -> StreamNumber, where Offset is the offset
+  /// of the stream name in NamesBuffer.
+  HashTable OffsetIndexMap;
+
+  /// Buffer of string data.
+  std::vector<char> NamesBuffer;
 };
 
 } // end namespace pdb

Modified: llvm/trunk/lib/DebugInfo/PDB/Native/HashTable.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/HashTable.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/lib/DebugInfo/PDB/Native/HashTable.cpp (original)
+++ llvm/trunk/lib/DebugInfo/PDB/Native/HashTable.cpp Fri Feb 16 12:46:04 2018
@@ -22,6 +22,14 @@
 using namespace llvm;
 using namespace llvm::pdb;
 
+namespace {
+struct IdentityTraits {
+  static uint32_t hash(uint32_t K, const HashTable &Ctx) { return K; }
+  static uint32_t realKey(uint32_t K, const HashTable &Ctx) { return K; }
+  static uint32_t lowerKey(uint32_t K, const HashTable &Ctx) { return K; }
+};
+} // namespace
+
 HashTable::HashTable() : HashTable(8) {}
 
 HashTable::HashTable(uint32_t Capacity) { Buckets.resize(Capacity); }
@@ -119,70 +127,16 @@ HashTableIterator HashTable::end() const
   return HashTableIterator(*this, 0, true);
 }
 
-HashTableIterator HashTable::find(uint32_t K) {
-  uint32_t H = K % capacity();
-  uint32_t I = H;
-  Optional<uint32_t> FirstUnused;
-  do {
-    if (isPresent(I)) {
-      if (Buckets[I].first == K)
-        return HashTableIterator(*this, I, false);
-    } else {
-      if (!FirstUnused)
-        FirstUnused = I;
-      // Insertion occurs via linear probing from the slot hint, and will be
-      // inserted at the first empty / deleted location.  Therefore, if we are
-      // probing and find a location that is neither present nor deleted, then
-      // nothing must have EVER been inserted at this location, and thus it is
-      // not possible for a matching value to occur later.
-      if (!isDeleted(I))
-        break;
-    }
-    I = (I + 1) % capacity();
-  } while (I != H);
-
-  // The only way FirstUnused would not be set is if every single entry in the
-  // table were Present.  But this would violate the load factor constraints
-  // that we impose, so it should never happen.
-  assert(FirstUnused);
-  return HashTableIterator(*this, *FirstUnused, true);
+HashTableIterator HashTable::find(uint32_t K) const {
+  return find_as<IdentityTraits>(K, *this);
 }
 
 void HashTable::set(uint32_t K, uint32_t V) {
-  auto Entry = find(K);
-  if (Entry != end()) {
-    assert(isPresent(Entry.index()));
-    assert(Buckets[Entry.index()].first == K);
-    // We're updating, no need to do anything special.
-    Buckets[Entry.index()].second = V;
-    return;
-  }
-
-  auto &B = Buckets[Entry.index()];
-  assert(!isPresent(Entry.index()));
-  assert(Entry.isEnd());
-  B.first = K;
-  B.second = V;
-  Present.set(Entry.index());
-  Deleted.reset(Entry.index());
-
-  grow();
-
-  assert(find(K) != end());
-}
-
-void HashTable::remove(uint32_t K) {
-  auto Iter = find(K);
-  // It wasn't here to begin with, just exit.
-  if (Iter == end())
-    return;
-
-  assert(Present.test(Iter.index()));
-  assert(!Deleted.test(Iter.index()));
-  Deleted.set(Iter.index());
-  Present.reset(Iter.index());
+  set_as<IdentityTraits, uint32_t>(K, V, *this);
 }
 
+void HashTable::remove(uint32_t K) { remove_as<IdentityTraits>(K, *this); }
+
 uint32_t HashTable::get(uint32_t K) {
   auto I = find(K);
   assert(I != end());
@@ -191,30 +145,6 @@ uint32_t HashTable::get(uint32_t K) {
 
 uint32_t HashTable::maxLoad(uint32_t capacity) { return capacity * 2 / 3 + 1; }
 
-void HashTable::grow() {
-  uint32_t S = size();
-  if (S < maxLoad(capacity()))
-    return;
-  assert(capacity() != UINT32_MAX && "Can't grow Hash table!");
-
-  uint32_t NewCapacity =
-      (capacity() <= INT32_MAX) ? capacity() * 2 : UINT32_MAX;
-
-  // Growing requires rebuilding the table and re-hashing every item.  Make a
-  // copy with a larger capacity, insert everything into the copy, then swap
-  // it in.
-  HashTable NewMap(NewCapacity);
-  for (auto I : Present) {
-    NewMap.set(Buckets[I].first, Buckets[I].second);
-  }
-
-  Buckets.swap(NewMap.Buckets);
-  std::swap(Present, NewMap.Present);
-  std::swap(Deleted, NewMap.Deleted);
-  assert(capacity() == NewCapacity);
-  assert(size() == S);
-}
-
 Error HashTable::readSparseBitVector(BinaryStreamReader &Stream,
                                      SparseBitVector<> &V) {
   uint32_t NumWords;

Modified: llvm/trunk/lib/DebugInfo/PDB/Native/InfoStream.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/InfoStream.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/lib/DebugInfo/PDB/Native/InfoStream.cpp (original)
+++ llvm/trunk/lib/DebugInfo/PDB/Native/InfoStream.cpp Fri Feb 16 12:46:04 2018
@@ -99,8 +99,7 @@ uint32_t InfoStream::getNamedStreamIndex
   return Result;
 }
 
-iterator_range<StringMapConstIterator<uint32_t>>
-InfoStream::named_streams() const {
+StringMap<uint32_t> InfoStream::named_streams() const {
   return NamedStreams.entries();
 }
 

Modified: llvm/trunk/lib/DebugInfo/PDB/Native/InfoStreamBuilder.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/InfoStreamBuilder.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/lib/DebugInfo/PDB/Native/InfoStreamBuilder.cpp (original)
+++ llvm/trunk/lib/DebugInfo/PDB/Native/InfoStreamBuilder.cpp Fri Feb 16 12:46:04 2018
@@ -41,7 +41,8 @@ void InfoStreamBuilder::addFeature(PdbRa
 }
 
 Error InfoStreamBuilder::finalizeMsfLayout() {
-  uint32_t Length = sizeof(InfoStreamHeader) + NamedStreams.finalize() +
+  uint32_t Length = sizeof(InfoStreamHeader) +
+                    NamedStreams.calculateSerializedLength() +
                     (Features.size() + 1) * sizeof(uint32_t);
   if (auto EC = Msf.setStreamSize(StreamPDB, Length))
     return EC;

Modified: llvm/trunk/lib/DebugInfo/PDB/Native/NamedStreamMap.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/NamedStreamMap.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/lib/DebugInfo/PDB/Native/NamedStreamMap.cpp (original)
+++ llvm/trunk/lib/DebugInfo/PDB/Native/NamedStreamMap.cpp Fri Feb 16 12:46:04 2018
@@ -11,6 +11,7 @@
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/iterator_range.h"
+#include "llvm/DebugInfo/PDB/Native/Hash.h"
 #include "llvm/DebugInfo/PDB/Native/HashTable.h"
 #include "llvm/DebugInfo/PDB/Native/RawError.h"
 #include "llvm/Support/BinaryStreamReader.h"
@@ -26,127 +27,100 @@
 using namespace llvm;
 using namespace llvm::pdb;
 
-// FIXME: This shouldn't be necessary, but if we insert the strings in any
-// other order, cvdump cannot read the generated name map.  This suggests that
-// we may be using the wrong hash function.  A closer inspection of the cvdump
-// source code may reveal something, but for now this at least makes us work,
-// even if only by accident.
-static constexpr const char *OrderedStreamNames[] = {"/LinkInfo", "/names",
-                                                     "/src/headerblock"};
+namespace {
+struct NamedStreamMapTraits {
+  static uint16_t hash(StringRef S, const NamedStreamMap &NS) {
+    // In the reference implementation, this uses
+    // HASH Hasher<ULONG*, USHORT*>::hashPbCb(PB pb, size_t cb, ULONG ulMod).
+    // Here, the type HASH is a typedef of unsigned short.
+    // ** It is not a bug that we truncate the result of hashStringV1, in fact
+    //    it is a bug if we do not! **
+    return static_cast<uint16_t>(hashStringV1(S));
+  }
+  static StringRef realKey(uint32_t Offset, const NamedStreamMap &NS) {
+    return NS.getString(Offset);
+  }
+  static uint32_t lowerKey(StringRef S, NamedStreamMap &NS) {
+    return NS.appendStringData(S);
+  }
+};
+} // namespace
 
-NamedStreamMap::NamedStreamMap() = default;
+NamedStreamMap::NamedStreamMap() {}
 
 Error NamedStreamMap::load(BinaryStreamReader &Stream) {
-  Mapping.clear();
-  FinalizedHashTable.clear();
-  FinalizedInfo.reset();
-
   uint32_t StringBufferSize;
   if (auto EC = Stream.readInteger(StringBufferSize))
     return joinErrors(std::move(EC),
                       make_error<RawError>(raw_error_code::corrupt_file,
                                            "Expected string buffer size"));
 
-  BinaryStreamRef StringsBuffer;
-  if (auto EC = Stream.readStreamRef(StringsBuffer, StringBufferSize))
+  StringRef Buffer;
+  if (auto EC = Stream.readFixedString(Buffer, StringBufferSize))
     return EC;
+  NamesBuffer.assign(Buffer.begin(), Buffer.end());
 
-  HashTable OffsetIndexMap;
-  if (auto EC = OffsetIndexMap.load(Stream))
-    return EC;
-
-  uint32_t NameOffset;
-  uint32_t NameIndex;
-  for (const auto &Entry : OffsetIndexMap) {
-    std::tie(NameOffset, NameIndex) = Entry;
-
-    // Compute the offset of the start of the string relative to the stream.
-    BinaryStreamReader NameReader(StringsBuffer);
-    NameReader.setOffset(NameOffset);
-    // Pump out our c-string from the stream.
-    StringRef Str;
-    if (auto EC = NameReader.readCString(Str))
-      return joinErrors(std::move(EC),
-                        make_error<RawError>(raw_error_code::corrupt_file,
-                                             "Expected name map name"));
-
-    // Add this to a string-map from name to stream number.
-    Mapping.insert({Str, NameIndex});
-  }
-
-  return Error::success();
+  return OffsetIndexMap.load(Stream);
 }
 
 Error NamedStreamMap::commit(BinaryStreamWriter &Writer) const {
-  assert(FinalizedInfo.hasValue());
-
   // The first field is the number of bytes of string data.
-  if (auto EC = Writer.writeInteger(FinalizedInfo->StringDataBytes))
+  if (auto EC = Writer.writeInteger<uint32_t>(NamesBuffer.size()))
     return EC;
 
-  for (const auto &Name : OrderedStreamNames) {
-    auto Item = Mapping.find(Name);
-    if (Item == Mapping.end())
-      continue;
-    if (auto EC = Writer.writeCString(Item->getKey()))
-      return EC;
-  }
+  // Then the actual string data.
+  StringRef Data(NamesBuffer.data(), NamesBuffer.size());
+  if (auto EC = Writer.writeFixedString(Data))
+    return EC;
 
   // And finally the Offset Index map.
-  if (auto EC = FinalizedHashTable.commit(Writer))
+  if (auto EC = OffsetIndexMap.commit(Writer))
     return EC;
 
   return Error::success();
 }
 
-uint32_t NamedStreamMap::finalize() {
-  if (FinalizedInfo.hasValue())
-    return FinalizedInfo->SerializedLength;
-
-  // Build the finalized hash table.
-  FinalizedHashTable.clear();
-  FinalizedInfo.emplace();
-
-  for (const auto &Name : OrderedStreamNames) {
-    auto Item = Mapping.find(Name);
-    if (Item == Mapping.end())
-      continue;
-    FinalizedHashTable.set(FinalizedInfo->StringDataBytes, Item->getValue());
-    FinalizedInfo->StringDataBytes += Item->getKeyLength() + 1;
-  }
+uint32_t NamedStreamMap::calculateSerializedLength() const {
+  return sizeof(uint32_t)                              // String data size
+         + NamesBuffer.size()                          // String data
+         + OffsetIndexMap.calculateSerializedLength(); // Offset Index Map
+}
+
+uint32_t NamedStreamMap::size() const { return OffsetIndexMap.size(); }
 
-  // Number of bytes of string data.
-  FinalizedInfo->SerializedLength += sizeof(support::ulittle32_t);
-  // Followed by that many actual bytes of string data.
-  FinalizedInfo->SerializedLength += FinalizedInfo->StringDataBytes;
-  // Followed by the mapping from Offset to Index.
-  FinalizedInfo->SerializedLength +=
-      FinalizedHashTable.calculateSerializedLength();
-  return FinalizedInfo->SerializedLength;
-}
-
-iterator_range<StringMapConstIterator<uint32_t>>
-NamedStreamMap::entries() const {
-  return make_range<StringMapConstIterator<uint32_t>>(Mapping.begin(),
-                                                      Mapping.end());
+StringRef NamedStreamMap::getString(uint32_t Offset) const {
+  assert(NamesBuffer.size() > Offset);
+  return StringRef(NamesBuffer.data() + Offset);
 }
 
-uint32_t NamedStreamMap::size() const { return Mapping.size(); }
+uint32_t NamedStreamMap::hashString(uint32_t Offset) const {
+  return hashStringV1(getString(Offset));
+}
 
 bool NamedStreamMap::get(StringRef Stream, uint32_t &StreamNo) const {
-  auto Iter = Mapping.find(Stream);
-  if (Iter == Mapping.end())
+  auto Iter = OffsetIndexMap.find_as<NamedStreamMapTraits>(Stream, *this);
+  if (Iter == OffsetIndexMap.end())
     return false;
-  StreamNo = Iter->second;
+  StreamNo = (*Iter).second;
   return true;
 }
 
-void NamedStreamMap::set(StringRef Stream, uint32_t StreamNo) {
-  FinalizedInfo.reset();
-  Mapping[Stream] = StreamNo;
+StringMap<uint32_t> NamedStreamMap::entries() const {
+  StringMap<uint32_t> Result;
+  for (const auto &Entry : OffsetIndexMap) {
+    StringRef Stream(NamesBuffer.data() + Entry.first);
+    Result.try_emplace(Stream, Entry.second);
+  }
+  return Result;
 }
 
-void NamedStreamMap::remove(StringRef Stream) {
-  FinalizedInfo.reset();
-  Mapping.erase(Stream);
+uint32_t NamedStreamMap::appendStringData(StringRef S) {
+  uint32_t Offset = NamesBuffer.size();
+  NamesBuffer.insert(NamesBuffer.end(), S.begin(), S.end());
+  NamesBuffer.push_back('\0');
+  return Offset;
+}
+
+void NamedStreamMap::set(StringRef Stream, uint32_t StreamNo) {
+  OffsetIndexMap.set_as<NamedStreamMapTraits>(Stream, StreamNo, *this);
 }

Modified: llvm/trunk/tools/llvm-pdbutil/Diff.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-pdbutil/Diff.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-pdbutil/Diff.cpp (original)
+++ llvm/trunk/tools/llvm-pdbutil/Diff.cpp Fri Feb 16 12:46:04 2018
@@ -417,8 +417,8 @@ Error DiffStyle::diffInfoStream() {
                        IS2.getFeatureSignatures());
   D.print("Named Stream Size", IS1.getNamedStreamMapByteSize(),
           IS2.getNamedStreamMapByteSize());
-  StringMap<uint32_t> NSL = IS1.getNamedStreams().getStringMap();
-  StringMap<uint32_t> NSR = IS2.getNamedStreams().getStringMap();
+  StringMap<uint32_t> NSL = IS1.getNamedStreams().entries();
+  StringMap<uint32_t> NSR = IS2.getNamedStreams().entries();
   D.diffUnorderedMap<EquivalentDiffProvider>("Named Stream", NSL, NSR);
   return Error::success();
 }

Modified: llvm/trunk/unittests/DebugInfo/PDB/HashTableTest.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/DebugInfo/PDB/HashTableTest.cpp?rev=325386&r1=325385&r2=325386&view=diff
==============================================================================
--- llvm/trunk/unittests/DebugInfo/PDB/HashTableTest.cpp (original)
+++ llvm/trunk/unittests/DebugInfo/PDB/HashTableTest.cpp Fri Feb 16 12:46:04 2018
@@ -8,6 +8,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/DebugInfo/PDB/Native/HashTable.h"
+#include "llvm/DebugInfo/PDB/Native/NamedStreamMap.h"
 #include "llvm/Support/BinaryByteStream.h"
 #include "llvm/Support/BinaryStreamReader.h"
 #include "llvm/Support/BinaryStreamWriter.h"
@@ -166,3 +167,43 @@ TEST(HashTableTest, Serialization) {
   EXPECT_EQ(Table.Present, Table2.Present);
   EXPECT_EQ(Table.Deleted, Table2.Deleted);
 }
+
+TEST(HashTableTest, NamedStreamMap) {
+  std::vector<StringRef> Streams = {"One",  "Two", "Three", "Four",
+                                    "Five", "Six", "Seven"};
+  StringMap<uint32_t> ExpectedIndices;
+  for (uint32_t I = 0; I < Streams.size(); ++I)
+    ExpectedIndices[Streams[I]] = I + 1;
+
+  // To verify the hash table actually works, we want to verify that insertion
+  // order doesn't matter.  So try inserting in every possible order of 7 items.
+  do {
+    NamedStreamMap NSM;
+    for (StringRef S : Streams)
+      NSM.set(S, ExpectedIndices[S]);
+
+    EXPECT_EQ(Streams.size(), NSM.size());
+
+    uint32_t N;
+    EXPECT_TRUE(NSM.get("One", N));
+    EXPECT_EQ(1, N);
+
+    EXPECT_TRUE(NSM.get("Two", N));
+    EXPECT_EQ(2, N);
+
+    EXPECT_TRUE(NSM.get("Three", N));
+    EXPECT_EQ(3, N);
+
+    EXPECT_TRUE(NSM.get("Four", N));
+    EXPECT_EQ(4, N);
+
+    EXPECT_TRUE(NSM.get("Five", N));
+    EXPECT_EQ(5, N);
+
+    EXPECT_TRUE(NSM.get("Six", N));
+    EXPECT_EQ(6, N);
+
+    EXPECT_TRUE(NSM.get("Seven", N));
+    EXPECT_EQ(7, N);
+  } while (std::next_permutation(Streams.begin(), Streams.end()));
+}



