[PATCH] Value profiling - patchset 3

Thu Jul 9 00:47:21 PDT 2015

On Wed, Jul 8, 2015 at 12:39 PM, Betul Buyukkurt <betulb at codeaurora.org> wrote:
>> Betul Buyukkurt <betulb at codeaurora.org> writes:
>>> Hi bogner, davidxl, dnovillo, dsule, ivanbaev,
>>>
>>> Indexed reader/writer changes for version 3 i.e. going to be used by
>>> the value profiling infrastructure.
>>>
>>> http://reviews.llvm.org/D10674
>>>
>>> Files:
>>>   include/llvm/ProfileData/InstrProf.h
>>>   include/llvm/ProfileData/InstrProfReader.h
>>>   include/llvm/ProfileData/InstrProfWriter.h
>>>   lib/ProfileData/InstrProfIndexed.h
>>>   lib/ProfileData/InstrProfReader.cpp
>>>   lib/ProfileData/InstrProfWriter.cpp
>>>   tools/llvm-profdata/llvm-profdata.cpp
>>>   unittests/ProfileData/CoverageMappingTest.cpp
>>>   unittests/ProfileData/InstrProfTest.cpp
>>>
>>> EMAIL PREFERENCES
>>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>> Index: include/llvm/ProfileData/InstrProf.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProf.h
>>> +++ include/llvm/ProfileData/InstrProf.h
>>> @@ -16,7 +16,8 @@
>>>  #ifndef LLVM_PROFILEDATA_INSTRPROF_H_
>>>  #define LLVM_PROFILEDATA_INSTRPROF_H_
>>>
>>> -#include "llvm/ADT/StringRef.h"
>>> +#include "llvm/ADT/ArrayRef.h"
>>> +#include "llvm/ADT/SmallString.h"
>>>  #include <cstdint>
>>>  #include <system_error>
>>>  #include <vector>
>>> @@ -44,14 +45,37 @@
>>>    return std::error_code(static_cast<int>(E), instrprof_category());
>>>  }
>>>
>>> +enum instrprof_value_kind : uint32_t {
>>> +    first = 0,
>>> +    indirect_call_target = 0,
>>> +    reserved_1 = 1,
>>> +    reserved_2 = 2,
>>> +    reserved_3 = 3,
>>> +    last = 3,
>>> +    size = 4
>>> +};
>>
>> We shouldn't need the reserved entries or first/last/size with my
>> proposed changes below, since we'll no longer be dependent on the number
>> of possible kinds.
>
> Thanks, will do.
>
>>> +
>>> +// Profiling information for a single target value
>>> +struct InstrProfValueRecord {
>>> +  InstrProfValueRecord() {};
>>> +  InstrProfValueRecord(std::string Name, uint64_t Value, uint64_t
>> NumTaken,
>>> +    uint32_t CounterIndex): Name(Name), Value(Value),
>> NumTaken(NumTaken),
>>> +    CounterIndex(CounterIndex) {}
>>> +  SmallString<32> Name;
>>> +  uint64_t Value;
>>> +  uint64_t NumTaken;
>>> +  uint32_t CounterIndex;
>>> +};
>>
>> What exactly are Name and Value here? They seem either redundant (in
>> that the Value is the address of the function called Name) or mutually
>> exclusive (in that we need one or the other).
>>
>> In general it's good to add more detail to doc comments to explain how a
>> struct like this is to be used.
>
> I'll add more comments. Name field holds the target function name for
> indirect call target profiling purposes. However, since this data
> structure was put together for generic value profiling, it may take other
> meanings for different value types. In the original implementation, we
> never stored the Value field.
>
> Value and Name fields are again mutually exclusive if indirect call target
> profiling was considered, but they value field by itself be meaningful for
> another value type.
>
> I may consider having Value and Name appear in a union. But that may
> require one to maintain another field for whether what union holds is a
> Value or a Name depending on the rest of the implementation.
>
>>> +
>>>  /// Profiling information for a single function.
>>>  struct InstrProfRecord {
>>>    InstrProfRecord() {}
>>>    InstrProfRecord(StringRef Name, uint64_t Hash, std::vector<uint64_t>
>> Counts)
>>>        : Name(Name), Hash(Hash), Counts(std::move(Counts)) {}
>>>    StringRef Name;
>>>    uint64_t Hash;
>>>    std::vector<uint64_t> Counts;
>>> +  std::vector<InstrProfValueRecord> Values[instrprof_value_kind::size];
>>
>> This ends up being a bit of an awkward data structure, both for storing
>> this data and for using it:
>
> This data structure is filled in by the readers. There is a separate data
> structure that is used by the indexed writer during merges. I moved it to
> the InstrProfWriter.h per review comments. However, in essence they do
> resemble one another.
>
>> - Clients need to dig deep into the structure and filter the vector of
>>   records just to do the most obvious query of "Get the indirect call
>>   targets for call site N in function F".
>>
>> - The variable length of the data (due to the string) makes the
>>   record-length logic for the on-disk hash table needlessly complicated.
>
> What do you refer to by the record-length logic? I believe I do not use
> record-length logic in the indexed reader V3. This is because the on-disk
> hash table is now modified to use a two level map. It does not put the
> counters from functions with same names but different hashes in one entry.

The disk format for value profile for the indexed profile in the
current patch is:

[value profile start]
  value_kind
  number_of_sites
[sites start]
  target_name_string_with_size (variable length)
  target_value
  target_count
  site_index_1

  target_name_string_with_size (variable length)
  target_value
  target_count
  site_index_1
  ....

While Justin's proposal is (which is closer to what we originally want:

[value profile start]
  value_kind
  number_of_sites
[sites start]
  num_of_targets (site 0)
  target_value
  target_count
  target_value
  target_count
   ....
  num_of_targets (site k)
  target_value
  target_count

For indirect call target kind, the target_value may be the string
offset of the name. Do need to figure out a way to support merging of
indexed profile data efficiently.

On the other hand, do you really need to store target string? How
about the name hash? One of the biggest size contributor in profile
data size is name strings, so we may do something in that area for
other uses of names in the profile data in the future too.

thanks,

David

>
>> - The poor alignment qualities (due both to the strings and the mix of
>>   32 and 64 bit members) make the storage inefficient.
>>
>> I think what we need to do here is push the names out into a separate
>> string table and group the records by index.
>
> Would this not make the merges more complicated? There is no guarantee
> that the string tables across two indexed profile data files will be
> identical. There will be plenty table look ups and searches w/in sites to
> find the right string table index to update by adding up the counts for
> the same targets in addition to the merging of the string tables on the
> fly. Currently there is a string compare per adding of the counts for the
> same targets.
>
>> I'm thinking an on-disk structure something like this would make sense:
>>
>>     function entry:
>>     * hash
>>     * number of counts
>>     * [counters]
>>     * [value profiling data]
>>
>>     value profiling datum:
>>     * kind
>>     * number of sites for kind
>>     * [sites]
>>
>>     site:
>>     * number of targets recorded
>>     * [target and count pairs]
>>
>> Basically:
>>
>> - This starts the same as now, with the function hash and the list of
>>   counters.
>>
>> - The value profiling data encodes a kind directly - if we only have
>>   kind "3" we don't need to encode 'zero' counts for kinds "1" and "2".
>>   This also means that the structure if there is no value profiling
>>   enabled is unchanged from before adding value profiling.
>>
>> - We have a list of value records per site, which means the record just
>>   needs to include the target and the count.
>>
>> Then "target" is an index into the separate string table where we keep
>> the names, or more generally an index into the table of values for the
>> particular kind of value profiling.
>
> I believe using tables for names and storing <table_index, count> pairs
> may complicate indexed profile data file merges.
>
>> I think this makes the code simpler to read and write, keeps the
>> generality and forwards compatibility of encoding the value profiling
>> kind, and will make accessing this data more efficient.
>>
>> WDYT?
>>
>>>  };
>>>
>>>  } // end namespace llvm
>>> Index: include/llvm/ProfileData/InstrProfReader.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProfReader.h
>>> +++ include/llvm/ProfileData/InstrProfReader.h
>>> @@ -15,15 +15,16 @@
>>>  #ifndef LLVM_PROFILEDATA_INSTRPROFREADER_H
>>>  #define LLVM_PROFILEDATA_INSTRPROFREADER_H
>>>
>>> -#include "llvm/ADT/ArrayRef.h"
>>>  #include "llvm/ADT/StringExtras.h"
>>>  #include "llvm/ProfileData/InstrProf.h"
>>>  #include "llvm/Support/EndianStream.h"
>>>  #include "llvm/Support/ErrorOr.h"
>>>  #include "llvm/Support/LineIterator.h"
>>>  #include "llvm/Support/MemoryBuffer.h"
>>>  #include "llvm/Support/OnDiskHashTable.h"
>>>  #include <iterator>
>>> +#include <map>
>>> +#include <string>
>>
>> These look unused. Why add them?
>>
>>>
>>>  namespace llvm {
>>>
>>> @@ -224,7 +225,17 @@
>>>      return StringRef((const char *)D, N);
>>>    }
>>>
>>> -  data_type ReadData(StringRef K, const unsigned char *D, offset_type
>> N);
>>> +  data_type ReadDataV1V2(StringRef K, const unsigned char *D,
>> offset_type N);
>>> +  data_type ReadDataV3(StringRef K, const unsigned char *D, offset_type
>> N);
>>> +
>>> +  data_type ReadData(StringRef K, const unsigned char *D, offset_type
>> N) {
>>> +    switch (FormatVersion) {
>>> +      case 1:
>>> +      case 2: return ReadDataV1V2(K, D, N);
>>> +      case 3: return ReadDataV3(K, D, N);
>>> +    }
>>> +    return data_type();
>>> +  }
>>>  };
>>>
>>>  typedef OnDiskIterableChainedHashTable<InstrProfLookupTrait>
>>> Index: include/llvm/ProfileData/InstrProfWriter.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProfWriter.h
>>> +++ include/llvm/ProfileData/InstrProfWriter.h
>>> @@ -15,33 +15,39 @@
>>>  #ifndef LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>>  #define LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>>
>>> -#include "llvm/ADT/ArrayRef.h"
>>>  #include "llvm/ADT/DenseMap.h"
>>>  #include "llvm/ADT/StringMap.h"
>>>  #include "llvm/ProfileData/InstrProf.h"
>>>  #include "llvm/Support/DataTypes.h"
>>>  #include "llvm/Support/MemoryBuffer.h"
>>>  #include "llvm/Support/raw_ostream.h"
>>> -#include <vector>
>>>
>>>  namespace llvm {
>>>
>>> +/// Profiling information for a single function for outputting through
>> the
>>> +/// IndexedInstrProfWriter
>>> +struct IndexedInstrProfRecord {
>>> +  IndexedInstrProfRecord() {}
>>> +  IndexedInstrProfRecord(std::vector<uint64_t> Counts)
>>> +      : Counts(std::move(Counts)) {}
>>> +  std::vector<uint64_t> Counts;
>>> +  std::vector<InstrProfValueRecord> Values[instrprof_value_kind::size];
>>> +};
>>> +
>>>  /// Writer for instrumentation based profile data.
>>>  class InstrProfWriter {
>>>  public:
>>> -  typedef SmallDenseMap<uint64_t, std::vector<uint64_t>, 1>
>> CounterData;
>>> +  typedef SmallDenseMap<uint64_t, IndexedInstrProfRecord, 1>
>> ProfilingData;
>>>  private:
>>> -  StringMap<CounterData> FunctionData;
>>> +  StringMap<ProfilingData> FunctionData;
>>>    uint64_t MaxFunctionCount;
>>>  public:
>>>    InstrProfWriter() : MaxFunctionCount(0) {}
>>>
>>>    /// Add function counts for the given function. If there are already
>> counts
>>>    /// for this function and the hash and number of counts match, each
>> counter is
>>>    /// summed.
>>> -  std::error_code addFunctionCounts(StringRef FunctionName,
>>> -                                    uint64_t FunctionHash,
>>> -                                    ArrayRef<uint64_t> Counters);
>>> +  std::error_code addRecord(const InstrProfRecord &I);
>>>    /// Write the profile to \c OS
>>>    void write(raw_fd_ostream &OS);
>>>    /// Write the profile, returning the raw data. For testing.
>>> Index: lib/ProfileData/InstrProfIndexed.h
>>> ===================================================================
>>> --- lib/ProfileData/InstrProfIndexed.h
>>> +++ lib/ProfileData/InstrProfIndexed.h
>>> @@ -47,7 +47,7 @@
>>>  }
>>>
>>>  const uint64_t Magic = 0x8169666f72706cff; // "\xfflprofi\x81"
>>> -const uint64_t Version = 2;
>>> +const uint64_t Version = 3;
>>>  const HashT HashType = HashT::MD5;
>>>  }
>>>
>>> Index: lib/ProfileData/InstrProfReader.cpp
>>> ===================================================================
>>> --- lib/ProfileData/InstrProfReader.cpp
>>> +++ lib/ProfileData/InstrProfReader.cpp
>>> @@ -15,6 +15,7 @@
>>>  #include "llvm/ProfileData/InstrProfReader.h"
>>>  #include "InstrProfIndexed.h"
>>>  #include "llvm/ADT/STLExtras.h"
>>> +#include "llvm/ProfileData/InstrProf.h"
>>>  #include <cassert>
>>>
>>>  using namespace llvm;
>>> @@ -302,8 +303,9 @@
>>>  typedef InstrProfLookupTrait::data_type data_type;
>>>  typedef InstrProfLookupTrait::offset_type offset_type;
>>>
>>> -data_type InstrProfLookupTrait::ReadData(StringRef K, const unsigned
>> char *D,
>>> -                                         offset_type N) {
>>> +data_type InstrProfLookupTrait::ReadDataV1V2(StringRef K,
>>> +                                             const unsigned char *D,
>>> +                                             offset_type N) {
>>>
>>>    // Check if the data is corrupt. If so, don't try to read it.
>>>    if (N % sizeof(uint64_t))
>>> @@ -342,6 +344,67 @@
>>>    return DataBuffer;
>>>  }
>>>
>>> +data_type InstrProfLookupTrait::ReadDataV3(StringRef K,
>>> +                                           const unsigned char *D,
>>> +                                           offset_type N) {
>>> +  using namespace support;
>>> +  const unsigned char *End = D + N;
>>> +  // Read number of data entries.
>>> +  if (D + sizeof (uint64_t) >= End)
>>> +    return data_type();
>>> +  uint64_t DataBufferSize = endian::readNext<uint64_t, little,
>> unaligned>(D);
>>> +  DataBuffer.clear();
>>> +  DataBuffer.reserve(DataBufferSize);
>>> +  std::vector<uint64_t> CounterBuffer;
>>> +  std::vector<InstrProfValueRecord> ValueBuffer;
>>> +  for (unsigned I = 0; I < DataBufferSize; ++I) {
>>> +    // Read hash and number of counters
>>> +    if (D + sizeof (uint64_t) + sizeof(uint32_t) >= End)
>>> +      return data_type();
>>> +    uint64_t Hash = endian::readNext<uint64_t, little, unaligned>(D);
>>> +    uint32_t NumCounts = endian::readNext<uint32_t, little,
>> unaligned>(D);
>>> +    // Read counter values
>>> +    if (D + NumCounts * sizeof(uint64_t) >= End)
>>> +      return data_type();
>>> +    CounterBuffer.clear();
>>> +    for (unsigned I = 0; I < NumCounts; ++I)
>>> +      CounterBuffer.push_back(endian::readNext<uint64_t, little,
>> unaligned>(D));
>>> +
>>> +    DataBuffer.push_back(InstrProfRecord(K, Hash, CounterBuffer));
>>
>> I think it will actually be simpler to keep a single ReadData instead of
>> splitting into V1V2 and V3 - V3 is essentially V1V2 with more data at
>> the end, and I can see here that the logic for the V1V2 data is nearly
>> identical in both.
>
> That may add in lots of if (version == x) sort of checks in the code. That
> unfortunately makes debugging, code reading for future updates
> problematic. The format is implicit in the readers but the writers are
> never maintained where the format is explicit.
>
>>> +
>>> +    // Read value profiling data
>>> +    for (uint32_t Kind = instrprof_value_kind::first;
>>> +         Kind < instrprof_value_kind::size; ++Kind) {
>>> +      // Read number of values for value kind
>>> +      if (D + sizeof(uint32_t) > End)
>>> +        return data_type();
>>> +      uint32_t NumValues = endian::readNext<uint32_t, little,
>> unaligned>(D);
>>> +      ValueBuffer.clear();
>>> +      for (unsigned I = 0; I < NumValues; ++I) {
>>> +        // Read size of the name string
>>> +        if (D + sizeof(uint32_t) >= End)
>>> +          return data_type();
>>> +        uint32_t NameSize = endian::readNext<uint32_t, little,
>> unaligned>(D);
>>> +        // Read name string
>>> +        if (D + NameSize * sizeof(char) >= End)
>>> +          return data_type();
>>> +        std::string Name((const char *)D, NameSize);
>>> +        D += NameSize;
>>> +        // Read raw value and numtaken and the counter index
>>> +        if (D + 2 * sizeof(uint64_t) + sizeof(uint32_t) > End)
>>> +          return data_type();
>>> +        uint64_t Value = endian::readNext<uint64_t, little,
>> unaligned>(D);
>>> +        uint64_t NumTaken = endian::readNext<uint64_t, little,
>> unaligned>(D);
>>> +        uint32_t Index = endian::readNext<uint32_t, little,
>> unaligned>(D);
>>> +        ValueBuffer.push_back(
>>> +            InstrProfValueRecord(Name, Value, NumTaken, Index));
>>> +      }
>>> +      DataBuffer[DataBuffer.size()-1].Values[Kind] = ValueBuffer;
>>> +    }
>>> +  }
>>> +  return DataBuffer;
>>> +}
>>> +
>>>  bool IndexedInstrProfReader::hasFormat(const MemoryBuffer &DataBuffer)
>> {
>>>    if (DataBuffer.getBufferSize() < 8)
>>>      return false;
>>> Index: lib/ProfileData/InstrProfWriter.cpp
>>> ===================================================================
>>> --- lib/ProfileData/InstrProfWriter.cpp
>>> +++ lib/ProfileData/InstrProfWriter.cpp
>>> @@ -26,8 +26,8 @@
>>>    typedef StringRef key_type;
>>>    typedef StringRef key_type_ref;
>>>
>>> -  typedef const InstrProfWriter::CounterData *const data_type;
>>> -  typedef const InstrProfWriter::CounterData *const data_type_ref;
>>> +  typedef const InstrProfWriter::ProfilingData *const data_type;
>>> +  typedef const InstrProfWriter::ProfilingData *const data_type_ref;
>>>
>>>    typedef uint64_t hash_value_type;
>>>    typedef uint64_t offset_type;
>>> @@ -44,9 +44,24 @@
>>>      offset_type N = K.size();
>>>      LE.write<offset_type>(N);
>>>
>>> -    offset_type M = 0;
>>> -    for (const auto &Counts : *V)
>>> -      M += (2 + Counts.second.size()) * sizeof(uint64_t);
>>> +    offset_type M = sizeof(uint64_t); // size of ProfileData
>>> +    for (const auto &ProfileData : *V) {
>>> +      M += sizeof(uint64_t); // size of ProfileData.first i.e. hash
>>> +      M += sizeof(uint32_t); // size of
>> ProfileData.second.Counts.size()
>>> +      M += ProfileData.second.Counts.size() * sizeof(uint64_t);
>>> +      for (uint32_t Kind = instrprof_value_kind::first;
>>> +           Kind < instrprof_value_kind::size; ++Kind) {
>>> +        M += sizeof(uint32_t); // size of
>> ProfileData.second.Values[Kind].size()
>>> +        for (InstrProfValueRecord I : ProfileData.second.Values[Kind])
>> {
>>> +          M += sizeof(uint32_t);  // size of I.Name
>>> +          M += I.Name.str().size() * sizeof(char);
>>> +          M += sizeof(uint64_t);  // size of I.Value
>>> +          M += sizeof(uint64_t);  // size of I.NumTaken
>>> +          M += sizeof(uint32_t);  // size of I.CounterIndex
>>> +        }
>>> +      }
>>> +    }
>>> +
>>>      LE.write<offset_type>(M);
>>>
>>>      return std::make_pair(N, M);
>>> @@ -61,51 +76,97 @@
>>>      using namespace llvm::support;
>>>      endian::Writer<little> LE(Out);
>>>
>>> -    for (const auto &Counts : *V) {
>>> -      LE.write<uint64_t>(Counts.first);
>>> -      LE.write<uint64_t>(Counts.second.size());
>>> -      for (uint64_t I : Counts.second)
>>> +    LE.write<uint64_t>(V->size());
>>> +    for (const auto &ProfileData : *V) {
>>> +      LE.write<uint64_t>(ProfileData.first);
>>> +      LE.write<uint32_t>(ProfileData.second.Counts.size());
>>> +      for (uint64_t I : ProfileData.second.Counts)
>>>          LE.write<uint64_t>(I);
>>> +      for (uint32_t Kind = instrprof_value_kind::first;
>>> +           Kind < instrprof_value_kind::size; ++Kind) {
>>> +        LE.write<uint32_t>(ProfileData.second.Values[Kind].size());
>>> +        for (InstrProfValueRecord I : ProfileData.second.Values[Kind])
>> {
>>> +          LE.write<uint32_t>(I.Name.str().size());
>>> +          Out.write(I.Name.c_str(), I.Name.str().size());
>>> +          LE.write<uint64_t>(I.Value);
>>> +          LE.write<uint64_t>(I.NumTaken);
>>> +          LE.write<uint32_t>(I.CounterIndex);
>>> +        }
>>> +      }
>>>      }
>>>    }
>>>  };
>>>  }
>>>
>>> -std::error_code
>>> -InstrProfWriter::addFunctionCounts(StringRef FunctionName,
>>> -                                   uint64_t FunctionHash,
>>> -                                   ArrayRef<uint64_t> Counters) {
>>> -  auto &CounterData = FunctionData[FunctionName];
>>> -
>>> -  auto Where = CounterData.find(FunctionHash);
>>> -  if (Where == CounterData.end()) {
>>> -    // We've never seen a function with this name and hash, add it.
>>> -    CounterData[FunctionHash] = Counters;
>>> -    // We keep track of the max function count as we go for simplicity.
>>> -    if (Counters[0] > MaxFunctionCount)
>>> -      MaxFunctionCount = Counters[0];
>>> -    return instrprof_error::success;
>>> -  }
>>> +static std::error_code combineInstrProfRecords(
>>> +  IndexedInstrProfRecord &Dest, const InstrProfRecord &Source,
>>> +  uint64_t &MaxFunctionCount) {
>>>
>>> -  // We're updating a function we've seen before.
>>> -  auto &FoundCounters = Where->second;
>>> -  // If the number of counters doesn't match we either have bad data or
>> a hash
>>> -  // collision.
>>> -  if (FoundCounters.size() != Counters.size())
>>> +  // If the number of counters doesn't match we either have bad data
>>> +  // or a hash collision.
>>> +  if (Dest.Counts.size() != Source.Counts.size())
>>>      return instrprof_error::count_mismatch;
>>>
>>> -  for (size_t I = 0, E = Counters.size(); I < E; ++I) {
>>> -    if (FoundCounters[I] + Counters[I] < FoundCounters[I])
>>> +  for (size_t I = 0, E = Source.Counts.size(); I < E; ++I) {
>>> +    if (Dest.Counts[I] + Source.Counts[I] < Dest.Counts[I])
>>>        return instrprof_error::counter_overflow;
>>> -    FoundCounters[I] += Counters[I];
>>> +    Dest.Counts[I] += Source.Counts[I];
>>> +  }
>>> +
>>> +  for (uint32_t Kind = instrprof_value_kind::first;
>>> +       Kind < instrprof_value_kind::size; ++Kind) {
>>> +    std::vector<InstrProfValueRecord>::const_iterator I =
>>> +      Source.Values[Kind].begin();
>>> +    for ( ; I != Source.Values[Kind].end(); ++I) {
>>> +      bool Found = false;
>>> +      std::vector<InstrProfValueRecord>::iterator J =
>> Dest.Values[Kind].begin();
>>> +      for ( ; J != Dest.Values[Kind].end(); ++J) {
>>> +        if (J->CounterIndex == I->CounterIndex) {
>>> +          if ((Kind == instrprof_value_kind::indirect_call_target) &&
>>> +              (0 == J->Name.compare(I->Name)))
>>> +            Found = true;
>>> +          else if (J->Value == I->Value)
>>> +            Found = true;
>>> +
>>> +          if (Found) {
>>> +            J->NumTaken += I->NumTaken;
>>> +            break;
>>> +          }
>>> +        }
>>> +      }
>>> +      if (!Found) Dest.Values[Kind].push_back(*I);
>>> +    }
>>>    }
>>> +
>>>    // We keep track of the max function count as we go for simplicity.
>>> -  if (FoundCounters[0] > MaxFunctionCount)
>>> -    MaxFunctionCount = FoundCounters[0];
>>> +  if (Dest.Counts[0] > MaxFunctionCount)
>>> +    MaxFunctionCount = Dest.Counts[0];
>>>
>>>    return instrprof_error::success;
>>>  }
>>>
>>> +std::error_code
>>> +InstrProfWriter::addRecord(const InstrProfRecord &I) {
>>> +  auto &ProfileDataMap = FunctionData[I.Name];
>>> +
>>> +  auto Where = ProfileDataMap.find(I.Hash);
>>> +  if (Where == ProfileDataMap.end()) {
>>> +    // We've never seen a function with this name and hash, add it.
>>> +    ProfileDataMap[I.Hash] = IndexedInstrProfRecord(I.Counts);
>>> +    for (uint32_t Kind = instrprof_value_kind::first;
>>> +         Kind < instrprof_value_kind::size; ++Kind)
>>> +      ProfileDataMap[I.Hash].Values[Kind] = I.Values[Kind];
>>> +
>>> +    // We keep track of the max function count as we go for simplicity.
>>> +    if (I.Counts[0] > MaxFunctionCount)
>>> +      MaxFunctionCount = I.Counts[0];
>>> +    return instrprof_error::success;
>>> +  }
>>> +
>>> +  // We're updating a function we've seen before.
>>> +  return combineInstrProfRecords(Where->second, I, MaxFunctionCount);
>>> +}
>>> +
>>>  std::pair<uint64_t, uint64_t> InstrProfWriter::writeImpl(raw_ostream
>> &OS) {
>>>    OnDiskChainedHashTableGenerator<InstrProfRecordTrait> Generator;
>>>
>>> Index: tools/llvm-profdata/llvm-profdata.cpp
>>> ===================================================================
>>> --- tools/llvm-profdata/llvm-profdata.cpp
>>> +++ tools/llvm-profdata/llvm-profdata.cpp
>>> @@ -59,8 +59,7 @@
>>>
>>>      auto Reader = std::move(ReaderOrErr.get());
>>>      for (const auto &I : *Reader)
>>> -      if (std::error_code EC =
>>> -              Writer.addFunctionCounts(I.Name, I.Hash, I.Counts))
>>> +      if (std::error_code EC = Writer.addRecord(I))
>>>          errs() << Filename << ": " << I.Name << ": " << EC.message() <<
>> "\n";
>>>      if (Reader->hasError())
>>>        exitWithError(Reader->getError().message(), Filename);
>>> Index: unittests/ProfileData/CoverageMappingTest.cpp
>>> ===================================================================
>>> --- unittests/ProfileData/CoverageMappingTest.cpp
>>> +++ unittests/ProfileData/CoverageMappingTest.cpp
>>> @@ -188,7 +188,8 @@
>>>  }
>>>
>>>  TEST_F(CoverageMappingTest, basic_coverage_iteration) {
>>> -  ProfileWriter.addFunctionCounts("func", 0x1234, {30, 20, 10, 0});
>>> +  InstrProfRecord Record("func", 0x1234, {30, 20, 10, 0});
>>> +  ProfileWriter.addRecord(Record);
>>>    readProfCounts();
>>>
>>>    addCMR(Counter::getCounter(0), "file1", 1, 1, 9, 9);
>>> @@ -238,7 +239,8 @@
>>>  }
>>>
>>>  TEST_F(CoverageMappingTest, combine_regions) {
>>> -  ProfileWriter.addFunctionCounts("func", 0x1234, {10, 20, 30});
>>> +  InstrProfRecord Record("func", 0x1234, {10, 20, 30});
>>> +  ProfileWriter.addRecord(Record);
>>>    readProfCounts();
>>>
>>>    addCMR(Counter::getCounter(0), "file1", 1, 1, 9, 9);
>>> @@ -256,7 +258,8 @@
>>>  }
>>>
>>>  TEST_F(CoverageMappingTest, dont_combine_expansions) {
>>> -  ProfileWriter.addFunctionCounts("func", 0x1234, {10, 20});
>>> +  InstrProfRecord Record("func", 0x1234, {10, 20});
>>> +  ProfileWriter.addRecord(Record);
>>>    readProfCounts();
>>>
>>>    addCMR(Counter::getCounter(0), "file1", 1, 1, 9, 9);
>>> @@ -275,7 +278,8 @@
>>>  }
>>>
>>>  TEST_F(CoverageMappingTest, strip_filename_prefix) {
>>> -  ProfileWriter.addFunctionCounts("file1:func", 0x1234, {10});
>>> +  InstrProfRecord Record("file1:func", 0x1234, {10});
>>> +  ProfileWriter.addRecord(Record);
>>>    readProfCounts();
>>>
>>>    addCMR(Counter::getCounter(0), "file1", 1, 1, 9, 9);
>>> Index: unittests/ProfileData/InstrProfTest.cpp
>>> ===================================================================
>>> --- unittests/ProfileData/InstrProfTest.cpp
>>> +++ unittests/ProfileData/InstrProfTest.cpp
>>> @@ -50,7 +50,8 @@
>>>  }
>>>
>>>  TEST_F(InstrProfTest, write_and_read_one_function) {
>>> -  Writer.addFunctionCounts("foo", 0x1234, {1, 2, 3, 4});
>>> +  InstrProfRecord Record("foo", 0x1234, {1, 2, 3, 4});
>>> +  Writer.addRecord(Record);
>>>    auto Profile = Writer.writeBuffer();
>>>    readProfile(std::move(Profile));
>>>
>>> @@ -67,13 +68,16 @@
>>>  }
>>>
>>>  TEST_F(InstrProfTest, get_function_counts) {
>>> -  Writer.addFunctionCounts("foo", 0x1234, {1, 2});
>>> -  Writer.addFunctionCounts("foo", 0x1235, {3, 4});
>>> +  InstrProfRecord Record1("foo", 0x1234, {1, 2});
>>> +  InstrProfRecord Record2("foo", 0x1235, {3, 4});
>>> +  Writer.addRecord(Record1);
>>> +  Writer.addRecord(Record2);
>>>    auto Profile = Writer.writeBuffer();
>>>    readProfile(std::move(Profile));
>>>
>>>    std::vector<uint64_t> Counts;
>>> -  ASSERT_TRUE(NoError(Reader->getFunctionCounts("foo", 0x1234,
>> Counts)));
>>> +  ASSERT_TRUE(NoError(
>>> +    Reader->getFunctionCounts("foo", 0x1234, Counts)));
>>>    ASSERT_EQ(2U, Counts.size());
>>>    ASSERT_EQ(1U, Counts[0]);
>>>    ASSERT_EQ(2U, Counts[1]);
>>> @@ -92,9 +96,12 @@
>>>  }
>>>
>>>  TEST_F(InstrProfTest, get_max_function_count) {
>>> -  Writer.addFunctionCounts("foo", 0x1234, {1ULL << 31, 2});
>>> -  Writer.addFunctionCounts("bar", 0, {1ULL << 63});
>>> -  Writer.addFunctionCounts("baz", 0x5678, {0, 0, 0, 0});
>>> +  InstrProfRecord Record1("foo", 0x1234, {1ULL << 31, 2});
>>> +  InstrProfRecord Record2("bar", 0, {1ULL << 63});
>>> +  InstrProfRecord Record3("baz", 0x5678, {0, 0, 0, 0});
>>> +  Writer.addRecord(Record1);
>>> +  Writer.addRecord(Record2);
>>> +  Writer.addRecord(Record3);
>>>    auto Profile = Writer.writeBuffer();
>>>    readProfile(std::move(Profile));
>>>
>>
>
>