[PATCH] D10674: Value profiling - patchset 3
Xinliang David Li via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 11 00:29:37 PDT 2015
Regarding the StringTable usage.
1) Reader: I am not sure why you need the stringTable/StringSet for
the reader in the first place. For HashKeyMap, can you directly map
the hash val to the key string without doing the converting (by
inserting the key to the stringtab)? Is it used for string
merging/uniquing? Or the client of the ProfileReader needs to traverse
the set of strings some how?
2) Writer: the string reference update is needed because the string
memory in the reader will be reclaimed. However there might be a
better way to do so:
With this:
auto &ProfileDataMap = FunctionData[I.Name];
I.Name string gets duplicated in the StringMap as the new key. In
other words, it is possible to do the update the string refs in
InstrProfRecords (to point to the keys in stringmap) after all records
for one reader are added.
However I think we should save the trouble -- see below.
>>
>> The way the HashKeyMap works here seems a little off. Why does the
>> Reader own it? Only the trait uses it. Also, since it follows a strict
>> "fill then query" pattern, it's probably better to just use a vector
>> that we sort after filling and then binary search later.
>
> OnDiskIterableChainedHashTable is a member of the reader class. So the
> reader may conveniently iterate over its keys to collect the strings and
> form the hash table accordingly. On the other hand the LookUpTrait class
> seemed to me like a class helping after the appropriate key is located.
> Also, the Info helper class is described as "reading values from the hash
> table's payload and computes the hash for a given key".
Please refer to the thread about reducing profile data size. With that
implemented, the strings in the profile data will be gone. In other
words, the value profiler (indirect call) is responsible to
re-establish the hash to string mapping, not the Indexed profiler
reader itself. This means all the translation code in this patch
(from hash --> string in the reader, and string-->hash in the writer)
should be removed -- there will be no special handling of icall value
profile kind either.
David
>
>>
>>> + // Emit warning if a hash collision is detected.
>>> + if (Result.second == false)
>>> + DEBUG(dbgs() << "IndexedInstrProfReader: hash collision detected:
>> \n"
>>> + << "\t Map Entry(Hash, Key): " <<
>> Result.first->first
>>> + << ", " << Result.first->second << "\n"
>>> + << "\t New Entry(Hash, Key): " <<
>> ComputeHash(HashType, Key)
>>> + << ", " << Key << "\n");
>>
>> This is not "emitting a warning". This will only be printed if the host
>> compiler is built in debug mode, so it seems pretty pointless. Actually
>> emitting a proper warning from this point in the compiler might be kind
>> of tricky though.
>>
>>> + }
>>> // Set up our iterator for readNextRecord.
>>> RecordIterator = Index->data_begin();
>>>
>>> Index: lib/ProfileData/InstrProfIndexed.h
>>> ===================================================================
>>> --- lib/ProfileData/InstrProfIndexed.h
>>> +++ lib/ProfileData/InstrProfIndexed.h
>>> @@ -47,7 +47,7 @@
>>> }
>>>
>>> const uint64_t Magic = 0x8169666f72706cff; // "\xfflprofi\x81"
>>> -const uint64_t Version = 2;
>>> +const uint64_t Version = 3;
>>> const HashT HashType = HashT::MD5;
>>> }
>>>
>>> Index: lib/ProfileData/InstrProf.cpp
>>> ===================================================================
>>> --- lib/ProfileData/InstrProf.cpp
>>> +++ lib/ProfileData/InstrProf.cpp
>>> @@ -50,6 +50,8 @@
>>> return "Function count mismatch";
>>> case instrprof_error::counter_overflow:
>>> return "Counter overflow";
>>> + case instrprof_error::value_site_count_mismatch:
>>> + return "Function's value site counts mismatch";
>>> }
>>> llvm_unreachable("A value of instrprof_error has no message.");
>>> }
>>> Index: include/llvm/ProfileData/InstrProfWriter.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProfWriter.h
>>> +++ include/llvm/ProfileData/InstrProfWriter.h
>>> @@ -15,33 +15,32 @@
>>> #ifndef LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>> #define LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>>
>>> -#include "llvm/ADT/ArrayRef.h"
>>> #include "llvm/ADT/DenseMap.h"
>>> -#include "llvm/ADT/StringMap.h"
>>> #include "llvm/ProfileData/InstrProf.h"
>>> #include "llvm/Support/DataTypes.h"
>>> #include "llvm/Support/MemoryBuffer.h"
>>> #include "llvm/Support/raw_ostream.h"
>>> -#include <vector>
>>>
>>> namespace llvm {
>>>
>>> /// Writer for instrumentation based profile data.
>>> class InstrProfWriter {
>>> public:
>>> - typedef SmallDenseMap<uint64_t, std::vector<uint64_t>, 1>
>> CounterData;
>>> + typedef SmallDenseMap<uint64_t, InstrProfRecord, 1> ProfilingData;
>>> +
>>> private:
>>> - StringMap<CounterData> FunctionData;
>>> + InstrProfStringTable StringTable;
>>> + StringMap<ProfilingData> FunctionData;
>>> uint64_t MaxFunctionCount;
>>> public:
>>> InstrProfWriter() : MaxFunctionCount(0) {}
>>>
>>> + /// Update string entries in profile data with references to
>> StringTable.
>>> + void updateStringTableReferences(InstrProfRecord &I);
>>> /// Add function counts for the given function. If there are already
>> counts
>>> /// for this function and the hash and number of counts match, each
>> counter is
>>> /// summed.
>>> - std::error_code addFunctionCounts(StringRef FunctionName,
>>> - uint64_t FunctionHash,
>>> - ArrayRef<uint64_t> Counters);
>>> + std::error_code addRecord(InstrProfRecord &&I);
>>> /// Write the profile to \c OS
>>> void write(raw_fd_ostream &OS);
>>> /// Write the profile, returning the raw data. For testing.
>>> Index: include/llvm/ProfileData/InstrProfReader.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProfReader.h
>>> +++ include/llvm/ProfileData/InstrProfReader.h
>>> @@ -24,6 +24,7 @@
>>> #include "llvm/Support/MemoryBuffer.h"
>>> #include "llvm/Support/OnDiskHashTable.h"
>>> #include <iterator>
>>> +#include <map>
>>>
>>> namespace llvm {
>>>
>>> @@ -65,6 +66,9 @@
>>> InstrProfIterator end() { return InstrProfIterator(); }
>>>
>>> protected:
>>> + /// String table for holding a unique copy of all the strings in the
>> profile.
>>> + InstrProfStringTable StringTable;
>>> +
>>> /// Set the current std::error_code and return same.
>>> std::error_code error(std::error_code EC) {
>>> LastError = EC;
>>> @@ -195,10 +199,13 @@
>>> std::vector<InstrProfRecord> DataBuffer;
>>> IndexedInstrProf::HashT HashType;
>>> unsigned FormatVersion;
>>> + const std::map<uint64_t, const char *> &HashKeyMap;
>>>
>>> public:
>>> - InstrProfLookupTrait(IndexedInstrProf::HashT HashType, unsigned
>> FormatVersion)
>>> - : HashType(HashType), FormatVersion(FormatVersion) {}
>>> + InstrProfLookupTrait(IndexedInstrProf::HashT HashType, unsigned
>> FormatVersion,
>>> + std::map<uint64_t, const char *> &HashKeyMap)
>>> + : HashType(HashType), FormatVersion(FormatVersion),
>>> + HashKeyMap(HashKeyMap) {}
>>>
>>> typedef ArrayRef<InstrProfRecord> data_type;
>>>
>>> @@ -209,6 +216,7 @@
>>>
>>> static bool EqualKey(StringRef A, StringRef B) { return A == B; }
>>> static StringRef GetInternalKey(StringRef K) { return K; }
>>> + static StringRef GetExternalKey(StringRef K) { return K; }
>>
>> What do you need GetExternalKey for?
>>
>>>
>>> hash_value_type ComputeHash(StringRef K);
>>>
>>> @@ -224,6 +232,8 @@
>>> return StringRef((const char *)D, N);
>>> }
>>>
>>> + bool ReadValueProfilingData(const unsigned char *&D,
>>> + const unsigned char *const End);
>>> data_type ReadData(StringRef K, const unsigned char *D, offset_type
>> N);
>>> };
>>>
>>> @@ -243,6 +253,8 @@
>>> uint64_t FormatVersion;
>>> /// The maximal execution count among all functions.
>>> uint64_t MaxFunctionCount;
>>> + /// Map of hash values to const char* keys in profiling data.
>>> + std::map<uint64_t, const char *> HashKeyMap;
>>>
>>> IndexedInstrProfReader(const IndexedInstrProfReader &) = delete;
>>> IndexedInstrProfReader &operator=(const IndexedInstrProfReader &) =
>> delete;
>>> Index: include/llvm/ProfileData/InstrProf.h
>>> ===================================================================
>>> --- include/llvm/ProfileData/InstrProf.h
>>> +++ include/llvm/ProfileData/InstrProf.h
>>> @@ -16,42 +16,102 @@
>>> #ifndef LLVM_PROFILEDATA_INSTRPROF_H_
>>> #define LLVM_PROFILEDATA_INSTRPROF_H_
>>>
>>> -#include "llvm/ADT/StringRef.h"
>>> +#include "llvm/ADT/StringSet.h"
>>> #include <cstdint>
>>> +#include <list>
>>> #include <system_error>
>>> #include <vector>
>>>
>>> namespace llvm {
>>> const std::error_category &instrprof_category();
>>>
>>> enum class instrprof_error {
>>> - success = 0,
>>> - eof,
>>> - bad_magic,
>>> - bad_header,
>>> - unsupported_version,
>>> - unsupported_hash_type,
>>> - too_large,
>>> - truncated,
>>> - malformed,
>>> - unknown_function,
>>> - hash_mismatch,
>>> - count_mismatch,
>>> - counter_overflow
>>> + success = 0,
>>> + eof,
>>> + bad_magic,
>>> + bad_header,
>>> + unsupported_version,
>>> + unsupported_hash_type,
>>> + too_large,
>>> + truncated,
>>> + malformed,
>>> + unknown_function,
>>> + hash_mismatch,
>>> + count_mismatch,
>>> + counter_overflow,
>>> + value_site_count_mismatch
>>> };
>>>
>>> inline std::error_code make_error_code(instrprof_error E) {
>>> return std::error_code(static_cast<int>(E), instrprof_category());
>>> }
>>>
>>> +enum instrprof_value_kind : uint32_t {
>>> + first = 0,
>>> + indirect_call_target = 0,
>>> + size = 1
>>> +};
>>> +
>>> +struct InstrProfStringTable {
>>> + // Set of string values in profiling data.
>>> + StringSet<> StringValueSet;
>>> + InstrProfStringTable() { StringValueSet.clear(); }
>>> + // Get a pointer to internal storage of a string in set
>>> + const char *getStringData(StringRef Str) {
>>> + auto Result = StringValueSet.find(Str);
>>> + return (Result == StringValueSet.end()) ? nullptr :
>> Result->first().data();
>>> + }
>>> + // Insert a string to StringTable
>>> + const char *insertString(StringRef Str) {
>>> + auto Result = StringValueSet.insert(Str);
>>> + return Result.first->first().data();
>>> + }
>>> +};
>>> +
>>> +struct InstrProfValueSiteRecord {
>>> + // Typedef for a single TargetValue-NumTaken pair.
>>> + typedef std::pair<uint64_t, uint64_t> ValueDataPair;
>>> + // Value profiling data pairs at a given value site.
>>> + std::list<ValueDataPair> ValueData;
>>> +
>>> + InstrProfValueSiteRecord() { ValueData.clear(); }
>>> +
>>> + // Sort ValueData ascending by TargetValue
>>> + void sortByTargetValues() {
>>> + ValueData.sort([](const ValueDataPair &left, const ValueDataPair
>> &right) {
>>> + return left.first < right.first;
>>> + });
>>> + }
>>> + // Merge data from another InstrProfValueSiteRecord
>>> + void mergeValueData(InstrProfValueSiteRecord &Input) {
>>> + this->sortByTargetValues();
>>> + Input.sortByTargetValues();
>>> + auto I = ValueData.begin();
>>> + auto IE = ValueData.end();
>>> + for (auto J = Input.ValueData.begin(), JE = Input.ValueData.end();
>> J != JE;
>>> + ++J) {
>>> + while (I != IE && I->first < J->first)
>>> + ++I;
>>> + if (I != IE && I->first == J->first) {
>>> + I->second += J->second;
>>> + ++I;
>>> + continue;
>>> + }
>>> + ValueData.insert(I, *J);
>>> + }
>>> + }
>>> +};
>>> +
>>> /// Profiling information for a single function.
>>> struct InstrProfRecord {
>>> InstrProfRecord() {}
>>> InstrProfRecord(StringRef Name, uint64_t Hash, std::vector<uint64_t>
>> Counts)
>>> : Name(Name), Hash(Hash), Counts(std::move(Counts)) {}
>>> StringRef Name;
>>> uint64_t Hash;
>>> std::vector<uint64_t> Counts;
>>> + // Size of vector indicates the number of value sites for a value
>> kind
>>> + std::vector<InstrProfValueSiteRecord>
>> ValueSites[instrprof_value_kind::size];
>>> };
>>>
>>> } // end namespace llvm
>>
>>
>
>
More information about the llvm-commits
mailing list