[PATCH] D10674: Value profiling - patchset 3
Betul Buyukkurt via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 27 10:59:41 PDT 2015
>>>> DataBuffer.push_back(InstrProfRecord(K, Hash, std::move(CounterBuffer)));
>>>> +
>>>> + // Read value profiling data
>>>> + for (uint32_t Kind = instrprof_value_kind::first;
>>>> + Kind < instrprof_value_kind::size; ++Kind) {
>>
>>> I still think we should store the kind in the data here and then read
>> the following data based on that. There's no guarantee that all kinds
>> that
>> aren't indirect_call_target will behave as they do in this loop as is,
>> so
>> I'd rather just reject outright kinds we don't understand. If the type
>> is
>> in the data we can do that, and it also saves a few bytes for functions
>> that don't have any value profiling data (and more bytes when we support
>> more kinds of value profiling).
>>
>> I do not think this is possible. The format allows multiple hash values
>> and their data to be stored per name entry, and there is no indicator in
>> the format that tells whether the bytes following the counter values
>> belong to a value kind or to the hash value of the next data for the same
>> key.
>>
>
> Sorry I missed this one. This seems to be the last main thing to be
> addressed/resolved/agreed upon for the indexed format.
>
> I think Justin's suggestion is possible. We just need one mandatory
> field to be emitted before the value profile data.
> value_profile_section:
> .quad NUM_VALUE_KIND
>
> This field indicates how many value profile kinds have profile data.
> When value profile data are missing, this format has a storage
> advantage. For instance, when the function has no value profile data,
> the value profile section is just
>
> value_profile_section:
> .quad 0
>
> With Betul's patch, it will be (assuming 2 kinds)
>
> value_profile_section:
> .quad 0 // num of sites for kind 1 is 0
> .quad 0 // num of sites for kind 2 is 0
>
> However, if value profile data exists for all value kinds, Betul's
> suggested format is slightly better -- it does not need to store the
> mandatory NUM_VALUE_KIND field, nor the value kind field itself.
>
> Since I believe that value profile data will be sparse (especially
> when we add more kinds in the future), my vote is for Justin's format,
> which is more flexible.
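
To make the comparison concrete, a reader-side sketch of the kind-tagged
layout described above could look roughly like this. The NumValueKinds /
ValueKind / NumValueSites names are illustrative only, not part of the
current patch; the bounds checks mirror the ones in the patch, and the
snippet assumes the usual InstrProfReader.cpp includes and
using namespace llvm::support:

  // Hypothetical: parse one function's value profile section in the
  // kind-tagged layout; returns false on truncated or unknown data.
  static bool readValueProfSection(const unsigned char *&D,
                                   const unsigned char *End) {
    if (D + sizeof(uint64_t) > End)
      return false;
    // A function with no value profile data costs a single zero here.
    uint64_t NumValueKinds =
        endian::readNext<uint64_t, little, unaligned>(D);
    for (uint64_t I = 0; I < NumValueKinds; ++I) {
      if (D + 2 * sizeof(uint32_t) > End)
        return false;
      uint32_t ValueKind =
          endian::readNext<uint32_t, little, unaligned>(D);
      uint32_t NumValueSites =
          endian::readNext<uint32_t, little, unaligned>(D);
      if (ValueKind >= instrprof_value_kind::size)
        return false; // Reject kinds this reader does not understand.
      // ... read NumValueSites value site records for ValueKind,
      // exactly as the per-kind loop in the patch does ...
      (void)NumValueSites;
    }
    return true;
  }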
I'll make sure to address this item as well. I should respond in a day or so.
Thanks,
-Betul
>
> David
>
>
>
>>> This also has the benefit that the V2 data will be valid for V3, so we
>> might even be able to get away without adding branches to check the
>> version difference here, which would be a convenient simplification. I
>> haven't thought through if that actually works though.
>>
>> ReadData can still read V2, so I do not understand why it would not be valid.
>>
>>>> + // Read number of value sites.
>>>> + if (D + sizeof(uint32_t) > End)
>>>> + return data_type();
>>>> + uint32_t ValueSiteCount = endian::readNext<uint32_t, little, unaligned>(D);
>>>> +
>>>> + for (uint32_t VSite = 0; VSite < ValueSiteCount; ++VSite) {
>>>> + // Read number of value data pairs at value site.
>>>> + if (D + sizeof(uint32_t) > End)
>>>> + return data_type();
>>>> + uint32_t ValueDataCount = endian::readNext<uint32_t, little, unaligned>(D);
>>>> +
>>>> + InstrProfValueSiteRecord VSiteRecord;
>>>> + for (uint32_t VCount = 0; VCount < ValueDataCount; ++VCount) {
>>>> + if (D + 2 * sizeof(uint64_t) > End)
>>>> + return data_type();
>>>> + uint64_t Value = endian::readNext<uint64_t, little, unaligned>(D);
>>>> + if (Kind == instrprof_value_kind::indirect_call_target) {
>>>> + auto Result = HashKeyMap.find(Value);
>>>> + if (Result != HashKeyMap.end())
>>>> + Value = (uint64_t)Result->second;
>>
>>> What if it isn't found? As I mentioned earlier, I'm not very
>>> comfortable
>> with this approach to storing the strings.
>>
>> I'm not storing additional strings in the file. I'm reading the existing
>> set of keys, adding them to the string table, and creating a map from hash
>> values to const char* pointers into the string table.
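
For what it is worth, a minimal sketch of that lookup with the miss case
made explicit (remapTargetValue is a made-up helper name, not part of the
patch):

  // If the hash is known, the value becomes a pointer into the reader's
  // string table; otherwise the raw MD5 hash is kept and the caller can
  // decide whether to treat the record as malformed.
  static uint64_t remapTargetValue(
      uint64_t Value, const std::map<uint64_t, const char *> &HashKeyMap,
      bool &Found) {
    auto It = HashKeyMap.find(Value);
    Found = (It != HashKeyMap.end());
    return Found ? (uint64_t)(uintptr_t)It->second : Value;
  }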
>>
>>>> + }
>>>> + uint64_t NumTaken = endian::readNext<uint64_t, little, unaligned>(D);
>>>> + VSiteRecord.ValueData.push_back(std::make_pair(Value, NumTaken));
>>>> + }
>>>> +
>>>> + DataBuffer[DataBuffer.size()-1].ValueSites[Kind].push_back(VSiteRecord);
>> This is better written as DataBuffer.back(). It also might make sense to
>> sort these here rather than in mergeValueData - more on that later.
>>
>> Done.
>>
>>>> + }
>>>> + }
>>>> }
>>>> return DataBuffer;
>>>> }
>>>> @@ -379,19 +444,24 @@
>>>> if (HashType > IndexedInstrProf::HashT::Last)
>>>> return error(instrprof_error::unsupported_hash_type);
>>>> uint64_t HashOffset = endian::readNext<uint64_t, little, unaligned>(Cur);
>>>> -
>>>> // The rest of the file is an on disk hash table.
>>>> Index.reset(InstrProfReaderIndex::Create(
>>>> Start + HashOffset, Cur, Start,
>>>> - InstrProfLookupTrait(HashType, FormatVersion)));
>>>> + InstrProfLookupTrait(HashType, FormatVersion, HashKeyMap)));
>>>> + for (auto Key : Index->keys()) {
>>>> + const char* KeyTableRef = StringTable.insertString(Key);
>>>> + HashKeyMap.insert(std::make_pair(ComputeHash(HashType, Key), KeyTableRef));
>>>> + }
>>>> // Set up our iterator for readNextRecord.
>>>> RecordIterator = Index->data_begin();
>>>> return success();
>>>> }
>>>> std::error_code IndexedInstrProfReader::getFunctionCounts(
>>>> - StringRef FuncName, uint64_t FuncHash, std::vector<uint64_t> &Counts) {
>>>> + StringRef FuncName, uint64_t FuncHash,
>>>> + std::vector<uint64_t> &Counts) {
>>>> +
>>>> auto Iter = Index->find(FuncName);
>>>> if (Iter == Index->end())
>>>> return error(instrprof_error::unknown_function);
>>>> @@ -411,6 +481,29 @@
>>>> return error(instrprof_error::hash_mismatch);
>>>> }
>>>> +std::error_code IndexedInstrProfReader::getFunctionValuesForKind(
>>>> + StringRef FuncName, uint64_t FuncHash, uint32_t ValueKind,
>>>> + std::vector<InstrProfValueSiteRecord> &Values) {
>>
>>> Does this function work? Won't it return an index into a string table
>> with no means to turn that into the actual string for indirect call
>> profiling?
>>
>> ValueData for indirect call targets stores not indices but char* pointers
>> into string table entries. The function works for that reason, and I've
>> verified it locally on my end. However, I've removed this function since
>> it's not in use at this time.
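
Under that representation a consumer recovers the callee name along these
lines (illustrative helper, not part of the patch):

  // The stored uint64_t is really a const char* into the reader's string
  // table; StringSet entry keys are NUL-terminated, so this is safe.
  static StringRef getTargetName(uint64_t Value) {
    return StringRef(reinterpret_cast<const char *>((uintptr_t)Value));
  }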
>>
>>>> +
>>>> + auto Iter = Index->find(FuncName);
>>>> + if (Iter == Index->end())
>>>> + return error(instrprof_error::unknown_function);
>>>> +
>>>> + // Found it. Look for counters with the right hash.
>>>> + ArrayRef<InstrProfRecord> Data = (*Iter);
>>>> + if (Data.empty())
>>>> + return error(instrprof_error::malformed);
>>>> +
>>>> + for (unsigned I = 0, E = Data.size(); I < E; ++I) {
>>>> + // Check for a match and fill the vector if there is one.
>>>> + if (Data[I].Hash == FuncHash) {
>>>> + Values = Data[I].ValueSites[ValueKind];
>>>> + return success();
>>>> + }
>>>> + }
>>>> + return error(instrprof_error::hash_mismatch);
>>>> +}
>>>> +
>>>> std::error_code
>>>> IndexedInstrProfReader::readNextRecord(InstrProfRecord &Record) {
>>>> // Are we out of records?
>>>> Index: lib/ProfileData/InstrProfIndexed.h
>>>> ===================================================================
>>>> --- lib/ProfileData/InstrProfIndexed.h
>>>> +++ lib/ProfileData/InstrProfIndexed.h
>>>> @@ -47,7 +47,7 @@
>>>> }
>>>> const uint64_t Magic = 0x8169666f72706cff; // "\xfflprofi\x81"
>>>> -const uint64_t Version = 2;
>>>> +const uint64_t Version = 3;
>>>> const HashT HashType = HashT::MD5;
>>>> }
>>>> Index: lib/ProfileData/InstrProf.cpp
>>>> ===================================================================
>>>> --- lib/ProfileData/InstrProf.cpp
>>>> +++ lib/ProfileData/InstrProf.cpp
>>>> @@ -50,6 +50,8 @@
>>>> return "Function count mismatch";
>>>> case instrprof_error::counter_overflow:
>>>> return "Counter overflow";
>>>> + case instrprof_error::value_site_count_mismatch:
>>>> + return "Function's value site counts mismatch";
>>>> }
>>>> llvm_unreachable("A value of instrprof_error has no message.");
>>>> }
>>>> Index: include/llvm/ProfileData/InstrProfWriter.h
>>>> ===================================================================
>>>> --- include/llvm/ProfileData/InstrProfWriter.h
>>>> +++ include/llvm/ProfileData/InstrProfWriter.h
>>>> @@ -15,33 +15,30 @@
>>>> #ifndef LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>>> #define LLVM_PROFILEDATA_INSTRPROFWRITER_H
>>>> -#include "llvm/ADT/ArrayRef.h"
>>>> #include "llvm/ADT/DenseMap.h"
>>>> #include "llvm/ADT/StringMap.h"
>>>> #include "llvm/ProfileData/InstrProf.h"
>>>> #include "llvm/Support/DataTypes.h"
>>>> #include "llvm/Support/MemoryBuffer.h"
>>>> #include "llvm/Support/raw_ostream.h"
>>>> -#include <vector>
>>>> namespace llvm {
>>>> /// Writer for instrumentation based profile data.
>>>> class InstrProfWriter {
>>>> public:
>>>> - typedef SmallDenseMap<uint64_t, std::vector<uint64_t>, 1> CounterData;
>>>> + typedef SmallDenseMap<uint64_t, InstrProfRecord, 1> ProfilingData;
>>>> private:
>>>> - StringMap<CounterData> FunctionData;
>>>> + InstrProfStringTable StringTable;
>>>> + StringMap<ProfilingData> FunctionData;
>>>> uint64_t MaxFunctionCount;
>>>> public:
>>>> InstrProfWriter() : MaxFunctionCount(0) {}
>>>> /// Add function counts for the given function. If there are already counts
>>>> /// for this function and the hash and number of counts match, each counter is
>>>> /// summed.
>>>> - std::error_code addFunctionCounts(StringRef FunctionName,
>>>> - uint64_t FunctionHash,
>>>> - ArrayRef<uint64_t> Counters);
>>>> + std::error_code addRecord(InstrProfRecord &I);
>>
>>> I'm not a fan of dropping const here, but I think it can come back if
>>> we
>> move the updateStringTable method anyway.
>>
>> Same here. But I couldn't find a way to get it to work otherwise.
>>
>>>> /// Write the profile to \c OS
>>>> void write(raw_fd_ostream &OS);
>>>> /// Write the profile, returning the raw data. For testing.
>>>> Index: include/llvm/ProfileData/InstrProfReader.h
>>>> ===================================================================
>>>> --- include/llvm/ProfileData/InstrProfReader.h
>>>> +++ include/llvm/ProfileData/InstrProfReader.h
>>>> @@ -24,6 +24,7 @@
>>>> #include "llvm/Support/MemoryBuffer.h"
>>>> #include "llvm/Support/OnDiskHashTable.h"
>>>> #include <iterator>
>>>> +#include <map>
>>>> namespace llvm {
>>>> @@ -145,7 +146,6 @@
>>>> const uint64_t CountersDelta;
>>>> const uint64_t NamesDelta;
>>>> };
>>>> -
>>>> bool ShouldSwapBytes;
>>>> uint64_t CountersDelta;
>>>> uint64_t NamesDelta;
>>>> @@ -195,10 +195,13 @@
>>>> std::vector<InstrProfRecord> DataBuffer;
>>>> IndexedInstrProf::HashT HashType;
>>>> unsigned FormatVersion;
>>>> + const std::map<uint64_t, const char*> &HashKeyMap;
>>>> public:
>>>> - InstrProfLookupTrait(IndexedInstrProf::HashT HashType, unsigned FormatVersion)
>>>> - : HashType(HashType), FormatVersion(FormatVersion) {}
>>>> + InstrProfLookupTrait(IndexedInstrProf::HashT HashType,
>>>> + unsigned FormatVersion, std::map<uint64_t, const char*> &HashKeyMap)
>>>> + : HashType(HashType), FormatVersion(FormatVersion),
>>>> + HashKeyMap(HashKeyMap) {}
>>>> typedef ArrayRef<InstrProfRecord> data_type;
>>>> @@ -209,6 +212,7 @@
>>>> static bool EqualKey(StringRef A, StringRef B) { return A == B; }
>>>> static StringRef GetInternalKey(StringRef K) { return K; }
>>>> + static StringRef GetExternalKey(StringRef K) { return K; }
>>>> hash_value_type ComputeHash(StringRef K);
>>>> @@ -224,7 +228,17 @@
>>>> return StringRef((const char *)D, N);
>>>> }
>>>> - data_type ReadData(StringRef K, const unsigned char *D, offset_type N);
>>>> + data_type ReadDataV1V2(StringRef K, const unsigned char *D, offset_type N);
>>>> + data_type ReadDataV3(StringRef K, const unsigned char *D, offset_type N);
>>>> +
>>>> + data_type ReadData(StringRef K, const unsigned char *D, offset_type N) {
>>>> + switch (FormatVersion) {
>>>> + case 1:
>>>> + case 2: return ReadDataV1V2(K, D, N);
>>>> + case 3: return ReadDataV3(K, D, N);
>>>> + }
>>>> + return data_type();
>>>> + }
>>>> };
>>>> typedef OnDiskIterableChainedHashTable<InstrProfLookupTrait>
>>>> @@ -243,6 +257,10 @@
>>>> uint64_t FormatVersion;
>>>> /// The maximal execution count among all functions.
>>>> uint64_t MaxFunctionCount;
>>>> + /// String table.
>>>> + InstrProfStringTable StringTable;
>>>> + /// Map of hash values to const char* keys in profiling data.
>>>> + std::map<uint64_t, const char*> HashKeyMap;
>>>> IndexedInstrProfReader(const IndexedInstrProfReader &) = delete;
>>>> IndexedInstrProfReader &operator=(const IndexedInstrProfReader &) = delete;
>>>> @@ -261,6 +279,13 @@
>>>> /// Fill Counts with the profile data for the given function name.
>>>> std::error_code getFunctionCounts(StringRef FuncName, uint64_t FuncHash,
>>>> std::vector<uint64_t> &Counts);
>>>> +
>>>> + /// Return value profile data for the given function name and hash and
>>>> + /// value profiling kind
>>>> + std::error_code getFunctionValuesForKind(StringRef FuncName,
>>>> + uint64_t FuncHash, uint32_t ValueKind,
>>>> + std::vector<InstrProfValueSiteRecord> &Values);
>>>> +
>>>> /// Return the maximum of all known function counts.
>>>> uint64_t getMaximumFunctionCount() { return MaxFunctionCount; }
>>>> Index: include/llvm/ProfileData/InstrProf.h
>>>> ===================================================================
>>>> --- include/llvm/ProfileData/InstrProf.h
>>>> +++ include/llvm/ProfileData/InstrProf.h
>>>> @@ -16,10 +16,11 @@
>>>> #ifndef LLVM_PROFILEDATA_INSTRPROF_H_
>>>> #define LLVM_PROFILEDATA_INSTRPROF_H_
>>>> -#include "llvm/ADT/StringRef.h"
>>>> +#include "llvm/ADT/StringSet.h"
>>>> #include <cstdint>
>>>> -#include <system_error>
>>>> +#include <list>
>>>> #include <vector>
>>>> +#include <system_error>
>>>> namespace llvm {
>>>> const std::error_category &instrprof_category();
>>>> @@ -37,21 +38,99 @@
>>>> unknown_function,
>>>> hash_mismatch,
>>>> count_mismatch,
>>>> - counter_overflow
>>>> + counter_overflow,
>>>> + value_site_count_mismatch
>>>> };
>>>> inline std::error_code make_error_code(instrprof_error E) {
>>>> return std::error_code(static_cast<int>(E), instrprof_category());
>>>> }
>>>> +enum instrprof_value_kind : uint32_t {
>>>> + first = 0,
>>>> + indirect_call_target = 0,
>>>> + size = 1
>>>> +};
>>>> +
>>>> +struct InstrProfStringTable {
>>>> + // Set of string values in profiling data.
>>>> + StringSet<> StringValueSet;
>>>> + InstrProfStringTable() { StringValueSet.clear(); }
>>>> + // Get a pointer to internal storage of a string in set
>>>> + const char* getStringData(StringRef Str) {
>>>> + auto Result = StringValueSet.find(Str);
>>>> + return (Result == StringValueSet.end()) ? nullptr : Result->first().data();
>>>> + }
>>>> + // Insert a string to StringTable
>>>> + const char* insertString(StringRef Str) {
>>>> + auto Result = StringValueSet.insert(Str);
>>>> + return Result.first->first().data();
>>>> + }
>>>> +};
>>> Is this type giving us any value over just using the StringSet<>
>> directly?
>>
>> It returns const char* pointers, which is what's stored in the value data
>> entries for indirect call target data, so I found wrapping it like this
>> very useful.
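
For comparison, the same two helpers written directly against a StringSet<>
would be roughly as follows (sketch only; entry keys are stable for the
lifetime of the set, which is what makes handing out const char* safe):

  const char *insertString(StringSet<> &Set, StringRef Str) {
    return Set.insert(Str).first->first().data();
  }
  const char *getStringData(const StringSet<> &Set, StringRef Str) {
    auto It = Set.find(Str);
    return It == Set.end() ? nullptr : It->first().data();
  }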
>>
>>>> +
>>>> +struct InstrProfValueSiteRecord {
>>>> + // Typedef for a single TargetValue-NumTaken pair.
>>>> + typedef std::pair<uint64_t, uint64_t> ValueDataPair;
>>>> + // Value profiling data pairs at a given value site.
>>>> + std::list<ValueDataPair> ValueData;
>>>> +
>>>> + InstrProfValueSiteRecord() { ValueData.clear(); }
>>>> + InstrProfValueSiteRecord(const InstrProfValueSiteRecord &Rec)
>>>> + : ValueData(std::move(Rec.ValueData)) {}
>>>> +
>>>> + // Sort ValueData ascending by TargetValue
>>>> + void sortByTargetValues() {
>>>> + ValueData.sort([](const ValueDataPair &left, const ValueDataPair &right)
>>>> + { return left.first < right.first; });
>>> clang-format and follow LLVM convention for variable names please.
>>
>> Done.
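
For reference, the clang-formatted comparator with LLVM-style names would
presumably end up something like:

  // Sort ValueData ascending by TargetValue.
  void sortByTargetValues() {
    ValueData.sort([](const ValueDataPair &LHS, const ValueDataPair &RHS) {
      return LHS.first < RHS.first;
    });
  }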
>>
>>>> + }
>>>> + // Sort ValueData descending by NumTaken
>>>> + void sortByNumTaken() {
>>>> + ValueData.sort([](const ValueDataPair &left, const ValueDataPair &right)
>>>> + { return left.second > right.second; });
>>>> + }
>>> This seems to be unused.
>>
>> Removed and will be re-introduced w/ the rest of the patches.
>>
>>>> + // Merge data from another InstrProfValueSiteRecord
>>>> + void mergeValueData(InstrProfValueSiteRecord &Input) {
>>>> + this->sortByTargetValues();
>>>> + Input.sortByTargetValues();
>>> It seems wasteful to re-sort these all the time. Would it make sense to
>> sort at read time, or possibly even canonicalize the on-disk order to
>> already be sorted?
>>
>> I left it as is for now. Once the API is provided, one cannot guarantee
>> how it will be used in the future, so sorting here guarantees that future
>> uses of the API will not break this merge implementation.
>>
>>> I guess it might make sense for the on-disk order to be sorted by
>> num-taken if that's what the frontend will want, and since merging is
>> less
>> common the sort cost is okay. In any case, we should probably set it up
>> so
>> that the on disk format is always sorted in a particular way - that will
>> remove some non-determinism and we can use it to speed up whichever
>> operation we think is more important.
>>
>> I'll make sure the on disk format is always sorted.
>>
>>>> + auto I = ValueData.begin();
>>>> + auto J = Input.ValueData.begin();
>>>> + while (J != Input.ValueData.end()) {
>>>> + while (I != ValueData.end() && I->first < J->first)
>>>> + ++I;
>>>> + if (I != ValueData.end() && I->first == J->first) {
>>>> + I->second += J->second;
>>>> + ++I;
>>>> + ++J;
>>>> + continue;
>>>> + }
>>>> + ValueData.insert(I, *J);
>>>> + ++J;
>>>> + }
>>> I suspect a for-loop would read better here. Also please store the
>> ".end()" values in variables rather than re-evaluating.
>>
>> Done.
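
A sketch of how the merge might read as a for-loop with the end iterators
cached (as a member of InstrProfValueSiteRecord, per the struct above; not
necessarily the exact code in the updated patch):

  void mergeValueData(InstrProfValueSiteRecord &Input) {
    this->sortByTargetValues();
    Input.sortByTargetValues();
    auto I = ValueData.begin();
    auto IE = ValueData.end();
    for (auto J = Input.ValueData.begin(), JE = Input.ValueData.end();
         J != JE; ++J) {
      while (I != IE && I->first < J->first)
        ++I;
      if (I != IE && I->first == J->first) {
        I->second += J->second;
        ++I;
        continue;
      }
      // Target not present in this record: insert before I (std::list
      // insertion does not invalidate I or IE).
      ValueData.insert(I, *J);
    }
  }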
>>
>>>> + }
>>>> +};
>>>> +
>>>> /// Profiling information for a single function.
>>>> struct InstrProfRecord {
>>>> InstrProfRecord() {}
>>>> InstrProfRecord(StringRef Name, uint64_t Hash, std::vector<uint64_t> Counts)
>>>> : Name(Name), Hash(Hash), Counts(std::move(Counts)) {}
>>>> StringRef Name;
>>>> uint64_t Hash;
>>>> std::vector<uint64_t> Counts;
>>>> + // Size of vector indicates the number of value sites for a value kind
>>>> + std::vector<InstrProfValueSiteRecord> ValueSites[instrprof_value_kind::size];
>>> I don't think we're gaining much by having this be an array - I was
>> thinking it would be more like
>>> std::vector<InstrProfValueSiteRecord> IndirectCalls;
>>> Then when we add more value types, they can have their own variables
>>> and
>> be accessed directly. Most of the code that works with these will have a
>> particular kind in mind, and since the value data is dependent on kind
>> looping over these isn't generally that useful. That is, the looping we
>> have now is only in the reader and writer, and I can't see the users of
>> the data ever doing that.
>>
>> I think I'm leaning towards keeping an array of kinds here.
>>
>>> For the reader and writer, a switch statement over the kinds will allow
>> us to warn if someone doesn't update somewhere when they add a new kind.
>> For the users of profile data, Data->IndirectCalls reads a lot better
>> than
>> Data->ValueSites[instrprof_value_kind::indirect_call_target].
>>
>> I'll revisit my latest patch to allow for that.
>>
>>> It really seems like the right trade off to me.
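
A sketch of that direction -- one named member per kind plus a switch that
fails loudly on kinds the code does not know about (IndirectCalls and the
accessor name are illustrative, following the suggestion above):

  std::vector<InstrProfValueSiteRecord> &
  getValueSitesForKind(InstrProfRecord &Record, uint32_t ValueKind) {
    switch (ValueKind) {
    case instrprof_value_kind::indirect_call_target:
      return Record.IndirectCalls;
    }
    llvm_unreachable("unknown value profile kind");
  }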
>>>> + // Clear value data entries
>>>> + void clearValueData() {
>>>> + for (uint32_t Kind = instrprof_value_kind::first;
>>>> + Kind < instrprof_value_kind::size; ++Kind)
>>>> + ValueSites[Kind].clear();
>>>> + }
>>
>>> This is never called; it can be removed.
>>
>> Done.
>>
>>>> + void updateStringTableReferences(InstrProfStringTable &StrTable) {
>>>> +
>>>> + Name = StrTable.insertString(Name);
>>>> + for (auto VSite : ValueSites[instrprof_value_kind::indirect_call_target])
>>>> + for (auto VData : VSite.ValueData)
>>>> + VData.first = (uint64_t)StrTable.insertString((const char *)VData.first);
>>>> + }
>>>> };
>>>> } // end namespace llvm
>>