[PATCH] D147740: [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map

Snehasish Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 9 16:06:19 PDT 2023


snehasish added inline comments.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:324
+  uint64_t Hash;
+  if (F.getAsInteger(10, Hash))
+    Hash = MD5Hash(F);
----------------
Is there a chance that the base 10 encoding may change? Is this the only place where we generate the hashes?


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:1288-1289
+  // rather than returning a reference to an irrelevant entry, and pretend the
+  // old entry never existed. Dropping a sample should not affect compilation
+  // correctness.
+  template <typename... Ts>
----------------
Can this lead to non-deterministic builds?


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:1351
+  // Overloaded find() to lookup a function by name. This is called by IPO
+  // passes with an actual fucntion name, and it is possible that the profile
+  // reader converted function names in the profile to MD5 strings, so we need
----------------
typo "function"


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:526
 
-ErrorOr<StringRef> SampleProfileReaderBinary::readStringFromTable() {
+ErrorOr<StringRef> SampleProfileReaderBinary::readStringFromTable(size_t *Ret) {
   auto Idx = readStringIndex(NameTable);
----------------
Can we use a more descriptive name for the output parameters (here and elsewhere)? 


================
Comment at: llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp:84
+
+    /// Samples
+    /// String1:10:1
----------------
Can we use the text format (with some additional helper functions) here instead of the binary data? It would be hard to update in case of changes in the future.


================
Comment at: llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp:116
+  std::vector<StringRef> &NameTable = *Reader->getNameTable();
+  EXPECT_EQ(NameTable.size(), 2);
+  StringRef S1 = NameTable[0];
----------------
This should be assert since if it doesn't hold the following lines which deref NameTable[0] and NameTable[1] will segfault.


================
Comment at: llvm/unittests/tools/llvm-profdata/MD5CollisionTest.cpp:140
+  // Inserting S2 again should fail, returning the existing sample unchanged.
+  auto Ret = Profiles.try_emplace(S2, FunctionSamples());
+  EXPECT_FALSE(Ret.second);
----------------
Capture with structured binding here with clearer variable names to make it easier to read?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147740/new/

https://reviews.llvm.org/D147740



More information about the llvm-commits mailing list