[PATCH] D116051: [InstrProf] Prevent duplicate functions in correlated data

Ellis Hoag via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 20 12:17:02 PST 2021


ellis created this revision.
Herald added a subscriber: hiraditya.
ellis edited the summary of this revision.
ellis added reviewers: kyulee, MaskRay.
ellis published this revision for review.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

When using debug info for profile correlation, avoid adding duplicate
functions in the synthetic Data section.

Before this patch, n duplicate function entries in the Data section would
cause counter values to be a factor of n larger. I built instrumented
clang with and without debug info correlation and got these summaries.

  # With Debug Info Correlate
  $ llvm-profdata show default.profdata
  Instrumentation level: IR  entry_first = 0
  Total functions: 182530
  Maximum function count: 52034
  Maximum internal block count: 5763
  
  # Without
  $ llvm-profdata show default.profdata
  Instrumentation level: IR  entry_first = 0
  Total functions: 183212
  Maximum function count: 52034
  Maximum internal block count: 5766

The slight difference in counts seem to be mostly from FileSystem and
Map functions and the difference in the number of instrumented functions
seems to come from missing debug info like destructors without source.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D116051

Files:
  llvm/include/llvm/ProfileData/InstrProfCorrelator.h
  llvm/lib/ProfileData/InstrProfCorrelator.cpp


Index: llvm/lib/ProfileData/InstrProfCorrelator.cpp
===================================================================
--- llvm/lib/ProfileData/InstrProfCorrelator.cpp
+++ llvm/lib/ProfileData/InstrProfCorrelator.cpp
@@ -127,8 +127,9 @@
 Error InstrProfCorrelatorImpl<IntPtrT>::correlateProfileData() {
   assert(Data.empty() && CompressedNames.empty() && Names.empty());
   correlateProfileDataImpl();
-  auto Result =
-      collectPGOFuncNameStrings(Names, /*doCompression=*/true, CompressedNames);
+  std::vector<std::string> NamesVec(Names.keys().begin(), Names.keys().end());
+  auto Result = collectPGOFuncNameStrings(NamesVec, /*doCompression=*/true,
+                                          CompressedNames);
   Names.clear();
   return Result;
 }
@@ -139,6 +140,7 @@
                                                 IntPtrT CounterOffset,
                                                 IntPtrT FunctionPtr,
                                                 uint32_t NumCounters) {
+  assert(!hasProbe(FunctionName));
   Data.push_back({
       maybeSwap<uint64_t>(IndexedInstrProf::ComputeHash(FunctionName)),
       maybeSwap<uint64_t>(CFGHash),
@@ -151,7 +153,12 @@
       maybeSwap<uint32_t>(NumCounters),
       /*NumValueSites=*/{maybeSwap<uint16_t>(0), maybeSwap<uint16_t>(0)},
   });
-  Names.push_back(FunctionName.str());
+  Names.insert(FunctionName);
+}
+
+template <class IntPtrT>
+bool InstrProfCorrelatorImpl<IntPtrT>::hasProbe(StringRef FunctionName) const {
+  return Names.contains(FunctionName);
 }
 
 template <class IntPtrT>
@@ -219,6 +226,8 @@
         if (auto EC =
                 AnnotationFormValue->getAsCString().moveInto(FunctionName))
           consumeError(std::move(EC));
+        if (this->hasProbe(*FunctionName))
+          return;
       } else if (AnnotationName.compare(
                      InstrProfCorrelator::CFGHashAttributeName) == 0) {
         CFGHash = AnnotationFormValue->getAsUnsignedConstant();
Index: llvm/include/llvm/ProfileData/InstrProfCorrelator.h
===================================================================
--- llvm/include/llvm/ProfileData/InstrProfCorrelator.h
+++ llvm/include/llvm/ProfileData/InstrProfCorrelator.h
@@ -12,6 +12,7 @@
 #ifndef LLVM_PROFILEDATA_INSTRPROFCORRELATOR_H
 #define LLVM_PROFILEDATA_INSTRPROFCORRELATOR_H
 
+#include "llvm/ADT/StringSet.h"
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/Object/Binary.h"
 #include "llvm/Object/ObjectFile.h"
@@ -105,11 +106,13 @@
   void addProbe(StringRef FunctionName, uint64_t CFGHash, IntPtrT CounterOffset,
                 IntPtrT FunctionPtr, uint32_t NumCounters);
 
+  bool hasProbe(StringRef FunctionName) const;
+
 private:
   InstrProfCorrelatorImpl(InstrProfCorrelatorKind Kind,
                           std::unique_ptr<InstrProfCorrelator::Context> Ctx)
       : InstrProfCorrelator(Kind, std::move(Ctx)){};
-  std::vector<std::string> Names;
+  llvm::StringSet<> Names;
 
   // Byte-swap the value if necessary.
   template <class T> T maybeSwap(T Value) const {


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D116051.395507.patch
Type: text/x-patch
Size: 3040 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211220/7e0100d0/attachment.bin>


More information about the llvm-commits mailing list