[llvm] [llvm-profgen] Improve sample profile density (PR #92144)

Lei Wang via llvm-commits llvm-commits at lists.llvm.org
Fri May 17 11:35:02 PDT 2024


================
@@ -1032,6 +1035,78 @@ void CSProfileGenerator::convertToProfileMap() {
   IsProfileValidOnTrie = false;
 }
 
+void CSProfileGenerator::calculateAndShowDensity(
+    SampleContextTracker &CTracker) {
+  double Density = calculateDensity(CTracker);
+  showDensitySuggestion(Density, ProfileDensityHotFuncCutOff);
+}
+
+// Calculate Profile-density:
+// Sort the list of function-density in descending order and iterate them once
+// their accumulated total samples exceeds the percentage_threshold of total
+// profile samples, the profile-density is the last(minimum) function-density of
+// the processed functions, which means all the functions significant to perf
+// are on good density if the profile-density is good, or in other words, if the
+// profile-density is bad, the accumulated samples for all the bad density
+// profile exceeds the (100% - percentage_threshold).
+// The percentage_threshold(--profile-density-hot-func-cutoff) is configurable
+// depending on how much regression the system want to tolerate.
+double CSProfileGenerator::calculateDensity(SampleContextTracker &CTracker) {
+  double ProfileDensity = 0.0;
+
+  uint64_t TotalProfileSamples = 0;
+  // A list of the function profile density and total samples.
+  std::vector<std::pair<double, uint64_t>> DensityList;
+  for (const auto *Node : CTracker) {
+    const auto *FSamples = Node->getFunctionSamples();
+    if (!FSamples)
+      continue;
+
+    uint64_t TotalBodySamples = 0;
+    uint64_t FuncBodySize = 0;
+    for (const auto &I : FSamples->getBodySamples()) {
+      TotalBodySamples += I.second.getSamples();
+      FuncBodySize++;
+    }
+    // The whole function could be inlined and optimized out, use the callsite
+    // head samples instead to estimate the body count.
+    if (FuncBodySize == 0) {
----------------
wlei-llvm wrote:

Similar to the comment above, I want to use the block count first. One BB should only have one probe but could have many calls. so just to be consistent to use 
`density = (sum of BB count) / num of BB` 

otherwise, it would be 
`density =  (sum of BB count + sum of call count) / (num of BB + num of call` 
 
That said, it's probably the same thing, the call and probe in the same BB would have the same count and the division number should be close.  No strong option, as we don't have the instruction num, so we just need a new definition. 


https://github.com/llvm/llvm-project/pull/92144


More information about the llvm-commits mailing list