[llvm] cef9b96 - [CSSPGO] Report zero-count probe in profile instead of dangling probes.

Hongtao Yu via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 16 11:46:02 PDT 2021


Author: Hongtao Yu
Date: 2021-06-16T11:45:29-07:00
New Revision: cef9b96b01b75fedea5e91ece776228f4088ba78

URL: https://github.com/llvm/llvm-project/commit/cef9b96b01b75fedea5e91ece776228f4088ba78
DIFF: https://github.com/llvm/llvm-project/commit/cef9b96b01b75fedea5e91ece776228f4088ba78.diff

LOG: [CSSPGO] Report zero-count probe in profile instead of dangling probes.

Previously, dangling samples were represented by UINT64_MAX in the sample profile, while probes that never executed were not reported. This was based on the observation that dangling probes made up a smaller portion of all probes than zero-count probes. However, with compiler optimizations, dangling probes end up making up a large portion of all probes in general, and reporting them does not make sense from a profile size point of view. This change flips sample reporting by reporting zero-count probes instead. This allows a dangling probe to be represented by a missing entry in the profile. This has several benefits:

1. Reducing sample profile size in optimized mode, even when non-executed probes outnumber dangling probes, since UINT64_MAX takes more space to encode than 0.

2. Binary size savings. There is no need to encode dangling probes anymore, since missing probes are treated as dangling by the profile reader.

3. Reducing compiler work to track dangling probes. However, for probes that are truly dead and removed, we still need the compiler to identify them so that they can be reported as zero-count instead of being mistreated as dangling probes.

4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples.

5. Better readability.

6. Consistency with the non-CS DWARF line-number-based profile. Zero counts are trusted by the compiler's counts inferencer, while missing counts will be inferred by the compiler.
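
To illustrate the size argument in #1, here is a minimal sketch that assumes sample counts are written as unsigned LEB128 values, 7 payload bits per byte. The helper name uleb128Size is hypothetical and is not part of this patch.

```cpp
#include <cstdint>

// Hypothetical helper (not in the patch): number of bytes needed to encode
// Value as unsigned LEB128, which stores 7 payload bits per byte.
constexpr unsigned uleb128Size(uint64_t Value) {
  unsigned Size = 0;
  do {
    Value >>= 7; // consume one byte worth of payload
    ++Size;
  } while (Value != 0);
  return Size;
}

// A zero count fits in one byte, while the old dangling marker needs ten.
static_assert(uleb128Size(0) == 1, "zero takes a single byte");
static_assert(uleb128Size(UINT64_MAX) == 10, "64 bits need ten 7-bit groups");
```

Under that assumption each reported zero-count probe costs one count byte where each reported dangling probe used to cost ten, which is consistent with the profile shrinking even when zero-count probes outnumber dangling ones.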

Note that the current patch does not include any work for #3. There will be follow-up changes.

For #1, I've seen that for a large Facebook service the text profile is reduced by 7%. For the extbinary profile, the size of LBRProfileSection is reduced by 35%.

For #4, I have seen that general counts quality for SPEC2017 is improved by 10%.
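
The merge effect described in #4 can be sketched as follows. This is a simplified model, not the actual SampleRecord::merge: it ignores weights and saturating arithmetic, and the ProbeCopy, MergedCount, mergeOld and mergeNew names are hypothetical.

```cpp
#include <cstdint>

// Stand-in for the FunctionSamples::InvalidProbeCount marker removed here.
constexpr uint64_t InvalidProbeCount = UINT64_MAX;

// Hypothetical model of one runtime copy of a duplicated probe.
struct ProbeCopy {
  bool Dangling;
  uint64_t Samples;
};

// Old scheme: a dangling copy is reported with the marker value, and the
// marker wins the merge, so real samples on the other copy are discarded.
constexpr uint64_t mergeOld(ProbeCopy A, ProbeCopy B) {
  if (A.Dangling || B.Dangling)
    return InvalidProbeCount;
  return A.Samples + B.Samples;
}

// Hypothetical result type: Reported is false when every copy was dangling,
// i.e. the probe has no entry in the profile at all.
struct MergedCount {
  bool Reported;
  uint64_t Samples;
};

// New scheme: a dangling copy is simply not reported, so only the
// non-dangling copies contribute to the merged count.
constexpr MergedCount mergeNew(ProbeCopy A, ProbeCopy B) {
  MergedCount Result{false, 0};
  if (!A.Dangling) {
    Result.Reported = true;
    Result.Samples += A.Samples;
  }
  if (!B.Dangling) {
    Result.Reported = true;
    Result.Samples += B.Samples;
  }
  return Result;
}
```

With one copy holding 15 samples and the other dangling, mergeOld yields the marker value while mergeNew keeps the 15 samples. When both copies are dangling, mergeNew reports nothing, which the reader then treats as an unknown count to be inferred.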

Reviewed By: wenlei, wlei, wmi

Differential Revision: https://reviews.llvm.org/D104129

Added: 
    

Modified: 
    llvm/include/llvm/ProfileData/SampleProf.h
    llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
    llvm/lib/ProfileData/SampleProf.cpp
    llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof
    llvm/test/tools/llvm-profgen/fname-canonicalization.test
    llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test
    llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
    llvm/test/tools/llvm-profgen/merge-cold-profile.test
    llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
    llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test
    llvm/tools/llvm-profgen/ProfileGenerator.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/ProfileData/SampleProf.h b/llvm/include/llvm/ProfileData/SampleProf.h
index 4861cdb98d1d1..2f71bbc6bbbe6 100644
--- a/llvm/include/llvm/ProfileData/SampleProf.h
+++ b/llvm/include/llvm/ProfileData/SampleProf.h
@@ -598,21 +598,9 @@ class FunctionSamples {
   ErrorOr<uint64_t> findSamplesAt(uint32_t LineOffset,
                                   uint32_t Discriminator) const {
     const auto &ret = BodySamples.find(LineLocation(LineOffset, Discriminator));
-    if (ret == BodySamples.end()) {
-      // For CSSPGO, in order to conserve profile size, we no longer write out
-      // locations profile for those not hit during training, so we need to
-      // treat them as zero instead of error here.
-      if (FunctionSamples::ProfileIsCS || FunctionSamples::ProfileIsProbeBased)
-        return 0;
+    if (ret == BodySamples.end())
       return std::error_code();
-    } else {
-      // Return error for an invalid sample count which is usually assigned to
-      // dangling probe.
-      if (FunctionSamples::ProfileIsProbeBased &&
-          ret->second.getSamples() == FunctionSamples::InvalidProbeCount)
-        return std::error_code();
-      return ret->second.getSamples();
-    }
+    return ret->second.getSamples();
   }
 
   /// Returns the call target map collected at a given location.
@@ -890,10 +878,6 @@ class FunctionSamples {
       const DILocation *DIL,
       SampleProfileReaderItaniumRemapper *Remapper = nullptr) const;
 
-  // The invalid sample count is used to represent samples collected for a
-  // dangling probe.
-  static constexpr uint64_t InvalidProbeCount = UINT64_MAX;
-
   static bool ProfileIsProbeBased;
 
   static bool ProfileIsCS;

diff  --git a/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp b/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
index 6def44e452512..2ab0f0cbc17a5 100644
--- a/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
+++ b/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
@@ -113,8 +113,6 @@ void SampleProfileSummaryBuilder::addRecord(
   }
   for (const auto &I : FS.getBodySamples()) {
     uint64_t Count = I.second.getSamples();
-    if (!sampleprof::FunctionSamples::ProfileIsProbeBased ||
-        (Count != sampleprof::FunctionSamples::InvalidProbeCount))
       addCount(Count);
   }
   for (const auto &I : FS.getCallsiteSamples())

diff  --git a/llvm/lib/ProfileData/SampleProf.cpp b/llvm/lib/ProfileData/SampleProf.cpp
index b6abf6e0c3135..b2b786a157865 100644
--- a/llvm/lib/ProfileData/SampleProf.cpp
+++ b/llvm/lib/ProfileData/SampleProf.cpp
@@ -119,16 +119,7 @@ raw_ostream &llvm::sampleprof::operator<<(raw_ostream &OS,
 sampleprof_error SampleRecord::merge(const SampleRecord &Other,
                                      uint64_t Weight) {
   sampleprof_error Result;
-  // With pseudo probes, merge a dangling sample with a non-dangling sample
-  // should result in a dangling sample.
-  if (FunctionSamples::ProfileIsProbeBased &&
-      (getSamples() == FunctionSamples::InvalidProbeCount ||
-       Other.getSamples() == FunctionSamples::InvalidProbeCount)) {
-    NumSamples = FunctionSamples::InvalidProbeCount;
-    Result = sampleprof_error::success;
-  } else {
-    Result = addSamples(Other.getSamples(), Weight);
-  }
+  Result = addSamples(Other.getSamples(), Weight);
   for (const auto &I : Other.getCallTargets()) {
     MergeResult(Result, addCalledTarget(I.first(), I.second, Weight));
   }

diff  --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof
index fd3ff773e85d0..78ac8109570fa 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof
@@ -6,6 +6,9 @@
  1: 23
  2: 382920
  3: 382915
+ 4: 0
+ 5: 0
+ 6: 0
  !CFGChecksum: 138828622701
 [bar]:23:23
  1: 23
@@ -15,4 +18,7 @@
  1: 23
  2: 382920
  3: 382915
+ 4: 0
+ 5: 0
+ 6: 0
  !CFGChecksum: 138828622701
\ No newline at end of file

diff  --git a/llvm/test/tools/llvm-profgen/fname-canonicalization.test b/llvm/test/tools/llvm-profgen/fname-canonicalization.test
index 9fbf41abf0a8c..5e7ec51e665a8 100644
--- a/llvm/test/tools/llvm-profgen/fname-canonicalization.test
+++ b/llvm/test/tools/llvm-profgen/fname-canonicalization.test
@@ -22,8 +22,6 @@
 ; CHECK-PROBE-FNAME: !CFGChecksum: 563088904013236
 ; CHECK-PROBE-FNAME:[main:2 @ foo:8 @ _ZL3barii.__uniq.26267048767521081047744692097241227776]:30:15
 ; CHECK-PROBE-FNAME: 1: 15
-; CHECK-PROBE-FNAME: 2: 18446744073709551615
-; CHECK-PROBE-FNAME: 3: 18446744073709551615
 ; CHECK-PROBE-FNAME: 4: 15
 ; CHECK-PROBE-FNAME: !CFGChecksum: 72617220756
 

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test b/llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test
index 0ac51c1ad37b2..801de3ba40de0 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test
@@ -2,14 +2,17 @@
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:     [main:2 @ foo]:58:0
+; CHECK-NEXT: 1: 0
 ; CHECK-NEXT: 2: 15
 ; CHECK-NEXT: 3: 14
+; CHECK-NEXT: 4: 0
 ; CHECK-NEXT: 5: 14
 ; CHECK-NEXT: 6: 15
+; CHECK-NEXT: 7: 0
+; CHECK-NEXT: 9: 0
 ; CHECK-NEXT: !CFGChecksum: 138950591924
 ; CHECK:[main:2 @ foo:8 @ bar]:1:0
-; CHECK-NEXT: 2: 18446744073709551615
-; CHECK-NEXT: 3: 18446744073709551615
+; CHECK-NEXT: 1: 0
 ; CHECK-NEXT: 4: 1
 ; CHECK-NEXT: !CFGChecksum: 72617220756
 

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
index d1d6a9caf3809..2d23ff487a349 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
@@ -2,17 +2,18 @@
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:     [main:2 @ foo]:74:0
+; CHECK-NEXT: 1: 0
 ; CHECK-NEXT: 2: 15
 ; CHECK-NEXT: 3: 15
 ; CHECK-NEXT: 4: 14
 ; CHECK-NEXT: 5: 1
 ; CHECK-NEXT: 6: 15
+; CHECK-NEXT: 7: 0
 ; CHECK-NEXT: 8: 14 bar:14
+; CHECK-NEXT: 9: 0
 ; CHECK-NEXT: !CFGChecksum: 138950591924
 ; CHECK:[main:2 @ foo:8 @ bar]:28:14
 ; CHECK-NEXT: 1: 14
-; CHECK-NEXT: 2: 18446744073709551615
-; CHECK-NEXT: 3: 18446744073709551615
 ; CHECK-NEXT: 4: 14
 ; CHECK-NEXT: !CFGChecksum: 72617220756
 

diff  --git a/llvm/test/tools/llvm-profgen/merge-cold-profile.test b/llvm/test/tools/llvm-profgen/merge-cold-profile.test
index 3d749db1ce327..7e975a9045e98 100644
--- a/llvm/test/tools/llvm-profgen/merge-cold-profile.test
+++ b/llvm/test/tools/llvm-profgen/merge-cold-profile.test
@@ -16,10 +16,10 @@
 
 ; CHECK:     [fa]:14:4
 ; CHECK-NEXT: 1: 4
-; CHECK-NEXT: 2: 18446744073709551615
 ; CHECK-NEXT: 3: 4
 ; CHECK-NEXT: 4: 2
 ; CHECK-NEXT: 5: 1
+; CHECK-NEXT: 6: 0
 ; CHECK-NEXT: 7: 2 fb:2
 ; CHECK-NEXT: 8: 1 fa:1
 ; CHECK-NEXT: !CFGChecksum: 120515930909
@@ -28,6 +28,7 @@
 ; CHECK-NEXT: 1: 4
 ; CHECK-NEXT: 2: 3
 ; CHECK-NEXT: 3: 1
+; CHECK-NEXT: 4: 0
 ; CHECK-NEXT: 5: 4 fb:4
 ; CHECK-NEXT: 6: 1 fa:1
 ; CHECK-NEXT: !CFGChecksum: 72617220756
@@ -36,16 +37,17 @@
 ; CHECK-KEEP-COLD-NEXT: 1: 6
 ; CHECK-KEEP-COLD-NEXT: 2: 3
 ; CHECK-KEEP-COLD-NEXT: 3: 3
+; CHECK-KEEP-COLD-NEXT: 4: 0
 ; CHECK-KEEP-COLD-NEXT: 5: 4 fb:4
 ; CHECK-KEEP-COLD-NEXT: 6: 3 fa:3
 ; CHECK-KEEP-COLD-NEXT: !CFGChecksum: 72617220756
 ; CHECK-KEEP-COLD-NEXT: !Attributes: 0
 ; CHECK-KEEP-COLD-NEXT:[fa]:14:4
 ; CHECK-KEEP-COLD-NEXT: 1: 4
-; CHECK-KEEP-COLD-NEXT: 2: 18446744073709551615
 ; CHECK-KEEP-COLD-NEXT: 3: 4
 ; CHECK-KEEP-COLD-NEXT: 4: 2
 ; CHECK-KEEP-COLD-NEXT: 5: 1
+; CHECK-KEEP-COLD-NEXT: 6: 0
 ; CHECK-KEEP-COLD-NEXT: 7: 2 fb:2
 ; CHECK-KEEP-COLD-NEXT: 8: 1 fa:1
 ; CHECK-KEEP-COLD-NEXT: !CFGChecksum: 120515930909
@@ -54,6 +56,7 @@
 ; CHECK-UNMERGED-NEXT: 1: 4
 ; CHECK-UNMERGED-NEXT: 2: 3
 ; CHECK-UNMERGED-NEXT: 3: 1
+; CHECK-UNMERGED-NEXT: 4: 0
 ; CHECK-UNMERGED-NEXT: 5: 4 fb:4
 ; CHECK-UNMERGED-NEXT: 6: 1 fa:1
 ; CHECK-UNMERGED-NEXT: !CFGChecksum: 72617220756
@@ -64,32 +67,38 @@
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 1: 4
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 2: 3
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 3: 1
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 4: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 5: 4 fb:4
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 6: 1 fa:1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !CFGChecksum: 72617220756
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !Attributes: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT:[fb:6 @ fa]:10:3
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 1: 3
-; CHECK-COLD-CONTEXT-LENGTH-NEXT: 2: 18446744073709551615
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 3: 3
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 4: 1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 5: 1
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 6: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 7: 1 fb:1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 8: 1 fa:1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !CFGChecksum: 120515930909
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !Attributes: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT:[fa:7 @ fb]:6:2
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 1: 2
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 2: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 3: 2
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 4: 0
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 5: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 6: 2 fa:2
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !CFGChecksum: 72617220756
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !Attributes: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT:[fa:8 @ fa]:4:1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 1: 1
-; CHECK-COLD-CONTEXT-LENGTH-NEXT: 2: 18446744073709551615
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 3: 1
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 4: 1
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 5: 0
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 6: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: 7: 1 fb:1
+; CHECK-COLD-CONTEXT-LENGTH-NEXT: 8: 0
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !CFGChecksum: 120515930909
 ; CHECK-COLD-CONTEXT-LENGTH-NEXT: !Attributes: 0
 

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
index 8eda5d55ffe82..7dc968e8a39cd 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
@@ -2,16 +2,18 @@
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:     [main:2 @ foo]:75:0
+; CHECK-NEXT: 1: 0
 ; CHECK-NEXT: 2: 15
 ; CHECK-NEXT: 3: 15
 ; CHECK-NEXT: 4: 15
+; CHECK-NEXT: 5: 0
 ; CHECK-NEXT: 6: 15
+; CHECK-NEXT: 7: 0
 ; CHECK-NEXT: 8: 15 bar:15
+; CHECK-NEXT: 9: 0
 ; CHECK-NEXT: !CFGChecksum: 138950591924
 ; CHECK:[main:2 @ foo:8 @ bar]:30:15
 ; CHECK-NEXT: 1: 15
-; CHECK-NEXT: 2: 18446744073709551615
-; CHECK-NEXT: 3: 18446744073709551615
 ; CHECK-NEXT: 4: 15
 ; CHECK-NEXT: !CFGChecksum: 72617220756
 

diff  --git a/llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test b/llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test
index c29e2ad4ae7ea..b46479e10e5af 100644
--- a/llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test
@@ -2,17 +2,19 @@
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:     [foo]:75:0
+; CHECK-NEXT:  1: 0
 ; CHECK-NEXT:  2: 15
 ; CHECK-NEXT:  3: 15
 ; CHECK-NEXT:  4: 15
+; CHECK-NEXT:  5: 0
 ; CHECK-NEXT:  6: 15
+; CHECK-NEXT:  7: 0
 ; CHECK-NEXT:  8: 15 bar:15
+; CHECK-NEXT:  9: 0
 ; CHECK-NEXT:  !CFGChecksum: 563088904013236
 ; CHECK-NEXT:  !Attributes: 0
 ; CHECK:     [foo:8 @ bar]:30:15
 ; CHECK-NEXT:  1: 15
-; CHECK-NEXT:  2: 18446744073709551615
-; CHECK-NEXT:  3: 18446744073709551615
 ; CHECK-NEXT:  4: 15
 ; CHECK-NEXT:  !CFGChecksum: 72617220756
 ; CHECK-NEXT:  !Attributes: 1

diff  --git a/llvm/tools/llvm-profgen/ProfileGenerator.cpp b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
index c2d1753d78f18..0528a77fcc5d8 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.cpp
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
@@ -555,19 +555,14 @@ void PseudoProbeCSProfileGenerator::populateBodySamplesWithProbes(
       }
     }
 
-    // Report dangling probes for frames that have real samples collected.
-    // Dangling probes are the probes associated to an empty block. With this
-    // place holder, sample count on a dangling probe will not be trusted by the
-    // compiler and we will rely on the counts inference algorithm to get the
-    // probe a reasonable count. Use InvalidProbeCount to mark sample count for
-    // a dangling probe.
+    // Assign zero count for remaining probes without sample hits to
+    // differentiate from probes optimized away, of which the counts are unknown
+    // and will be inferred by the compiler.
     for (auto &I : FrameSamples) {
       auto *FunctionProfile = I.second;
       for (auto *Probe : I.first->getProbes()) {
-        if (Probe->isDangling()) {
-          FunctionProfile->addBodySamplesForProbe(
-              Probe->Index, FunctionSamples::InvalidProbeCount);
-        }
+        if (!Probe->isDangling())
+          FunctionProfile->addBodySamplesForProbe(Probe->Index, 0);
       }
     }
   }


        

