[PATCH] D92347: [CSSPGO] Consume pseudo-probe-based AutoFDO profile

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Dec 13 21:22:16 PST 2020


hoy marked 5 inline comments as done.
hoy added inline comments.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:554-556
+      // A missing entry for a probe likely means the probe was not executed.
+      // Treat it as a zero count instead of an unknown count to help edge
+      // weight inference.
----------------
wmi wrote:
> I guess it makes sense to treat missing entry to have zero count instead of unknown because the function profiles with unmatched CFGchecksum has been dropped, is that right? 
> 
> If it is, can we check CFGchecksum not zero here instead of ProfileIsProbeBased or ProfileIsCS? When CFGchecksum is not zero and matched, we are confident the profile is accurate no matter what kind of profile it is. 
> 
> Another thing is: I think debug info based profile can have a similar checksum then it can use a similar strategy here as probe based profile. It may not be a CFGchecksum because debug info based profile has no idea about CFG, so it is better to rename CFGchecksum to something more general.   
I think the comment I put here is a bit confusing. A missing entry for a probe here really means a missing counter. When a probe counter doesn't appear in a function profile, we are sure that the probe has not been executed (virtually). From our experiment, giving it a zero count is a bit more accurate instead of having the counts inference algorithm to figure out a count for it.

We use CFGchecksum to identify if a profile is debug-info-based or pseudo-probe-based. A probe-based profile will likely have a non-zero checksum due to the way CFGchecksum is computed. As you pointed out, when CFGchecksum is matched, we are confident the profile is accurate (even if there are missing counters for some probes). If not matched, the profile will be completely dropped and no counters will be applied to the function..

It's a good suggestion to apply a similar checksum for debug-info-based profile to tell how close the pervious source is to the current IR. We can discuss more about this.





================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:675-677
+    if (FunctionHash == 0) {
+      // Set the function hash code for the target profile.
+      FunctionHash = Other.getFunctionHash();
----------------
wmi wrote:
> Is it correct that for probe based profile all the FunctionHash should already be initialized to a non-zero value?
A FunctionHash is `SampleProfileProber::computeCFGHash` which in theory could have zero value, though in practice we haven't seen that even with a single-block function.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:681-682
+      // either:
+      // 1. They are same-named static functions from different compilation
+      // units, or
+      // 2. They are really the same function but from different compilations.
----------------
wmi wrote:
> Item1 won't be an issue if -funique-internal-linkage-names becomes default (https://reviews.llvm.org/rGad1b9daa4bf40c1907794fd5de7807aad1f0553c). 
Thanks for pointing this out! The switch will be helpful. Comment updated.


================
Comment at: llvm/lib/IR/PseudoProbe.cpp:37-43
+        PseudoProbe Probe;
+        Probe.Id =
+            PseudoProbeDwarfDiscriminator::extractProbeIndex(Discriminator);
+        Probe.Type =
+            PseudoProbeDwarfDiscriminator::extractProbeType(Discriminator);
+        Probe.Attr = PseudoProbeDwarfDiscriminator::extractProbeAttributes(
+            Discriminator);
----------------
wmi wrote:
> Can you wrap them into a function like extractProbeFromDiscriminator?
Done.


================
Comment at: llvm/lib/ProfileData/SampleProf.cpp:135
 /// Print the samples collected for a function on stream \p OS.
 void FunctionSamples::print(raw_ostream &OS, unsigned Indent) const {
   OS << TotalSamples << ", " << TotalHeadSamples << ", " << BodySamples.size()
----------------
wmi wrote:
> Should we output CFGchecksum if it is supported? It will affect the output of 'llvm-profdata show -sample'
Good point, added printing for CFGchecksum.


================
Comment at: llvm/lib/ProfileData/SampleProfWriter.cpp:174
+    for (const auto &Entry : Profiles) {
+      writeNameIdx(Entry.first());
+      encodeULEB128(Entry.second.getFunctionHash(), OS);
----------------
wmi wrote:
> What about the case that the profile use MD5 to represent name?
Good question. I thought this is orthogonal from MD5 but correct me if I'm wrong. Does a name index point to the name table that are filled with MD5 codes?


================
Comment at: llvm/tools/llvm-profdata/llvm-profdata.cpp:684-686
+    assert(!ProfileIsProbeBased ||
+           ProfileIsProbeBased == FunctionSamples::ProfileIsProbeBased);
+    ProfileIsProbeBased = FunctionSamples::ProfileIsProbeBased;
----------------
wmi wrote:
> This should use exitWithError instead of assertion for user to get the error message in release mode, same as you did for compare below.
Sounds good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92347/new/

https://reviews.llvm.org/D92347



More information about the llvm-commits mailing list