[PATCH] D118640: [CSSPGO] Even count distribution

Wed Feb 16 09:21:18 PST 2022

spupyrev added inline comments.

================
Comment at: llvm/lib/Transforms/Utils/SampleProfileInference.cpp:333
+          // If we haven't seen Edge.Dst so far, continue DFS search there
+          if (Dst.Discovery == 0 && Dst.NumCalls < SampleProfileMaxDfsCalls) {
+            Dst.Discovery = ++Time;
----------------
hoy wrote:
> hoy wrote:
> > Should the Discovery field be should a boolean? Looks like this is the only place using it.
> I was wondering if setting `SampleProfileMaxDfsCalls=1` makes sense. It looks like whether `Dst` is taken or not doesn't matter with which predecessor the searching starts with. Correct me if I'm wrong. 
Quoting my colleague: 

> I would keep it the way it is. Discovery and Finish times are well known DFS concepts. Of course, here they have been adapted to our modified version that can visit a node more than once. The marginal savings to switching to bool are not worth it.

================
Comment at: llvm/lib/Transforms/Utils/SampleProfileInference.cpp:333
+          // If we haven't seen Edge.Dst so far, continue DFS search there
+          if (Dst.Discovery == 0 && Dst.NumCalls < SampleProfileMaxDfsCalls) {
+            Dst.Discovery = ++Time;
----------------
spupyrev wrote:
> hoy wrote:
> > hoy wrote:
> > > Should the Discovery field be should a boolean? Looks like this is the only place using it.
> > I was wondering if setting `SampleProfileMaxDfsCalls=1` makes sense. It looks like whether `Dst` is taken or not doesn't matter with which predecessor the searching starts with. Correct me if I'm wrong. 
> Quoting my colleague: 
> 
> > I would keep it the way it is. Discovery and Finish times are well known DFS concepts. Of course, here they have been adapted to our modified version that can visit a node more than once. The marginal savings to switching to bool are not worth it.
> 
> 
That is a good point. We actually ran experiments with the flag but found that the default value of 10 is fine.

The tradeoff here is quality vs speed: the more iterations we apply, the higher quality we get. We certainly need more than 1 iteration (thus, `SampleProfileMaxDfsCalls>1`), otherwise we get poor results. Increasing it beyond 10-20 doesn't significantly improve the quality. Time savings of reducing the value from 10 to, say, 5 are minor.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118640/new/

https://reviews.llvm.org/D118640