[PATCH] D107800: [CSSPGO][llvm-profgen] Cap context stack to reduce memory usage

Wed Aug 11 13:47:08 PDT 2021

wenlei accepted this revision.
wenlei added inline comments.
This revision is now accepted and ready to land.

================
Comment at: llvm/tools/llvm-profgen/ProfileGenerator.cpp:53
+cl::opt<int> CSProfCtxStackCap(
+    "csprof-ctx-stack-cap", cl::init(20), cl::ZeroOrMore,
+    cl::desc("Cap context stack at a given depth. No cap if the input is -1."));
----------------
wenlei wrote:
> wlei wrote:
> > wlei wrote:
> > > hoy wrote:
> > > > wenlei wrote:
> > > > > I think we could unify the switch names, e.g. `csprof-max-context-depth` and `csprof-max-cold-context-depth`? 
> > > > Thanks for working on this. We probably do not inline so many levels of functions, but it would be good to run some perf testing or to turn this off by default.
> > > Sounds good. I will collect statistics on the max inline depth in the SampleProfile inliner on some benchmarks and change the default to that value; maybe 10 is good enough.
> > Here is the max inline depth (if a inlines b and b inlines c, the depth is 2) in SPEC2017 monoLTO pass2 (with all inliners turned on).
> > 
> > ```
> > 508.namd_r       5
> > 510.parest_r    21
> > 511.povray_r     8
> > 526.blender_r   15
> > 600.perlbench_s  8
> > 602.gcc_s       21
> > 605.mcf_s        5
> > 620.omnetpp_s   18
> > 623.xalancbmk_s 26
> > 625.x264_s       7
> > 631.deepsjeng_s  5
> > 638.imagick_s   10
> > 641.leela_s     16
> > 644.nab_s        5
> > 657.xz_s         7
> > ```
> > For the clang-10 pass1 binary (I don't have a pass2 binary), the max inline depth is 51!
> > 
> > That is much more inlining than I expected, so I agree with turning it off (-1) by default.
> > 
> > 
> > 
> Sounds reasonable. If we run into such situations more often, we could also add another level of aggregation by leaf frame from the stack samples; then we could tell that some contexts are cold before unwinding and dynamically trim those cold contexts during unwinding.
> 
> Can we make the description and variable name consistent with CSProfColdContextFrameDepth too? 
To be specific: `CSProfMaxContextDepth` and `CSProfMaxColdContextDepth`, with a description like "Keep the last K frames while merging [cold] profile ...". Otherwise the change looks good.
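
For illustration, the "keep the last K frames" behavior could be sketched roughly as below. This is a minimal standalone sketch, not code from the patch: `capContextStack` and its parameters are hypothetical names, the context is assumed to be ordered from root to leaf, and a negative depth is assumed to mean no cap, following the "No cap if the input is -1" wording in the option description.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): keep only the last MaxDepth
// frames of a context stack, i.e. the frames closest to the leaf.
// Assumes Context is ordered root first, leaf last.
// A negative MaxDepth (e.g. -1) is treated as "no cap".
std::vector<std::string>
capContextStack(const std::vector<std::string> &Context, int MaxDepth) {
  // No cap requested, or the stack is already within the limit.
  if (MaxDepth < 0 || Context.size() <= static_cast<std::size_t>(MaxDepth))
    return Context;
  // Drop the outermost callers and keep the MaxDepth leaf-most frames.
  return std::vector<std::string>(Context.end() - MaxDepth, Context.end());
}
```

Under these assumptions, a context of `main`, `foo`, `bar`, `baz` capped at depth 2 would keep only `bar` and `baz`, while -1 would leave it untouched.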

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107800/new/

https://reviews.llvm.org/D107800