[PATCH] D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline

Fri Mar 4 13:48:12 PST 2022

davidxl added inline comments.

================
Comment at: llvm/lib/Transforms/IPO/Inliner.cpp:952
+              if (CalleeSCC != C &&
+                  CalleeSCC == CG.lookupSCC(CG.get(*NewCallee))) {
+                ICB->addFnAttr(Attribute::NoInline);
----------------
nikic wrote:
> davidxl wrote:
> > nikic wrote:
> > > davidxl wrote:
> > > > This feels like very strict. Is it possible to allow some level (under an option) of iterative inlining into the same SCC?     This is because not all cases of inlining into a non-trivial SCC will result in compile time issue due to existing threshold based stopping mechanism (i.e. the new callee in SCC is too large).
> > > > 
> > > > 
> > > This does not restrict inlining into the same SCC. What it prevents is continuously peeling off more and more calls from a *different* non-trivial SCC.
> > > 
> > > From what I can tell, this is a rare case, only two programs in llvm-test-suite show any codegen changes with this patch.
> > To clarify, my understanding of the scenario is like the following
> > 
> > 
> > foo -->   (a, b, c)    where foo calls 'a'.
> > 
> > where a, b, and c forms anon trivial SCC.
> > 
> > Without the fix, we can end up with  inlining of foo->a, foo->b, foo->c, foo->a, foo->b, .... until foo blows up (we don't have a caller limit set in the inliner).
> > 
> > With this patch, once foo-> a is inlined, the new callsite foo->b is marked as no inline.
> > 
> > Is that the right?
> Yeah, that sounds about right. I believe local inlining history will actually make sure that we only inline foo->a, foo->b, foo->c, which is what prevents this from simply inlining infinitely. But it still ends up flattening the SCC, which is particularly problematic if there are multi-edges in the SCC, in which case flattening is exponential.
> 
> D98481 forbids the foo->a inline in the first place, while this patch allows inlining foo->a, but forbids the foo->b inline. There are some cases where this kind of single-level inline is probably useful, e.g. if inlining at this call-site can prune away the recursive parts entirely.
What I am thinking is a light weight mechanism to fix this with more flexibility. Assuming the number of callsites (from outside of SCC) into non-trivial SCC is small, a map from the <caller, non-trivialSCC> pairs to inlined count can be used to track the number of iterative inlining happening between the caller and SCC. If it exceeds the limit, mark the new callsite noInline.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120584/new/

https://reviews.llvm.org/D120584