[PATCH] D70584: [AutoFDO] Statistic for context sensitive profile guided inlining

Mon Nov 25 15:14:39 PST 2019

wenlei marked 4 inline comments as done.
wenlei added a comment.

> I would imagine among all the functions with inline instance in profile, only those small and warm/cold functions which are not inlined in early inliner will be inlined in regular inliner.

Yes, small functions are the majority. But there're still others.

> For small and cold functions, I think it doesn't matter whether they are inlined early or late.

It matters for post-inline profile quality. If it's inlined early by replay inliner of sample loader, the context sensitive profile will be kept. However, if replay inliner rejects them, and later they got inlined by CGSCC inliner, we will have to do count scaling for inlinee so it's not as accurate as if we inline early and preserve context-sensitive profile. This doesn't matter much for the result of inline decision, but it matters for post-inline profile quality which can affect block layout later.

================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:879
   if (Cost.isNever()) {
-    ORE->emit(OptimizationRemark(DEBUG_TYPE, "Not inline", DLoc, BB)
+    ORE->emit(OptimizationRemark("sample-profile-inline", "NotInline", DLoc, BB)
               << "incompatible inlining");
----------------
wmi wrote:
> The first parameter in the declaration of OptimizationRemark is "const char *PassName", so why not use DEBUG_TYPE?  
> 
> I feel "NeverInline" may be more clear than "NotInline" in terms of showing it is illegal to inline. 
DEBUG_TYPE is "sample-profile" and I feel it's too broad for practical uses as it covers sample usages, weight propagation, and inlining. I wanted to have something that only gives me inlining remarks, thus using  "sample-profile-inline" instead here.

Good point about "NeverInline", will change. 

================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:941-952
+      else {
+        for (auto I : Candidates) {
+          Function *CalledFunction = CallSite(I).getCalledFunction();
+          if (CalledFunction) {
+            ORE->emit(OptimizationRemark("sample-profile-inline", "NotInline",
+                                         I->getDebugLoc(), I->getParent())
+                      << "previous inline not repeated '"
----------------
wmi wrote:
> Not inlined candidate may be reported multiple times here because of the iterative outer loop. 
> 
> I guess you put the OptimizationRemark here because you want to know the exact reason of why the candidate with inline instance in profile is not inlined (here the reason is not hot enough), then some more information should be emitted to explain it.
> 
> If you don't care the exact reason, then it is better to generate the optimization remark in the loop iterating localNotInlinedCallSites.  localNotInlinedCallSites contains all the candidates with inline instance in profile but not being inlined for whatever reason including the reason of "not hot enough". 
Yes, I care about the reasons. I will change the message to make the reason explicit in the output remarks. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70584/new/

https://reviews.llvm.org/D70584