<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Hi David,</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div>This is mentioned to indicate that oversimplified heuristics can be quite noisy.  More context dependent analysis is needed. The context not only include call arguments, but also things like surrounding calls (optimizations enabled in inline instance, in enclosing caller code, and across different inline instances of different call sites), enclosing loop etc.<br></div></span></div></div></div></blockquote><div><br></div><div>I don't think so. On the contrary, the original approach of inlining a callee without considering the caller's context is even noisier. Compared with a more complicated algorithm, yes, my solution is simple, but compared with the original approach it can avoid code bloat in abnormal programs, such as those that call the same function many times. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div><br></div></span><div>Not necessarily for cross module case. The main problem of current pass manager is that many key function analysis results (frequency, branch prob, loop info)  for different functions can not co-exist (live at the same time).  During bottom up CG traversal, when the caller is visited, performing any function analysis for the caller will destroy the analysis info from its callee, making it impossible to incrementally update caller's analysis info by 'importing' callee's info. 
</div><div><br></div></div></div></div></blockquote><div>I think somebody like Chandler is handling the migration from the old pass manager to the new one, right?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div>Avoiding compile time increase by changing the inlining order is not good design -- the inline order should be driven by performance consideration.</div></div></div></div></blockquote><div><br></div><div>Agreed. If we could move to the new pass manager now, it would indeed be unnecessary.</div><div><br></div><div>Anyway, I didn't intend to design a complicated algorithm for the inline heuristic. I personally think a complicated inlining heuristic is unlikely to achieve miraculous results at such an early compilation stage.</div><div><br></div><div>I would also like to say that my solution fits into the current inliner design quite well. My own experimental results also show that the current inliner has been carefully tuned for a very comprehensive set of cases. In most scenarios, the original inliner achieves a very good trade-off between code size and performance. Chandler once mentioned that the LLVM inliner has <span style="font-size:14px">historically shown over 10% code size advantage across a wide range of benchmark C++ applications.</span></div><div><span style="font-size:14px"><br></span></div><div><span style="font-size:14px">I'm not sure which more sophisticated heuristic rules you are talking about, but if you intend to tune them for specific benchmarks, I would say that definitely does not make sense. I would appreciate it if you could share your results with the community. Then we can see how to move forward reasonably.</span></div><div><br></div><div>Thanks,</div><div>-Jiangning </div><div><br></div></div></div></div>