[PATCH] Inliner Enhancement

Xinliang David Li xinliangli at gmail.com
Thu Mar 19 09:58:58 PDT 2015

On Wed, Mar 18, 2015 at 11:33 PM, Jiangning Liu <liujiangning1 at gmail.com>

> Hi David,
>> This is mentioned to indicate that oversimplified heuristics can be quite
>> noisy.  More context-dependent analysis is needed. The context includes
>> not only call arguments, but also things like surrounding calls
>> (optimizations enabled in the inline instance, in enclosing caller code, and
>> across different inline instances of different call sites), the enclosing
>> loop, etc.
> I don't think so; on the contrary, the original solution of inlining a
> callee without considering the caller's context is even noisier.

Size-based inlining can certainly be noisy -- otherwise there wouldn't be
any room for improvement :)

> Compared with even more complicated algorithms, yes, it is simple, but
> compared with the original solution it can avoid code bloat in abnormal
> programs, such as those that repeatedly call the same function many times.

So this heuristic is meant to reduce cold code bloat?

>> Not necessarily for cross module case. The main problem of current pass
>> manager is that many key function analysis results (frequency, branch prob,
>> loop info)  for different functions can not co-exist (live at the same
>> time).  During bottom up CG traversal, when the caller is visited,
>> performing any function analysis for the caller will destroy the analysis
>> info from its callee, making it impossible to incrementally update caller's
>> analysis info by 'importing' callee's info.
> I think somebody like Chandler is covering the migration from the old pass
> manager to the new one, isn't he?


>> Avoiding compile time increase by changing the inlining order is not good
>> design -- the inline order should be driven by performance considerations.
> Agree. If we can move to the new pass manager at the moment, it would
> be unnecessary indeed.

> Anyway, I didn't intend to design a complicated algorithm for the inline
> heuristic. I personally think it is hard for a complicated inlining
> heuristic to achieve miraculous results at such an early compilation stage.
It is indeed not simple, but worth the effort.

> I would also like to say my solution fits into the current inliner design
> quite well. My personal experimental results also show the current inliner
> has been carefully tuned for very comprehensive cases. For most scenarios,
> the original inliner can achieve a very good trade-off between code size and
> performance. Chandler once mentioned the LLVM inliner has historically shown
> over a 10% code size advantage across a wide range of benchmark C++
> applications.

Adding simple heuristics is easy; the problem with them (absent a clear
benefit/cost analysis) is that you can easily find a counterexample (low
SNR). Such heuristics are common in compilers that do benchmark hacks.

> I'm not sure what those more sophisticated heuristic rules you are talking
> about are, but if you tends to tune them for specific benchmark, I would
> say it will definitely not make sense. I would be appreciative if you can
> share your result with community. Then we can see how we
> can reasonably move on.

The plan is to tune the inliner (both non-PGO and PGO) on real-world programs
and cross-validate with the SPEC benchmarks.  This work will definitely be
shared with the community.



> Thanks,
> -Jiangning