[PATCH] Inliner Enhancement

Xinliang David Li xinliangli at gmail.com
Thu Mar 19 09:58:58 PDT 2015


On Wed, Mar 18, 2015 at 11:33 PM, Jiangning Liu <liujiangning1 at gmail.com>
wrote:

> Hi David,
>
>
>> This is mentioned to indicate that oversimplified heuristics can be quite
>> noisy.  More context-dependent analysis is needed. The context includes not
>> only the call arguments, but also things like surrounding calls
>> (optimizations enabled in the inline instance, in the enclosing caller code,
>> and across inline instances of different call sites), the enclosing loop,
>> etc.
>>
>
> I don't think so. On the contrary, the original solution of inlining a
> callee without considering the caller's context is even noisier.
>

Size-based inlining can certainly be noisy -- otherwise there wouldn't be
room for improvement :)
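
To make the point about context concrete, here is a purely illustrative
C++ sketch (not code from this patch): the same callee is a clear win to
inline at one call site and mostly size growth at another, depending on the
call arguments and the enclosing loop.

  void report_error(int code);       // hypothetical reporting routine

  static int clamp(int x, int lo, int hi) {
    return x < lo ? lo : (x > hi ? hi : x);
  }

  int sum_clamped(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += clamp(a[i], 0, 255);    // hot loop, constant bounds: inlining
                                     // enables simplification and is clearly
                                     // profitable
    return sum;
  }

  void check(int err) {
    if (err)
      report_error(clamp(err, 1, 99));  // cold call site: inlining here
                                        // mostly just adds code size
  }

A size-only score treats both call sites the same way, which is exactly the
kind of noise being discussed.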


> Compared with an even more complicated algorithm, yes, it is simple, but
> compared with the original solution it can avoid the code bloat seen in
> abnormal programs, e.g. ones that repeatedly call the same function many
> times.
>

So this heuristic is mainly meant to reduce cold bloat?
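
If so, something like the following shape is what I assume you mean (again
purely illustrative, not code from the patch): one caller invoking the same
non-trivial callee many times on paths that are rarely executed.

  void log_field(const char *name, int value);  // assume a non-trivial body

  void dump_all() {
    log_field("a", 1); log_field("b", 2); log_field("c", 3);
    log_field("d", 4); log_field("e", 5); log_field("f", 6);
    // ...dozens more identical calls; inlining each one copies the callee's
    // body into dump_all() without making the (cold) dump any faster.
  }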


>
>
>>
>> Not necessarily for the cross-module case. The main problem with the
>> current pass manager is that many key function analysis results (frequency,
>> branch probability, loop info) for different functions cannot co-exist
>> (live at the same time).  During the bottom-up CG traversal, when the
>> caller is visited, performing any function analysis for the caller will
>> destroy the analysis info of its callees, making it impossible to
>> incrementally update the caller's analysis info by 'importing' the
>> callees' info.
>>
> I think somebody like Chandler is covering the migration from the old pass
> manager to the new one, isn't he?
>
>

yes.
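
To make the constraint quoted above concrete, here is a minimal C++ sketch
with hypothetical types (this is not the pass-manager API): when only one
function's analysis results can be alive at a time, visiting the caller
evicts exactly the callee results the inliner would want to import.

  #include <cstdio>
  #include <string>

  struct FunctionAnalysis {        // stand-in for BFI/BPI/LoopInfo results
    std::string forFunction;
    int hotLoopDepth;
  };

  struct SingleSlotCache {         // only one function's results stay live
    FunctionAnalysis slot;
    FunctionAnalysis &run(const std::string &f, int depth) {
      slot = FunctionAnalysis{f, depth};  // analyzing f evicts the old owner
      return slot;
    }
  };

  int main() {
    SingleSlotCache cache;
    // Bottom-up CG order: the callee is analyzed first...
    cache.run("callee", /*depth=*/2);
    // ...but analyzing the caller overwrites the slot, so the callee's
    // frequency/loop info is gone before the caller can import it.
    FunctionAnalysis &caller = cache.run("caller", /*depth=*/0);
    std::printf("live analysis belongs to: %s\n", caller.forFunction.c_str());
  }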



>> Avoiding a compile-time increase by changing the inlining order is not good
>> design -- the inlining order should be driven by performance considerations.
>>
>
> Agreed. If we could move to the new pass manager right now, this would
> indeed be unnecessary.
>
>
right.



> Anyway, I didn't intend to design a complicated algorithm for the inline
> heuristic. I personally think it is hard for a complicated inlining
> heuristic to achieve miraculous results at such an early compilation stage.
>
>
It is indeed not simple, but worth the effort.


> I would also like to say that my solution fits the current inliner design
> quite well. My personal experimental results also show that the current
> inliner has been carefully tuned for a very comprehensive set of cases. In
> most scenarios the original inliner achieves a very good trade-off between
> code size and performance. Chandler once mentioned that the LLVM inliner
> has historically shown an over-10% code size advantage across a wide range
> of benchmark C++ applications.
>

Adding simple heuristics is easy; the problem with them (absent a clear
benefit/cost analysis) is that you can easily find a counter-example (low
SNR). Such heuristics are common in compilers that do benchmark hacks.
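
For example (purely hypothetical, not a comment on this specific patch), a
rule such as "do not inline a callee that already has more than N call
sites" is trivially defeated by small accessors:

  // A one-load accessor may have hundreds of call sites across a code base
  // and still be a clear inlining win at every single one of them.
  struct Point {
    int x, y;
    int getX() const { return x; }
  };

  int sum_x(const Point *p, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
      s += p[i].getX();   // a call-site-count cap would wrongly block this
    return s;
  }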

>
> I'm not sure what the more sophisticated heuristic rules you are talking
> about are, but if you intend to tune them for specific benchmarks, I would
> say that definitely does not make sense. I would appreciate it if you could
> share your results with the community. Then we can see how we can
> reasonably move on.
>

The plan is to tune the inliner (both non-PGO and PGO) for real-world
programs and cross-validate with the SPEC benchmarks.  This work will
definitely be shared with the community.

thanks,

David


>
> Thanks,
> -Jiangning
>
>