[PATCH] Inliner Enhancement

Thu Mar 19 19:53:43 PDT 2015

Hi Yin,

2015-03-20 7:04 GMT+08:00 Yin Ma <yinma at codeaurora.org>:

> Hi Jiangning,
>
> I reviewed your code. I like the loop analyzer. However I have two
> concerns for your implementation for A->B->Cx problem.
>
> 1. When you visit a SCC, like B if any Cx will be inlined into B, you will
> not inline Cx into B until visiting A for (unsigned i = 0; i <
> CallSitesForABC.size(); i++) if (!shouldInline(CallSitesForABC[i], ABC,
> false)) if (ABC) return false; If so, the logic is not complete, because if
> B is also an extern function, none of Cx will be inlined into B at the end
> because each SCC will be only visited once. At the end, only A will be
> inlined with B and C, B becomes an un-inlined version, which is a big
> problem. I didn’t see the code or understand where to make sure all C->B
> are inlined correctly.
>

This is great, and I didn't notice this problem. I will try to combine your
patch with mine to see what would happen.

> 2. For LLVM, when inliner inlines A->B->C, the original order is bottom
> up, so C->B then CB->A. Actually, this order matters because for every step
> of inlining, the inliner will call a code simplifier pass to optimize the
> inlined code. The bottom up order will make C->B code smaller. In some
> cases, if you delayed C into B for A,  the order will changes to B->A then
> all C->BA. Simplifier changes, you will see performance regression in some
> cases due to order change. So you have to make sure after you come to A,
> you still use bottom up inlining order to inline Cx -> B then -> A.
>

Yeah, you are correct. But the change from bottom-up to top-down would just
stop when coming across a big function, which couldn't be further inlined.
So essentially it is still bottom-up, and for small functions closing to
leaf we change to use top-down. Yes, there will be some regressions, but
there will be some improvements as well, because we will be able to inline
more callees to A, and the are all small functions. My experiment of
applying this change independently shows performance change are all
fluctuations, some are good and some others are bad. Overall, I didn't see
improvement or regression. But using this method independently can reduce
compile-time a lot. You may have a try on llvm bootstrap.

> Actually, after we meet in the LLVM dev. I have implemented my own version
> of deferred inlining for solving A->B->C. I called it as delayed inlining.
> I will upload it into phabricator for you and everybody to review. I will
> post the link today or tomorrow.
>
> Yin
>
>
> REPOSITORY
>   rL LLVM
>
> http://reviews.llvm.org/D8408
>
> EMAIL PREFERENCES
>   http://reviews.llvm.org/settings/panel/emailpreferences/
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150320/2db9c68e/attachment.html>