[PATCH] Inliner Enhancement
yinma at codeaurora.org
Thu Mar 19 16:04:18 PDT 2015
I reviewed your code. I like the loop analyzer. However, I have two concerns about your implementation of the A->B->Cx problem.
1. When you visit an SCC such as B, and some Cx would be inlined into B, you do not inline Cx into B until you visit A:

    for (unsigned i = 0; i < CallSitesForABC.size(); i++)
      if (!shouldInline(CallSitesForABC[i], ABC, false))
        if (ABC)
          return false;

If so, the logic is incomplete. If B is also an extern function, none of the Cx will be inlined into B in the end, because each SCC is visited only once. The result is that only A gets B and Cx inlined, while B itself stays an un-inlined version, which is a big problem. I didn't see the code that makes sure all C->B are inlined correctly, or I didn't understand where it is.
2. In LLVM, when the inliner handles A->B->C, the original order is bottom-up: C->B first, then CB->A. This order matters because after every inlining step the inliner calls a code simplifier pass to optimize the inlined code. The bottom-up order makes the C->B code smaller. If you delay inlining C into B until A, the order changes to B->A first and then all Cx->BA. The simplifier then sees different code, and you will see performance regressions in some cases because of the order change. So once you reach A, you still have to use the bottom-up order: inline Cx->B first, then the result into A.
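To make the ordering in point 2 concrete, here is a minimal sketch of a bottom-up (post-order) walk over a toy call graph. It is not LLVM code: `CallGraph`, `visitBottomUp`, and `bottomUpInlineSteps` are hypothetical names made up for illustration, the graph is assumed to be acyclic, and each "step" is just a string instead of a real inlining decision.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph (hypothetical): caller name -> callee names.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Post-order DFS: all callees of F are processed before F itself.
// Assumes an acyclic graph, i.e. every SCC is a single function.
static void visitBottomUp(const CallGraph &CG, const std::string &F,
                          std::set<std::string> &Visited,
                          std::vector<std::string> &Steps) {
  if (!Visited.insert(F).second)
    return; // each function is visited only once
  auto It = CG.find(F);
  if (It == CG.end())
    return; // leaf function: nothing to inline into it
  for (const std::string &Callee : It->second)
    visitBottomUp(CG, Callee, Visited, Steps);
  // Only now record "callee -> caller" steps for F, so that the callee
  // bodies are already final (inlined and simplified) at this point.
  for (const std::string &Callee : It->second)
    Steps.push_back(Callee + " -> " + F);
}

// Returns the inline steps in bottom-up order, starting from Root.
std::vector<std::string> bottomUpInlineSteps(const CallGraph &CG,
                                             const std::string &Root) {
  std::set<std::string> Visited;
  std::vector<std::string> Steps;
  visitBottomUp(CG, Root, Visited, Steps);
  return Steps;
}
```

For the graph A->B->{C1, C2}, calling `bottomUpInlineSteps(CG, "A")` records the Cx->B steps before the B->A step. A deferred scheme that wants to keep the existing simplifier behaviour would need to replay the steps in this same order once it reaches A.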
Actually, after we met at the LLVM dev meeting, I implemented my own version of deferred inlining to solve A->B->C. I call it delayed inlining. I will upload it to Phabricator for you and everybody else to review, and I will post the link today or tomorrow.