[PATCH] D40097: [Inliner][NewPM] Inline functions outside of current SCC first

Haicheng Wu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 13:08:13 PST 2017


haicheng created this revision.
Herald added a subscriber: mcrosier.

When we enabled the new PM, we noticed several big regressions.  One reason is that the new PM has different inline orders.  The legacy PM has a step that the new PM does not have which is moving the call sites calling functions in the current SCC to the end of the iteration list.  I don't know whether the new PM omits this intentionally or not, but I can see two benefits of doing this.

1. This step first inlines functions outside of the current SCC to discourage functions inside the same SCC inlining each other which can bloat up the code size.



1. Inlining a callee inside the current SCC likely makes the caller recursive.  LLVM does not inline any recursive functions.

One drawback I can think of is  that callsites from the same caller are stored in two places instead of one.  Thus, we may have to switch function proxies more often.

This patch just copied the code from the legacy PM to the new PM.  Here is the SPEC performance and code size change

|                 | code size (%) (- is smaller) | performance (%) (+ is faster) |
| spec2000/ammp   | -1.32                        | +0.02                         |
| spec2000/vortex | -1.06                        | -0.21                         |
| spec2006/gobmk  | +0.8                         | +1.22                         |
| spec2006/povray | -0.11                        | +31.90                        |
| spec2017/leela  | -1.67                        | -0.78                         |
| spec2017/povray | -0.24                        | +27.12                        |


Repository:
  rL LLVM

https://reviews.llvm.org/D40097

Files:
  lib/Transforms/IPO/Inliner.cpp
  test/Transforms/Inline/cgscc-order.ll


Index: test/Transforms/Inline/cgscc-order.ll
===================================================================
--- /dev/null
+++ test/Transforms/Inline/cgscc-order.ll
@@ -0,0 +1,36 @@
+; RUN: opt < %s -passes='cgscc(inline)' -S -inline-threshold=30 | FileCheck %s
+
+ at glbl = external global i32
+
+define void @out() {
+  store i32 0, i32* @glbl
+  store i32 1, i32* @glbl
+  store i32 2, i32* @glbl
+  store i32 3, i32* @glbl
+  store i32 4, i32* @glbl
+  store i32 5, i32* @glbl
+  store i32 6, i32* @glbl
+  store i32 7, i32* @glbl
+  store i32 8, i32* @glbl
+  store i32 9, i32* @glbl
+  ret void
+}
+
+define void @scc_a() {
+entry:
+  call void @scc_b()
+  call void @out()
+  ret void
+}
+
+define void @scc_b() {
+; CHECK-LABEL: define void @scc_b(
+; Make sure out is inlined into scc_a first so that scc_a is too big to be 
+; inined into scc_b
+entry:
+; CHECK:    call
+; CHECK-NEXT:    ret
+  call void @scc_a()
+  ret void
+}
+
Index: lib/Transforms/IPO/Inliner.cpp
===================================================================
--- lib/Transforms/IPO/Inliner.cpp
+++ lib/Transforms/IPO/Inliner.cpp
@@ -830,6 +830,8 @@
   // incrementally maknig a single function grow in a super linear fashion.
   SmallVector<std::pair<CallSite, int>, 16> Calls;
 
+  SmallPtrSet<Function *, 8> SCCFunctions;
+
   // Populate the initial list of calls in this SCC.
   for (auto &N : InitialC) {
     // We want to generally process call sites top-down in order for
@@ -842,11 +844,20 @@
         if (Function *Callee = CS.getCalledFunction())
           if (!Callee->isDeclaration())
             Calls.push_back({CS, -1});
+    SCCFunctions.insert(&N.getFunction());
   }
   if (Calls.empty())
     return PreservedAnalyses::all();
 
-  // Capture updatable variables for the current SCC and RefSCC.
+  // Now that we have all of the call sites, move the ones to functions in the
+  // current SCC to the end of the list.
+  unsigned FirstCallInSCC = Calls.size();
+  for (unsigned i = 0; i < FirstCallInSCC; ++i)
+    if (Function *F = Calls[i].first.getCalledFunction())
+      if (SCCFunctions.count(F))
+        std::swap(Calls[i--], Calls[--FirstCallInSCC]);
+
+  // Capture updatable variables for the current SC and RefSCC.
   auto *C = &InitialC;
   auto *RC = &C->getOuterRefSCC();
 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D40097.123070.patch
Type: text/x-patch
Size: 2300 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171115/bd1ec082/attachment.bin>


More information about the llvm-commits mailing list