[PATCH] D58391: [TailCallElim] Add tailcall elimination pass to LTO Pipelines

Robert Lougher via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 19 10:40:54 PST 2019


rob.lougher created this revision.
rob.lougher added reviewers: hfinkel, rnk, chandlerc.
Herald added subscribers: jdoerfert, dexonsmith, steven_wu, kosarev, inglorion, mehdi_amini.
Herald added a project: LLVM.

If the following simple program is compiled with LTO, the call to foobar() will not be tailcall optimized. This is because the tailcall elimination pass is only run in the initial compilation step, so the results of link-time inlining are not visible to it.

  ------------ 1.c ----------------
  extern void foobar(void);
  extern void bar(int *);
  
  void foo() {
    int a[10];
    bar(a);
    foobar();
  }
  --------------------------------
  ------------ 2.c ----------------
  void bar(int *p) {
    *p = 10;
  }
  --------------------------------

$ clang -flto 1.c 2.c -c -O2
$ llvm-lto 1.o 2.o --exported-symbol=foo -save-merged-module -o 3.o
$ llvm-dis 3.o.merged.bc -o -

  ...
  ; Function Attrs: nounwind uwtable
  define dso_local void @foo() local_unnamed_addr #0 {
  entry:
    call void @foobar() #2
    ret void
  }
  ...

Even without link-time inlining, LTO may be able to perform additional tailcall optimization because the nocapture attribute becomes visible. For example, if the program above is modified to make bar() noinline, foobar() can still be tailcalled because the parameter to bar() is marked nocapture:

  ; Function Attrs: noinline norecurse nounwind uwtable writeonly
  define internal fastcc void @bar(i32* nocapture %p) unnamed_addr #3 {
  entry:
    store i32 10, i32* %p, align 4, !tbaa !4
    ret void
  }

(Before D53519 <https://reviews.llvm.org/D53519>, this case would not have been optimized due to the lifetime markers.)
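
To illustrate why nocapture matters here, the following is a minimal C sketch contrasting a callee that retains its pointer argument with one that only writes through it. All names other than foobar (bar_capture, bar_nocapture, foo_capture, foo_nocapture, saved, foobar_calls) are hypothetical and not part of this patch, and the comments describe the general reasoning rather than guaranteed compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration; these names are not part of the patch. */
static int *saved = NULL;     /* global that can retain a pointer */
static int foobar_calls = 0;  /* lets a caller observe that foobar() ran */

void foobar(void) { foobar_calls++; }

/* Captures its argument: the pointer is stored and outlives the call. */
void bar_capture(int *p) {
  assert(p != NULL);
  *p = 10;
  saved = p;
}

/* Does not capture: it only stores through the pointer, like bar() in 2.c. */
void bar_nocapture(int *p) {
  assert(p != NULL);
  *p = 10;
}

/* The address of a[] has escaped by the time foobar() is called, so in
   general a later call could still reach foo_capture's stack frame, and
   the frame cannot be assumed dead at that point. */
void foo_capture(void) {
  int a[10];
  bar_capture(a);
  foobar();
}

/* Nothing retains the address of a[] after bar_nocapture() returns, so
   the call to foobar() is a candidate for being marked as a tail call. */
void foo_nocapture(void) {
  int a[10];
  bar_nocapture(a);
  foobar();
}
```

When bar() lives in a separate translation unit, the initial compilation step cannot tell which of the two cases applies and has to assume the pointer may be captured. At link time the body of bar() is available, so the attribute can be inferred.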


Repository:
  rL LLVM

https://reviews.llvm.org/D58391

Files:
  lib/Passes/PassBuilder.cpp
  lib/Transforms/IPO/PassManagerBuilder.cpp
  test/LTO/X86/tailcallelim.ll
  test/Other/new-pm-lto-defaults.ll


Index: test/Other/new-pm-lto-defaults.ll
===================================================================
--- test/Other/new-pm-lto-defaults.ll
+++ test/Other/new-pm-lto-defaults.ll
@@ -81,6 +81,7 @@
 ; CHECK-O2-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O2-NEXT: Running analysis: LazyValueAnalysis
 ; CHECK-O2-NEXT: Running pass: SROA on foo
+; CHECK-O2-NEXT: Running pass: TailCallElimPass on foo
 ; CHECK-O2-NEXT: Finished llvm::Function pass manager run.
 ; CHECK-O2-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}PostOrderFunctionAttrsPass>
 ; CHECK-O2-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
Index: test/LTO/X86/tailcallelim.ll
===================================================================
--- test/LTO/X86/tailcallelim.ll
+++ test/LTO/X86/tailcallelim.ll
@@ -0,0 +1,22 @@
+; Check that the LTO pipelines add the Tail Call Elimination pass.
+
+; RUN: llvm-as < %s > %t1
+; RUN: llvm-lto -o %t2 %t1 --exported-symbol=foo -save-merged-module
+; RUN: llvm-dis < %t2.merged.bc | FileCheck %s
+
+; RUN: llvm-lto2 run -r %t1,foo,plx -r %t1,bar,plx -o %t3 %t1 -save-temps
+; RUN: llvm-dis < %t3.0.4.opt.bc | FileCheck %s
+
+; RUN: llvm-lto2 run -r %t1,foo,plx -r %t1,bar,plx -o %t4 %t1 -save-temps -use-new-pm
+; RUN: llvm-dis < %t4.0.4.opt.bc | FileCheck %s
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @foo() {
+; CHECK: tail call void @bar()
+  call void @bar()
+  ret void
+}
+
+declare void @bar()
Index: lib/Transforms/IPO/PassManagerBuilder.cpp
===================================================================
--- lib/Transforms/IPO/PassManagerBuilder.cpp
+++ lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -864,6 +864,10 @@
   // Break up allocas
   PM.add(createSROAPass());
 
+  // LTO provides additional opportunities for tailcall elimination due to
+  // link-time inlining, and visibility of nocapture attribute.
+  PM.add(createTailCallEliminationPass());
+
   // Run a few AA driven optimizations here and now, to cleanup the code.
   PM.add(createPostOrderFunctionAttrsLegacyPass()); // Add nocapture.
   PM.add(createGlobalsAAWrapperPass()); // IP alias analysis.
Index: lib/Passes/PassBuilder.cpp
===================================================================
--- lib/Passes/PassBuilder.cpp
+++ lib/Passes/PassBuilder.cpp
@@ -1147,6 +1147,10 @@
   // Break up allocas
   FPM.addPass(SROA());
 
+  // LTO provides additional opportunities for tailcall elimination due to
+  // link-time inlining, and visibility of nocapture attribute.
+  FPM.addPass(TailCallElimPass());
+
   // Run a few AA driver optimizations here and now to cleanup the code.
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
 



