[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)

Teresa Johnson tejohnson at google.com
Mon Jun 8 09:03:20 PDT 2015


Talked to Eric Fri and he suggested that this might be the first of
several places where we want behavior of LTO compiles to diverge from
normal -O2 compiles. So for now I have implemented this such that we
pass down -flto to the -cc1 job, and that gets propagated as a code
gen option and into the PassManagerBuilder. I have left the current
logic translating -flto to the -emit-llvm-bc option, since IMO that is
cleaner for controlling the output type. Let me know what you think of
saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or
if is it better to record just the individual behaviors we want (i.e.
change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to
"ElimAvailExtern" or something like that).

Patches attached. I modified an existing available external clang test
to check for the new O2 (LTO vs not) behaviors.

Thanks,
Teresa

On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
> On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> wrote:
>>
>> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote:
>> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com>
>> > wrote:
>> >>
>> >> Agreed. Although I assume you mean invoke the new pass under a
>> >> ThinLTO-only option so that avail extern are not dropped in the
>> >> compile pass before the LTO link?
>> >
>> >
>> > No, this pass would actually be an improvement to the standard -O2
>> > pipeline.
>> > The special case is the regular LTO pass pipeline, which wants to run
>> > GlobalDCE but doesn't want to drop available_externally function
>> > definitions
>> > until after linker-stage inlining.
>>
>> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses
>> this same -O2 optimization pipeline as without LTO. Clang communicates
>> the fact that we are doing an LTO compile to llvm via the
>> -emit-llvm-bc flag, which just tells it to emit bitcode and exit
>> before codegen. So I would either need to key off of that or setup a
>> new flag to indicate to llvm that we are doing an LTO -c compile. Or
>> is there some other way that I am missing?
>>
>> Incidentally, we'll also want to skip this new pass and keep any
>> referenced avail extern functions in the ThinLTO -c compile step for
>> the same reasons (and there are no imported functions yet at that
>> point).
>>
>
> Ultimately for any planned LTO build we're going to want to do a different
> pass pipeline, it's probably worth considering what should be done before
> and during LTO.
>
> -eric
>
>>
>> Teresa
>>
>> >
>> > Consider this test case:
>> >
>> > declare void @blah()
>> > define i32 @main() {
>> >   call void @foo()
>> >   ret i32 0
>> > }
>> > define available_externally void @foo() noinline {
>> >   call void @bar()
>> >   ret void
>> > }
>> > define linkonce_odr void @bar() noinline {
>> >   call void @blah()
>> >   ret void
>> > }
>> >
>> > If I run opt -O2 on this and feed it to llc, it actually generates code
>> > for
>> > bar, even though there are no actual references to bar in the final
>> > code:
>> >
>> > main:                                   # @main
>> >         pushq   %rax
>> >         callq   foo
>> >         xorl    %eax, %eax
>> >         popq    %rdx
>> >         retq
>> >
>> > bar:                                    # @bar
>> >         jmp     blah                    # TAILCALL
>> >
>> > This corner case happens to come up organically with dllimported
>> > classes,
>> > which is why I happened to think of it. :) I'm happy with a flag to
>> > globaldce for LTO and the original patch, honestly.
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
-------------- next part --------------
Index: include/llvm/InitializePasses.h
===================================================================
--- include/llvm/InitializePasses.h	(revision 237590)
+++ include/llvm/InitializePasses.h	(working copy)
@@ -130,6 +130,7 @@ void initializeSanitizerCoverageModulePass(PassReg
 void initializeDataFlowSanitizerPass(PassRegistry&);
 void initializeScalarizerPass(PassRegistry&);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
+void initializeElimAvailExternPass(PassRegistry&);
 void initializeExpandISelPseudosPass(PassRegistry&);
 void initializeFunctionAttrsPass(PassRegistry&);
 void initializeGCMachineCodeAnalysisPass(PassRegistry&);
Index: include/llvm/Transforms/IPO.h
===================================================================
--- include/llvm/Transforms/IPO.h	(revision 237590)
+++ include/llvm/Transforms/IPO.h	(working copy)
@@ -71,6 +71,12 @@ ModulePass *createGlobalOptimizerPass();
 ModulePass *createGlobalDCEPass();
 
 //===----------------------------------------------------------------------===//
+/// createElimAvailExternPass - This transform is designed to eliminate
+/// available external globals (functions or global variables)
+///
+ModulePass *createElimAvailExternPass();
+
+//===----------------------------------------------------------------------===//
 /// createGVExtractionPass - If deleteFn is true, this pass deletes
 /// the specified global values. Otherwise, it deletes as much of the module as
 /// possible, except for the global values specified.
Index: include/llvm/Transforms/IPO/PassManagerBuilder.h
===================================================================
--- include/llvm/Transforms/IPO/PassManagerBuilder.h	(revision 237590)
+++ include/llvm/Transforms/IPO/PassManagerBuilder.h	(working copy)
@@ -121,6 +121,7 @@ class PassManagerBuilder {
   bool VerifyInput;
   bool VerifyOutput;
   bool MergeFunctions;
+  bool LTO;
 
 private:
   /// ExtensionList - This is list of all of the extensions that are registered.
Index: lib/Transforms/IPO/CMakeLists.txt
===================================================================
--- lib/Transforms/IPO/CMakeLists.txt	(revision 237590)
+++ lib/Transforms/IPO/CMakeLists.txt	(working copy)
@@ -3,6 +3,7 @@ add_llvm_library(LLVMipo
   BarrierNoopPass.cpp
   ConstantMerge.cpp
   DeadArgumentElimination.cpp
+  ElimAvailExtern.cpp
   ExtractGV.cpp
   FunctionAttrs.cpp
   GlobalDCE.cpp
Index: lib/Transforms/IPO/ElimAvailExtern.cpp
===================================================================
--- lib/Transforms/IPO/ElimAvailExtern.cpp	(revision 0)
+++ lib/Transforms/IPO/ElimAvailExtern.cpp	(working copy)
@@ -0,0 +1,90 @@
+//===-- ElimAvailExtern.cpp - DCE unreachable internal functions ----------------===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This transform is designed to eliminate available external global
+// definitions from the program, turning them into declarations.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/IPO.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/Utils/CtorUtils.h"
+#include "llvm/Transforms/Utils/GlobalStatus.h"
+#include "llvm/Pass.h"
+using namespace llvm;
+
+#define DEBUG_TYPE "elimavailextern"
+
+STATISTIC(NumAliases  , "Number of global aliases removed");
+STATISTIC(NumFunctions, "Number of functions removed");
+STATISTIC(NumVariables, "Number of global variables removed");
+
+namespace {
+  struct ElimAvailExtern : public ModulePass {
+    static char ID; // Pass identification, replacement for typeid
+    ElimAvailExtern() : ModulePass(ID) {
+      initializeElimAvailExternPass(*PassRegistry::getPassRegistry());
+    }
+
+    // run - Do the ElimAvailExtern pass on the specified module, optionally
+    // updating the specified callgraph to reflect the changes.
+    //
+    bool runOnModule(Module &M) override;
+  };
+}
+
+char ElimAvailExtern::ID = 0;
+INITIALIZE_PASS(ElimAvailExtern, "elimavailextern",
+                "Eliminate Available External Globals", false, false)
+
+ModulePass *llvm::createElimAvailExternPass() { return new ElimAvailExtern(); }
+
+bool ElimAvailExtern::runOnModule(Module &M) {
+  bool Changed = false;
+
+  // Drop initializers of available externally global variables.
+  for (Module::global_iterator I = M.global_begin(), E = M.global_end();
+       I != E; ++I) {
+    if (!I->hasAvailableExternallyLinkage())
+      continue;
+    if (I->hasInitializer()) {
+      Constant *Init = I->getInitializer();
+      I->setInitializer(nullptr);
+      if (isSafeToDestroyConstant(Init))
+        Init->destroyConstant();
+    }
+    I->removeDeadConstantUsers();
+    NumVariables++;
+  }
+
+  // Drop the bodies of available externally functions.
+  for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) {
+    if (!I->hasAvailableExternallyLinkage())
+      continue;
+    if (!I->isDeclaration())
+      I->deleteBody();
+    I->removeDeadConstantUsers();
+    NumFunctions++;
+  }
+
+  // Drop targets of available externally aliases.
+  for (Module::alias_iterator I = M.alias_begin(), E = M.alias_end(); I != E;
+       ++I) {
+    if (!I->hasAvailableExternallyLinkage())
+      continue;
+    I->setAliasee(nullptr);
+    I->removeDeadConstantUsers();
+    NumAliases++;
+  }
+
+  return Changed;
+}
Index: lib/Transforms/IPO/PassManagerBuilder.cpp
===================================================================
--- lib/Transforms/IPO/PassManagerBuilder.cpp	(revision 237590)
+++ lib/Transforms/IPO/PassManagerBuilder.cpp	(working copy)
@@ -106,6 +106,7 @@ PassManagerBuilder::PassManagerBuilder() {
     VerifyInput = false;
     VerifyOutput = false;
     MergeFunctions = false;
+    LTO = false;
 }
 
 PassManagerBuilder::~PassManagerBuilder() {
@@ -407,6 +408,16 @@ void PassManagerBuilder::populateModulePassManager
     // GlobalOpt already deletes dead functions and globals, at -O2 try a
     // late pass of GlobalDCE.  It is capable of deleting dead cycles.
     if (OptLevel > 1) {
+      if (!LTO) {
+        // Remove avail extern fns and globals definitions if we aren't
+        // doing an -flto compilation. For LTO we will preserve these
+        // so they are eligible for inlining at link-time. Note if they
+        // are unreferenced they will be removed by GlobalDCE below, so
+        // this only impacts referenced available externally globals.
+        // Eventually they will be suppressed during codegen, but eliminating
+        // here is a compile-time savings.
+        MPM.add(createElimAvailExternPass());
+      }
       MPM.add(createGlobalDCEPass());         // Remove dead fns and globals.
       MPM.add(createConstantMergePass());     // Merge dup global constants
     }
-------------- next part --------------
Index: include/clang/Driver/Options.td
===================================================================
--- include/clang/Driver/Options.td	(revision 237590)
+++ include/clang/Driver/Options.td	(working copy)
@@ -636,7 +636,7 @@ def flat__namespace : Flag<["-"], "flat_namespace"
 def flax_vector_conversions : Flag<["-"], "flax-vector-conversions">, Group<f_Group>;
 def flimited_precision_EQ : Joined<["-"], "flimited-precision=">, Group<f_Group>;
 def flto_EQ : Joined<["-"], "flto=">, Group<clang_ignored_gcc_optimization_f_Group>;
-def flto : Flag<["-"], "flto">, Group<f_Group>;
+def flto : Flag<["-"], "flto">, Flags<[CC1Option]>, Group<f_Group>;
 def fno_lto : Flag<["-"], "fno-lto">, Group<f_Group>;
 def fmacro_backtrace_limit_EQ : Joined<["-"], "fmacro-backtrace-limit=">,
                                 Group<f_Group>, Flags<[DriverOption, CoreOption]>;
Index: include/clang/Frontend/CodeGenOptions.def
===================================================================
--- include/clang/Frontend/CodeGenOptions.def	(revision 237590)
+++ include/clang/Frontend/CodeGenOptions.def	(working copy)
@@ -67,6 +67,7 @@ CODEGENOPT(InstrumentFunctions , 1, 0) ///< Set wh
 CODEGENOPT(InstrumentForProfiling , 1, 0) ///< Set when -pg is enabled.
 CODEGENOPT(LessPreciseFPMAD  , 1, 0) ///< Enable less precise MAD instructions to
                                      ///< be generated.
+CODEGENOPT(LTO               , 1, 0) ///< Set when -flto is enabled.
 CODEGENOPT(MergeAllConstants , 1, 1) ///< Merge identical constants.
 CODEGENOPT(MergeFunctions    , 1, 0) ///< Set when -fmerge-functions is enabled.
 CODEGENOPT(MSVolatile        , 1, 0) ///< Set when /volatile:ms is enabled.
Index: lib/CodeGen/BackendUtil.cpp
===================================================================
--- lib/CodeGen/BackendUtil.cpp	(revision 237590)
+++ lib/CodeGen/BackendUtil.cpp	(working copy)
@@ -287,6 +287,7 @@ void EmitAssemblyHelper::CreatePasses() {
   PMBuilder.DisableUnitAtATime = !CodeGenOpts.UnitAtATime;
   PMBuilder.DisableUnrollLoops = !CodeGenOpts.UnrollLoops;
   PMBuilder.MergeFunctions = CodeGenOpts.MergeFunctions;
+  PMBuilder.LTO = CodeGenOpts.LTO;
   PMBuilder.RerollLoops = CodeGenOpts.RerollLoops;
 
   PMBuilder.addExtension(PassManagerBuilder::EP_EarlyAsPossible,
Index: lib/Driver/Tools.cpp
===================================================================
--- lib/Driver/Tools.cpp	(revision 237590)
+++ lib/Driver/Tools.cpp	(working copy)
@@ -2676,6 +2676,10 @@ void Clang::ConstructJob(Compilation &C, const Job
     assert((isa<CompileJobAction>(JA) || isa<BackendJobAction>(JA)) &&
            "Invalid action for clang tool.");
 
+    if (JA.getType() == types::TY_LTO_IR ||
+        JA.getType() == types::TY_LTO_BC) {
+      CmdArgs.push_back("-flto");
+    }
     if (JA.getType() == types::TY_Nothing) {
       CmdArgs.push_back("-fsyntax-only");
     } else if (JA.getType() == types::TY_LLVM_IR ||
Index: lib/Frontend/CompilerInvocation.cpp
===================================================================
--- lib/Frontend/CompilerInvocation.cpp	(revision 237590)
+++ lib/Frontend/CompilerInvocation.cpp	(working copy)
@@ -489,6 +489,8 @@ static bool ParseCodeGenArgs(CodeGenOptions &Opts,
 
   Opts.MergeFunctions = Args.hasArg(OPT_fmerge_functions);
 
+  Opts.LTO = Args.hasArg(OPT_flto);
+
   Opts.MSVolatile = Args.hasArg(OPT_fms_volatile);
 
   Opts.VectorizeBB = Args.hasArg(OPT_vectorize_slp_aggressive);
Index: test/CodeGen/available-externally-suppress.c
===================================================================
--- test/CodeGen/available-externally-suppress.c	(revision 237590)
+++ test/CodeGen/available-externally-suppress.c	(working copy)
@@ -1,6 +1,10 @@
 // RUN: %clang_cc1 -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s
+// RUN: %clang_cc1 -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s
+// RUN: %clang_cc1 -flto -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s -check-prefix=LTO
 
 // Ensure that we don't emit available_externally functions at -O0.
+// Also should not emit them at -O2, unless -flto is present in which case
+// we should preserve them for link-time inlining decisions.
 int x;
 
 inline void f0(int y) { x = y; }
@@ -7,6 +11,8 @@ inline void f0(int y) { x = y; }
 
 // CHECK-LABEL: define void @test()
 // CHECK: declare void @f0(i32)
+// LTO-LABEL: define void @test()
+// LTO: define available_externally void @f0
 void test() {
   f0(17);
 }
@@ -19,9 +25,13 @@ inline int __attribute__((always_inline)) f1(int x
 }
 
 // CHECK: @test1
+// LTO: @test1
 int test1(int x) {
   // CHECK: br i1
   // CHECK-NOT: call {{.*}} @f1
   // CHECK: ret i32
+  // LTO: br i1
+  // LTO-NOT: call {{.*}} @f1
+  // LTO: ret i32
   return f1(x);
 }


More information about the llvm-dev mailing list