[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
Teresa Johnson
tejohnson at google.com
Mon Jun 8 09:03:20 PDT 2015
Talked to Eric Fri and he suggested that this might be the first of
several places where we want behavior of LTO compiles to diverge from
normal -O2 compiles. So for now I have implemented this such that we
pass down -flto to the -cc1 job, and that gets propagated as a code
gen option and into the PassManagerBuilder. I have left the current
logic translating -flto to the -emit-llvm-bc option, since IMO that is
cleaner for controlling the output type. Let me know what you think of
saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or
if is it better to record just the individual behaviors we want (i.e.
change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to
"ElimAvailExtern" or something like that).
Patches attached. I modified an existing available external clang test
to check for the new O2 (LTO vs not) behaviors.
Thanks,
Teresa
On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
> On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com> wrote:
>>
>> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote:
>> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com>
>> > wrote:
>> >>
>> >> Agreed. Although I assume you mean invoke the new pass under a
>> >> ThinLTO-only option so that avail extern are not dropped in the
>> >> compile pass before the LTO link?
>> >
>> >
>> > No, this pass would actually be an improvement to the standard -O2
>> > pipeline.
>> > The special case is the regular LTO pass pipeline, which wants to run
>> > GlobalDCE but doesn't want to drop available_externally function
>> > definitions
>> > until after linker-stage inlining.
>>
>> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses
>> this same -O2 optimization pipeline as without LTO. Clang communicates
>> the fact that we are doing an LTO compile to llvm via the
>> -emit-llvm-bc flag, which just tells it to emit bitcode and exit
>> before codegen. So I would either need to key off of that or setup a
>> new flag to indicate to llvm that we are doing an LTO -c compile. Or
>> is there some other way that I am missing?
>>
>> Incidentally, we'll also want to skip this new pass and keep any
>> referenced avail extern functions in the ThinLTO -c compile step for
>> the same reasons (and there are no imported functions yet at that
>> point).
>>
>
> Ultimately for any planned LTO build we're going to want to do a different
> pass pipeline, it's probably worth considering what should be done before
> and during LTO.
>
> -eric
>
>>
>> Teresa
>>
>> >
>> > Consider this test case:
>> >
>> > declare void @blah()
>> > define i32 @main() {
>> > call void @foo()
>> > ret i32 0
>> > }
>> > define available_externally void @foo() noinline {
>> > call void @bar()
>> > ret void
>> > }
>> > define linkonce_odr void @bar() noinline {
>> > call void @blah()
>> > ret void
>> > }
>> >
>> > If I run opt -O2 on this and feed it to llc, it actually generates code
>> > for
>> > bar, even though there are no actual references to bar in the final
>> > code:
>> >
>> > main: # @main
>> > pushq %rax
>> > callq foo
>> > xorl %eax, %eax
>> > popq %rdx
>> > retq
>> >
>> > bar: # @bar
>> > jmp blah # TAILCALL
>> >
>> > This corner case happens to come up organically with dllimported
>> > classes,
>> > which is why I happened to think of it. :) I'm happy with a flag to
>> > globaldce for LTO and the original patch, honestly.
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
-------------- next part --------------
Index: include/llvm/InitializePasses.h
===================================================================
--- include/llvm/InitializePasses.h (revision 237590)
+++ include/llvm/InitializePasses.h (working copy)
@@ -130,6 +130,7 @@ void initializeSanitizerCoverageModulePass(PassReg
void initializeDataFlowSanitizerPass(PassRegistry&);
void initializeScalarizerPass(PassRegistry&);
void initializeEarlyCSELegacyPassPass(PassRegistry &);
+void initializeElimAvailExternPass(PassRegistry&);
void initializeExpandISelPseudosPass(PassRegistry&);
void initializeFunctionAttrsPass(PassRegistry&);
void initializeGCMachineCodeAnalysisPass(PassRegistry&);
Index: include/llvm/Transforms/IPO.h
===================================================================
--- include/llvm/Transforms/IPO.h (revision 237590)
+++ include/llvm/Transforms/IPO.h (working copy)
@@ -71,6 +71,12 @@ ModulePass *createGlobalOptimizerPass();
ModulePass *createGlobalDCEPass();
//===----------------------------------------------------------------------===//
+/// createElimAvailExternPass - This transform is designed to eliminate
+/// available external globals (functions or global variables)
+///
+ModulePass *createElimAvailExternPass();
+
+//===----------------------------------------------------------------------===//
/// createGVExtractionPass - If deleteFn is true, this pass deletes
/// the specified global values. Otherwise, it deletes as much of the module as
/// possible, except for the global values specified.
Index: include/llvm/Transforms/IPO/PassManagerBuilder.h
===================================================================
--- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590)
+++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy)
@@ -121,6 +121,7 @@ class PassManagerBuilder {
bool VerifyInput;
bool VerifyOutput;
bool MergeFunctions;
+ bool LTO;
private:
/// ExtensionList - This is list of all of the extensions that are registered.
Index: lib/Transforms/IPO/CMakeLists.txt
===================================================================
--- lib/Transforms/IPO/CMakeLists.txt (revision 237590)
+++ lib/Transforms/IPO/CMakeLists.txt (working copy)
@@ -3,6 +3,7 @@ add_llvm_library(LLVMipo
BarrierNoopPass.cpp
ConstantMerge.cpp
DeadArgumentElimination.cpp
+ ElimAvailExtern.cpp
ExtractGV.cpp
FunctionAttrs.cpp
GlobalDCE.cpp
Index: lib/Transforms/IPO/ElimAvailExtern.cpp
===================================================================
--- lib/Transforms/IPO/ElimAvailExtern.cpp (revision 0)
+++ lib/Transforms/IPO/ElimAvailExtern.cpp (working copy)
@@ -0,0 +1,90 @@
+//===-- ElimAvailExtern.cpp - DCE unreachable internal functions ----------------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This transform is designed to eliminate available external global
+// definitions from the program, turning them into declarations.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/IPO.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/Utils/CtorUtils.h"
+#include "llvm/Transforms/Utils/GlobalStatus.h"
+#include "llvm/Pass.h"
+using namespace llvm;
+
+#define DEBUG_TYPE "elimavailextern"
+
+STATISTIC(NumAliases , "Number of global aliases removed");
+STATISTIC(NumFunctions, "Number of functions removed");
+STATISTIC(NumVariables, "Number of global variables removed");
+
+namespace {
+ struct ElimAvailExtern : public ModulePass {
+ static char ID; // Pass identification, replacement for typeid
+ ElimAvailExtern() : ModulePass(ID) {
+ initializeElimAvailExternPass(*PassRegistry::getPassRegistry());
+ }
+
+ // run - Do the ElimAvailExtern pass on the specified module, optionally
+ // updating the specified callgraph to reflect the changes.
+ //
+ bool runOnModule(Module &M) override;
+ };
+}
+
+char ElimAvailExtern::ID = 0;
+INITIALIZE_PASS(ElimAvailExtern, "elimavailextern",
+ "Eliminate Available External Globals", false, false)
+
+ModulePass *llvm::createElimAvailExternPass() { return new ElimAvailExtern(); }
+
+bool ElimAvailExtern::runOnModule(Module &M) {
+ bool Changed = false;
+
+ // Drop initializers of available externally global variables.
+ for (Module::global_iterator I = M.global_begin(), E = M.global_end();
+ I != E; ++I) {
+ if (!I->hasAvailableExternallyLinkage())
+ continue;
+ if (I->hasInitializer()) {
+ Constant *Init = I->getInitializer();
+ I->setInitializer(nullptr);
+ if (isSafeToDestroyConstant(Init))
+ Init->destroyConstant();
+ }
+ I->removeDeadConstantUsers();
+ NumVariables++;
+ }
+
+ // Drop the bodies of available externally functions.
+ for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) {
+ if (!I->hasAvailableExternallyLinkage())
+ continue;
+ if (!I->isDeclaration())
+ I->deleteBody();
+ I->removeDeadConstantUsers();
+ NumFunctions++;
+ }
+
+ // Drop targets of available externally aliases.
+ for (Module::alias_iterator I = M.alias_begin(), E = M.alias_end(); I != E;
+ ++I) {
+ if (!I->hasAvailableExternallyLinkage())
+ continue;
+ I->setAliasee(nullptr);
+ I->removeDeadConstantUsers();
+ NumAliases++;
+ }
+
+ return Changed;
+}
Index: lib/Transforms/IPO/PassManagerBuilder.cpp
===================================================================
--- lib/Transforms/IPO/PassManagerBuilder.cpp (revision 237590)
+++ lib/Transforms/IPO/PassManagerBuilder.cpp (working copy)
@@ -106,6 +106,7 @@ PassManagerBuilder::PassManagerBuilder() {
VerifyInput = false;
VerifyOutput = false;
MergeFunctions = false;
+ LTO = false;
}
PassManagerBuilder::~PassManagerBuilder() {
@@ -407,6 +408,16 @@ void PassManagerBuilder::populateModulePassManager
// GlobalOpt already deletes dead functions and globals, at -O2 try a
// late pass of GlobalDCE. It is capable of deleting dead cycles.
if (OptLevel > 1) {
+ if (!LTO) {
+ // Remove avail extern fns and globals definitions if we aren't
+ // doing an -flto compilation. For LTO we will preserve these
+ // so they are eligible for inlining at link-time. Note if they
+ // are unreferenced they will be removed by GlobalDCE below, so
+ // this only impacts referenced available externally globals.
+ // Eventually they will be suppressed during codegen, but eliminating
+ // here is a compile-time savings.
+ MPM.add(createElimAvailExternPass());
+ }
MPM.add(createGlobalDCEPass()); // Remove dead fns and globals.
MPM.add(createConstantMergePass()); // Merge dup global constants
}
-------------- next part --------------
Index: include/clang/Driver/Options.td
===================================================================
--- include/clang/Driver/Options.td (revision 237590)
+++ include/clang/Driver/Options.td (working copy)
@@ -636,7 +636,7 @@ def flat__namespace : Flag<["-"], "flat_namespace"
def flax_vector_conversions : Flag<["-"], "flax-vector-conversions">, Group<f_Group>;
def flimited_precision_EQ : Joined<["-"], "flimited-precision=">, Group<f_Group>;
def flto_EQ : Joined<["-"], "flto=">, Group<clang_ignored_gcc_optimization_f_Group>;
-def flto : Flag<["-"], "flto">, Group<f_Group>;
+def flto : Flag<["-"], "flto">, Flags<[CC1Option]>, Group<f_Group>;
def fno_lto : Flag<["-"], "fno-lto">, Group<f_Group>;
def fmacro_backtrace_limit_EQ : Joined<["-"], "fmacro-backtrace-limit=">,
Group<f_Group>, Flags<[DriverOption, CoreOption]>;
Index: include/clang/Frontend/CodeGenOptions.def
===================================================================
--- include/clang/Frontend/CodeGenOptions.def (revision 237590)
+++ include/clang/Frontend/CodeGenOptions.def (working copy)
@@ -67,6 +67,7 @@ CODEGENOPT(InstrumentFunctions , 1, 0) ///< Set wh
CODEGENOPT(InstrumentForProfiling , 1, 0) ///< Set when -pg is enabled.
CODEGENOPT(LessPreciseFPMAD , 1, 0) ///< Enable less precise MAD instructions to
///< be generated.
+CODEGENOPT(LTO , 1, 0) ///< Set when -flto is enabled.
CODEGENOPT(MergeAllConstants , 1, 1) ///< Merge identical constants.
CODEGENOPT(MergeFunctions , 1, 0) ///< Set when -fmerge-functions is enabled.
CODEGENOPT(MSVolatile , 1, 0) ///< Set when /volatile:ms is enabled.
Index: lib/CodeGen/BackendUtil.cpp
===================================================================
--- lib/CodeGen/BackendUtil.cpp (revision 237590)
+++ lib/CodeGen/BackendUtil.cpp (working copy)
@@ -287,6 +287,7 @@ void EmitAssemblyHelper::CreatePasses() {
PMBuilder.DisableUnitAtATime = !CodeGenOpts.UnitAtATime;
PMBuilder.DisableUnrollLoops = !CodeGenOpts.UnrollLoops;
PMBuilder.MergeFunctions = CodeGenOpts.MergeFunctions;
+ PMBuilder.LTO = CodeGenOpts.LTO;
PMBuilder.RerollLoops = CodeGenOpts.RerollLoops;
PMBuilder.addExtension(PassManagerBuilder::EP_EarlyAsPossible,
Index: lib/Driver/Tools.cpp
===================================================================
--- lib/Driver/Tools.cpp (revision 237590)
+++ lib/Driver/Tools.cpp (working copy)
@@ -2676,6 +2676,10 @@ void Clang::ConstructJob(Compilation &C, const Job
assert((isa<CompileJobAction>(JA) || isa<BackendJobAction>(JA)) &&
"Invalid action for clang tool.");
+ if (JA.getType() == types::TY_LTO_IR ||
+ JA.getType() == types::TY_LTO_BC) {
+ CmdArgs.push_back("-flto");
+ }
if (JA.getType() == types::TY_Nothing) {
CmdArgs.push_back("-fsyntax-only");
} else if (JA.getType() == types::TY_LLVM_IR ||
Index: lib/Frontend/CompilerInvocation.cpp
===================================================================
--- lib/Frontend/CompilerInvocation.cpp (revision 237590)
+++ lib/Frontend/CompilerInvocation.cpp (working copy)
@@ -489,6 +489,8 @@ static bool ParseCodeGenArgs(CodeGenOptions &Opts,
Opts.MergeFunctions = Args.hasArg(OPT_fmerge_functions);
+ Opts.LTO = Args.hasArg(OPT_flto);
+
Opts.MSVolatile = Args.hasArg(OPT_fms_volatile);
Opts.VectorizeBB = Args.hasArg(OPT_vectorize_slp_aggressive);
Index: test/CodeGen/available-externally-suppress.c
===================================================================
--- test/CodeGen/available-externally-suppress.c (revision 237590)
+++ test/CodeGen/available-externally-suppress.c (working copy)
@@ -1,6 +1,10 @@
// RUN: %clang_cc1 -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s
+// RUN: %clang_cc1 -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s
+// RUN: %clang_cc1 -flto -O2 -fno-inline -emit-llvm -o - -triple x86_64-apple-darwin10 %s | FileCheck %s -check-prefix=LTO
// Ensure that we don't emit available_externally functions at -O0.
+// Also should not emit them at -O2, unless -flto is present in which case
+// we should preserve them for link-time inlining decisions.
int x;
inline void f0(int y) { x = y; }
@@ -7,6 +11,8 @@ inline void f0(int y) { x = y; }
// CHECK-LABEL: define void @test()
// CHECK: declare void @f0(i32)
+// LTO-LABEL: define void @test()
+// LTO: define available_externally void @f0
void test() {
f0(17);
}
@@ -19,9 +25,13 @@ inline int __attribute__((always_inline)) f1(int x
}
// CHECK: @test1
+// LTO: @test1
int test1(int x) {
// CHECK: br i1
// CHECK-NOT: call {{.*}} @f1
// CHECK: ret i32
+ // LTO: br i1
+ // LTO-NOT: call {{.*}} @f1
+ // LTO: ret i32
return f1(x);
}
More information about the llvm-dev
mailing list