[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
Eric Christopher
echristo at gmail.com
Mon Jun 8 13:08:06 PDT 2015
On Mon, Jun 8, 2015 at 11:46 AM Teresa Johnson <tejohnson at google.com> wrote:
> On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote:
> > The clang patch lgtm, and I had some comments on the LLVM patch. Duncan,
> do
> > you want to say more there?
> >
> > --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590)
> > +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy)
> > @@ -121,6 +121,7 @@ class PassManagerBuilder {
> > bool VerifyInput;
> > bool VerifyOutput;
> > bool MergeFunctions;
> > + bool LTO;
> >
> > While in the context of clang it makes sense that "LTO" means "generate
> an
> > object file appropriate for later LTO", but in the context of LLVM, it
> might
> > be taken to mean "we are doing LTO optimizations now". I think we
> probably
> > want to name the boolean something more specific (PreLTO?
> CompileForLTO?).
> > Later if we customize further we can split the codepath as is done for
> > populateLTOPassManager.
>
> Yeah, I was concerned that the name might be confusing in this
> context, so if we opt to go with a single llvm option to mean -flto I
> like the idea of something like CompileForLTO (or maybe
> PrepareForLTO?).
>
>
I'd rather not have it be an llvm option at all. Just construct a different
set of passes...
This would also solve the problem of needing multiple sets of options to be
passed to the builder. It'd be a bit of a change (i.e. having clang do the
pass setup), but I think it'd be worth it to have clang be able to
explicitly say which passes it wants.
It's also a major change and would probably need to be discussed more
widely so for now I guess this is sorta ok.
> > More than the compile time savings, it actually increases the power of
> > GlobalDCE and can reduce object file size, so I'd mention that instead.
>
> Ok, right in your case that is the benefit. Will update the comment.
>
>
In yours too, just by keeping the amount of IR smaller :)
-eric
> >
> > + MPM.add(createElimAvailExternPass());
> > + }
> >
> > + // Drop initializers of available externally global variables.
> > + for (Module::global_iterator I = M.global_begin(), E = M.global_end();
> > + I != E; ++I) {
> > + if (!I->hasAvailableExternallyLinkage())
> > + continue;
> > + if (I->hasInitializer()) {
> > + Constant *Init = I->getInitializer();
> > + I->setInitializer(nullptr);
> > + if (isSafeToDestroyConstant(Init))
> > + Init->destroyConstant();
> > + }
> > + I->removeDeadConstantUsers();
> > + NumVariables++;
> > + }
> >
> > Do you need to set the linkage of global variables to external? I noticed
> > that Function::deleteBody() does this but setInitializer(nullptr) does
> not.
> > Ditto for aliases.
>
> Hmm, I am not sure - I basically am doing the same things GlobalDCE
> does (aside from erasing them from the function/variable/alias list).
> I guess it didn't matter in the GlobalDCE case since they were being
> completely erased rather than leaving a decl behind. Will make that
> change and retest.
>
> Thanks!
> Teresa
>
> >
> > On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com>
> wrote:
> >>
> >> Talked to Eric Fri and he suggested that this might be the first of
> >> several places where we want behavior of LTO compiles to diverge from
> >> normal -O2 compiles. So for now I have implemented this such that we
> >> pass down -flto to the -cc1 job, and that gets propagated as a code
> >> gen option and into the PassManagerBuilder. I have left the current
> >> logic translating -flto to the -emit-llvm-bc option, since IMO that is
> >> cleaner for controlling the output type. Let me know what you think of
> >> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or
> >> if is it better to record just the individual behaviors we want (i.e.
> >> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to
> >> "ElimAvailExtern" or something like that).
> >>
> >> Patches attached. I modified an existing available external clang test
> >> to check for the new O2 (LTO vs not) behaviors.
> >>
> >> Thanks,
> >> Teresa
> >>
> >> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com>
> >> wrote:
> >> >
> >> >
> >> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com>
> >> > wrote:
> >> >>
> >> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com>
> wrote:
> >> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <
> tejohnson at google.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Agreed. Although I assume you mean invoke the new pass under a
> >> >> >> ThinLTO-only option so that avail extern are not dropped in the
> >> >> >> compile pass before the LTO link?
> >> >> >
> >> >> >
> >> >> > No, this pass would actually be an improvement to the standard -O2
> >> >> > pipeline.
> >> >> > The special case is the regular LTO pass pipeline, which wants to
> run
> >> >> > GlobalDCE but doesn't want to drop available_externally function
> >> >> > definitions
> >> >> > until after linker-stage inlining.
> >> >>
> >> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses
> >> >> this same -O2 optimization pipeline as without LTO. Clang
> communicates
> >> >> the fact that we are doing an LTO compile to llvm via the
> >> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit
> >> >> before codegen. So I would either need to key off of that or setup a
> >> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or
> >> >> is there some other way that I am missing?
> >> >>
> >> >> Incidentally, we'll also want to skip this new pass and keep any
> >> >> referenced avail extern functions in the ThinLTO -c compile step for
> >> >> the same reasons (and there are no imported functions yet at that
> >> >> point).
> >> >>
> >> >
> >> > Ultimately for any planned LTO build we're going to want to do a
> >> > different
> >> > pass pipeline, it's probably worth considering what should be done
> >> > before
> >> > and during LTO.
> >> >
> >> > -eric
> >> >
> >> >>
> >> >> Teresa
> >> >>
> >> >> >
> >> >> > Consider this test case:
> >> >> >
> >> >> > declare void @blah()
> >> >> > define i32 @main() {
> >> >> > call void @foo()
> >> >> > ret i32 0
> >> >> > }
> >> >> > define available_externally void @foo() noinline {
> >> >> > call void @bar()
> >> >> > ret void
> >> >> > }
> >> >> > define linkonce_odr void @bar() noinline {
> >> >> > call void @blah()
> >> >> > ret void
> >> >> > }
> >> >> >
> >> >> > If I run opt -O2 on this and feed it to llc, it actually generates
> >> >> > code
> >> >> > for
> >> >> > bar, even though there are no actual references to bar in the final
> >> >> > code:
> >> >> >
> >> >> > main: # @main
> >> >> > pushq %rax
> >> >> > callq foo
> >> >> > xorl %eax, %eax
> >> >> > popq %rdx
> >> >> > retq
> >> >> >
> >> >> > bar: # @bar
> >> >> > jmp blah # TAILCALL
> >> >> >
> >> >> > This corner case happens to come up organically with dllimported
> >> >> > classes,
> >> >> > which is why I happened to think of it. :) I'm happy with a flag to
> >> >> > globaldce for LTO and the original patch, honestly.
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Teresa Johnson | Software Engineer | tejohnson at google.com |
> >> >> 408-460-2413
> >> >> _______________________________________________
> >> >> LLVM Developers mailing list
> >> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>
> >>
> >>
> >> --
> >> Teresa Johnson | Software Engineer | tejohnson at google.com |
> 408-460-2413
> >
> >
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150608/2c055b9a/attachment.html>
More information about the llvm-dev
mailing list