[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)
Teresa Johnson
tejohnson at google.com
Mon Jun 8 11:46:48 PDT 2015
On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote:
> The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, do
> you want to say more there?
>
> --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590)
> +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy)
> @@ -121,6 +121,7 @@ class PassManagerBuilder {
> bool VerifyInput;
> bool VerifyOutput;
> bool MergeFunctions;
> + bool LTO;
>
> While in the context of clang it makes sense that "LTO" means "generate an
> object file appropriate for later LTO", but in the context of LLVM, it might
> be taken to mean "we are doing LTO optimizations now". I think we probably
> want to name the boolean something more specific (PreLTO? CompileForLTO?).
> Later if we customize further we can split the codepath as is done for
> populateLTOPassManager.
Yeah, I was concerned that the name might be confusing in this
context, so if we opt to go with a single llvm option to mean -flto I
like the idea of something like CompileForLTO (or maybe
PrepareForLTO?).
>
> + if (!LTO) {
> + // Remove avail extern fns and globals definitions if we aren't
> + // doing an -flto compilation. For LTO we will preserve these
>
> LLVM doesn't have an -flto flag, so I'd probably keep this generic to
> "compiling an object file for later LTO".
Ok
>
> + // so they are eligible for inlining at link-time. Note if they
> + // are unreferenced they will be removed by GlobalDCE below, so
> + // this only impacts referenced available externally globals.
> + // Eventually they will be suppressed during codegen, but
> eliminating
> + // here is a compile-time savings.
>
> More than the compile time savings, it actually increases the power of
> GlobalDCE and can reduce object file size, so I'd mention that instead.
Ok, right in your case that is the benefit. Will update the comment.
>
> + MPM.add(createElimAvailExternPass());
> + }
>
> + // Drop initializers of available externally global variables.
> + for (Module::global_iterator I = M.global_begin(), E = M.global_end();
> + I != E; ++I) {
> + if (!I->hasAvailableExternallyLinkage())
> + continue;
> + if (I->hasInitializer()) {
> + Constant *Init = I->getInitializer();
> + I->setInitializer(nullptr);
> + if (isSafeToDestroyConstant(Init))
> + Init->destroyConstant();
> + }
> + I->removeDeadConstantUsers();
> + NumVariables++;
> + }
>
> Do you need to set the linkage of global variables to external? I noticed
> that Function::deleteBody() does this but setInitializer(nullptr) does not.
> Ditto for aliases.
Hmm, I am not sure - I basically am doing the same things GlobalDCE
does (aside from erasing them from the function/variable/alias list).
I guess it didn't matter in the GlobalDCE case since they were being
completely erased rather than leaving a decl behind. Will make that
change and retest.
Thanks!
Teresa
>
> On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>
>> Talked to Eric Fri and he suggested that this might be the first of
>> several places where we want behavior of LTO compiles to diverge from
>> normal -O2 compiles. So for now I have implemented this such that we
>> pass down -flto to the -cc1 job, and that gets propagated as a code
>> gen option and into the PassManagerBuilder. I have left the current
>> logic translating -flto to the -emit-llvm-bc option, since IMO that is
>> cleaner for controlling the output type. Let me know what you think of
>> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or
>> if is it better to record just the individual behaviors we want (i.e.
>> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to
>> "ElimAvailExtern" or something like that).
>>
>> Patches attached. I modified an existing available external clang test
>> to check for the new O2 (LTO vs not) behaviors.
>>
>> Thanks,
>> Teresa
>>
>> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com>
>> wrote:
>> >
>> >
>> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com>
>> > wrote:
>> >>
>> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote:
>> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com>
>> >> > wrote:
>> >> >>
>> >> >> Agreed. Although I assume you mean invoke the new pass under a
>> >> >> ThinLTO-only option so that avail extern are not dropped in the
>> >> >> compile pass before the LTO link?
>> >> >
>> >> >
>> >> > No, this pass would actually be an improvement to the standard -O2
>> >> > pipeline.
>> >> > The special case is the regular LTO pass pipeline, which wants to run
>> >> > GlobalDCE but doesn't want to drop available_externally function
>> >> > definitions
>> >> > until after linker-stage inlining.
>> >>
>> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses
>> >> this same -O2 optimization pipeline as without LTO. Clang communicates
>> >> the fact that we are doing an LTO compile to llvm via the
>> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit
>> >> before codegen. So I would either need to key off of that or setup a
>> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or
>> >> is there some other way that I am missing?
>> >>
>> >> Incidentally, we'll also want to skip this new pass and keep any
>> >> referenced avail extern functions in the ThinLTO -c compile step for
>> >> the same reasons (and there are no imported functions yet at that
>> >> point).
>> >>
>> >
>> > Ultimately for any planned LTO build we're going to want to do a
>> > different
>> > pass pipeline, it's probably worth considering what should be done
>> > before
>> > and during LTO.
>> >
>> > -eric
>> >
>> >>
>> >> Teresa
>> >>
>> >> >
>> >> > Consider this test case:
>> >> >
>> >> > declare void @blah()
>> >> > define i32 @main() {
>> >> > call void @foo()
>> >> > ret i32 0
>> >> > }
>> >> > define available_externally void @foo() noinline {
>> >> > call void @bar()
>> >> > ret void
>> >> > }
>> >> > define linkonce_odr void @bar() noinline {
>> >> > call void @blah()
>> >> > ret void
>> >> > }
>> >> >
>> >> > If I run opt -O2 on this and feed it to llc, it actually generates
>> >> > code
>> >> > for
>> >> > bar, even though there are no actual references to bar in the final
>> >> > code:
>> >> >
>> >> > main: # @main
>> >> > pushq %rax
>> >> > callq foo
>> >> > xorl %eax, %eax
>> >> > popq %rdx
>> >> > retq
>> >> >
>> >> > bar: # @bar
>> >> > jmp blah # TAILCALL
>> >> >
>> >> > This corner case happens to come up organically with dllimported
>> >> > classes,
>> >> > which is why I happened to think of it. :) I'm happy with a flag to
>> >> > globaldce for LTO and the original patch, honestly.
>> >>
>> >>
>> >>
>> >> --
>> >> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> >> 408-460-2413
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>
>
--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
More information about the llvm-dev
mailing list