[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)

Teresa Johnson tejohnson at google.com
Mon Jun 8 11:46:48 PDT 2015


On Mon, Jun 8, 2015 at 11:33 AM, Reid Kleckner <rnk at google.com> wrote:
> The clang patch lgtm, and I had some comments on the LLVM patch. Duncan, do
> you want to say more there?
>
> --- include/llvm/Transforms/IPO/PassManagerBuilder.h (revision 237590)
> +++ include/llvm/Transforms/IPO/PassManagerBuilder.h (working copy)
> @@ -121,6 +121,7 @@ class PassManagerBuilder {
>    bool VerifyInput;
>    bool VerifyOutput;
>    bool MergeFunctions;
> +  bool LTO;
>
> While in the context of clang it makes sense that "LTO" means "generate an
> object file appropriate for later LTO", but in the context of LLVM, it might
> be taken to mean "we are doing LTO optimizations now". I think we probably
> want to name the boolean something more specific (PreLTO? CompileForLTO?).
> Later if we customize further we can split the codepath as is done for
> populateLTOPassManager.

Yeah, I was concerned that the name might be confusing in this
context, so if we opt to go with a single llvm option to mean -flto I
like the idea of something like CompileForLTO (or maybe
PrepareForLTO?).

>
> +      if (!LTO) {
> +        // Remove avail extern fns and globals definitions if we aren't
> +        // doing an -flto compilation. For LTO we will preserve these
>
> LLVM doesn't have an -flto flag, so I'd probably keep this generic to
> "compiling an object file for later LTO".

Ok

>
> +        // so they are eligible for inlining at link-time. Note if they
> +        // are unreferenced they will be removed by GlobalDCE below, so
> +        // this only impacts referenced available externally globals.
> +        // Eventually they will be suppressed during codegen, but
> eliminating
> +        // here is a compile-time savings.
>
> More than the compile time savings, it actually increases the power of
> GlobalDCE and can reduce object file size, so I'd mention that instead.

Ok, right in your case that is the benefit. Will update the comment.

>
> +        MPM.add(createElimAvailExternPass());
> +      }
>
> +  // Drop initializers of available externally global variables.
> +  for (Module::global_iterator I = M.global_begin(), E = M.global_end();
> +       I != E; ++I) {
> +    if (!I->hasAvailableExternallyLinkage())
> +      continue;
> +    if (I->hasInitializer()) {
> +      Constant *Init = I->getInitializer();
> +      I->setInitializer(nullptr);
> +      if (isSafeToDestroyConstant(Init))
> +        Init->destroyConstant();
> +    }
> +    I->removeDeadConstantUsers();
> +    NumVariables++;
> +  }
>
> Do you need to set the linkage of global variables to external? I noticed
> that Function::deleteBody() does this but setInitializer(nullptr) does not.
> Ditto for aliases.

Hmm, I am not sure - I basically am doing the same things GlobalDCE
does (aside from erasing them from the function/variable/alias list).
I guess it didn't matter in the GlobalDCE case since they were being
completely erased rather than leaving a decl behind. Will make that
change and retest.

Thanks!
Teresa

>
> On Mon, Jun 8, 2015 at 9:03 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>
>> Talked to Eric Fri and he suggested that this might be the first of
>> several places where we want behavior of LTO compiles to diverge from
>> normal -O2 compiles. So for now I have implemented this such that we
>> pass down -flto to the -cc1 job, and that gets propagated as a code
>> gen option and into the PassManagerBuilder. I have left the current
>> logic translating -flto to the -emit-llvm-bc option, since IMO that is
>> cleaner for controlling the output type. Let me know what you think of
>> saving the overall LTO setting in CodeGenOpts/PassManagerBuilder, or
>> if is it better to record just the individual behaviors we want (i.e.
>> change the new field in CodeGenOpts/PassManagerBuilder from "LTO" to
>> "ElimAvailExtern" or something like that).
>>
>> Patches attached. I modified an existing available external clang test
>> to check for the new O2 (LTO vs not) behaviors.
>>
>> Thanks,
>> Teresa
>>
>> On Thu, Jun 4, 2015 at 9:20 PM, Eric Christopher <echristo at gmail.com>
>> wrote:
>> >
>> >
>> > On Thu, Jun 4, 2015 at 8:17 PM Teresa Johnson <tejohnson at google.com>
>> > wrote:
>> >>
>> >> On Thu, Jun 4, 2015 at 5:33 PM, Reid Kleckner <rnk at google.com> wrote:
>> >> > On Thu, Jun 4, 2015 at 5:17 PM, Teresa Johnson <tejohnson at google.com>
>> >> > wrote:
>> >> >>
>> >> >> Agreed. Although I assume you mean invoke the new pass under a
>> >> >> ThinLTO-only option so that avail extern are not dropped in the
>> >> >> compile pass before the LTO link?
>> >> >
>> >> >
>> >> > No, this pass would actually be an improvement to the standard -O2
>> >> > pipeline.
>> >> > The special case is the regular LTO pass pipeline, which wants to run
>> >> > GlobalDCE but doesn't want to drop available_externally function
>> >> > definitions
>> >> > until after linker-stage inlining.
>> >>
>> >> Ok. It looks to me like the LTO compile step with -O2 -flto -c uses
>> >> this same -O2 optimization pipeline as without LTO. Clang communicates
>> >> the fact that we are doing an LTO compile to llvm via the
>> >> -emit-llvm-bc flag, which just tells it to emit bitcode and exit
>> >> before codegen. So I would either need to key off of that or setup a
>> >> new flag to indicate to llvm that we are doing an LTO -c compile. Or
>> >> is there some other way that I am missing?
>> >>
>> >> Incidentally, we'll also want to skip this new pass and keep any
>> >> referenced avail extern functions in the ThinLTO -c compile step for
>> >> the same reasons (and there are no imported functions yet at that
>> >> point).
>> >>
>> >
>> > Ultimately for any planned LTO build we're going to want to do a
>> > different
>> > pass pipeline, it's probably worth considering what should be done
>> > before
>> > and during LTO.
>> >
>> > -eric
>> >
>> >>
>> >> Teresa
>> >>
>> >> >
>> >> > Consider this test case:
>> >> >
>> >> > declare void @blah()
>> >> > define i32 @main() {
>> >> >   call void @foo()
>> >> >   ret i32 0
>> >> > }
>> >> > define available_externally void @foo() noinline {
>> >> >   call void @bar()
>> >> >   ret void
>> >> > }
>> >> > define linkonce_odr void @bar() noinline {
>> >> >   call void @blah()
>> >> >   ret void
>> >> > }
>> >> >
>> >> > If I run opt -O2 on this and feed it to llc, it actually generates
>> >> > code
>> >> > for
>> >> > bar, even though there are no actual references to bar in the final
>> >> > code:
>> >> >
>> >> > main:                                   # @main
>> >> >         pushq   %rax
>> >> >         callq   foo
>> >> >         xorl    %eax, %eax
>> >> >         popq    %rdx
>> >> >         retq
>> >> >
>> >> > bar:                                    # @bar
>> >> >         jmp     blah                    # TAILCALL
>> >> >
>> >> > This corner case happens to come up organically with dllimported
>> >> > classes,
>> >> > which is why I happened to think of it. :) I'm happy with a flag to
>> >> > globaldce for LTO and the original patch, honestly.
>> >>
>> >>
>> >>
>> >> --
>> >> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> >> 408-460-2413
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>
>



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413



More information about the llvm-dev mailing list