[PATCH] D12205: LTO: Simplify merged module ownership.

Gao, Yunzhong via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 31 15:27:22 PDT 2015


Hi Peter,

> The only other linker I am aware of that uses libLTO is Sony's. Yunzhong,
> would this change affect you significantly?

I looked at diffs version#32721, version#32998 and the final committed version#33015 (r245891), and I
do not think these changes will cause problems for our linker. The diffs versions refer to the IDs given
at the Phabricator website, e.g., http://reviews.llvm.org/D12205?id=32721.
Cheers,
- Gao


> -----Original Message-----
> From: Peter Collingbourne [mailto:peter at pcc.me.uk]
> Sent: Thursday, August 20, 2015 10:04 PM
> To: Duncan P. N. Exon Smith; Gao, Yunzhong
> Cc: reviews+D12205+public+5a8f0eb335c69738 at reviews.llvm.org;
> mehdi.amini at apple.com; llvm-commits at lists.llvm.org
> Subject: Re: [PATCH] D12205: LTO: Simplify merged module ownership.
> 
> On Thu, Aug 20, 2015 at 05:25:32PM -0700, Duncan P. N. Exon Smith wrote:
> > > --- tools/llvm-lto/llvm-lto.cpp
> > > +++ tools/llvm-lto/llvm-lto.cpp
> > > @@ -207,26 +207,24 @@
> > >        return 1;
> > >      }
> > >
> > > -    LTOModule *LTOMod = Module.get();
> > > -
> > > -    // We use the first input module as the destination module when
> > > -    // SetMergedModule is true.
> > > -    if (SetMergedModule && i == BaseArg) {
> > > -      // Transfer ownership to the code generator.
> > > -      CodeGen.setModule(Module.release());
> > > -    } else if (!CodeGen.addModule(Module.get()))
> > > -      return 1;
> > > -
> > > -    unsigned NumSyms = LTOMod->getSymbolCount();
> > > +    unsigned NumSyms = Module->getSymbolCount();
> > >      for (unsigned I = 0; I < NumSyms; ++I) {
> > > -      StringRef Name = LTOMod->getSymbolName(I);
> > > +      StringRef Name = Module->getSymbolName(I);
> > >        if (!DSOSymbolsSet.count(Name))
> > >          continue;
> > > -      lto_symbol_attributes Attrs = LTOMod->getSymbolAttributes(I);
> > > +      lto_symbol_attributes Attrs = Module->getSymbolAttributes(I);
> > >        unsigned Scope = Attrs & LTO_SYMBOL_SCOPE_MASK;
> > >        if (Scope != LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN)
> > >          KeptDSOSyms.push_back(Name);
> > >      }
> > > +
> > > +    // We use the first input module as the destination module when
> > > +    // SetMergedModule is true.
> > > +    if (SetMergedModule && i == BaseArg) {
> > > +      // Transfer ownership to the code generator.
> > > +      CodeGen.setModule(std::move(Module));
> > > +    } else if (!CodeGen.addModule(Module.get()))
> > > +      return 1;
> >
> > Why have you moved this code?
> 
> This was needed because of the ownership change -- see below.
> 
> > >  LTOCodeGenerator::~LTOCodeGenerator() {
> > > -  destroyMergedModule();
> > > -
> >
> > Could the ordering between tearing down the `LTOModule` and the
> > `TargetMach` matter?
> 
> From a quick scan I couldn't see anything inside any derived class of
> TargetMachine that preserves a reference to anything inside the Module, so
> I don't think it should matter.
> 
> >  I suggest moving `TargetMach` to a
> > `std::unique_ptr<>` first in a separate commit (looks like it would be
> > easy?), and placing it such that the ordering won't change.
> 
> r245670. (Note that this wouldn't change the order anyway, as the members
> would be destroyed after the destructor is called.)
> 
> > Seems like you could through the CodegenOptions into a unique_ptr as
> > well (using a custom deleter that calls `free()`).  Might be nice to
> > clean that up as well (also in a separate commit).
> 
> Or I could use std::vector<std::string> and build the char * vector on demand,
> which seems like the simplest approach. r245671.
> 
> > > +
> > > +  // Finally destroy the module as we no longer need it
> > > + OwnedModule.reset();
> >
> > Yes we do!  API users can call `lto_codegen_write_merged_modules()`
> > after this.
> 
> So they can. Hrm.
> 
> So the problem I have is that in order to implement parallel LTO code
> generation efficiently, I would like all code generation in this class (and the
> gold plugin) to go through a function whose signature is something like
> 
> void codegen(std::unique_ptr<Module> M, std::vector<ostream> outputs, x
> foo, y bar, z baz, ...);
> 
> so that the codegen function is free to mangle M arbitrarily then destroy it.
> (This would be useful if, say, the work is partitioned among multiple
> modules; M could be reused by stripping it of all but the contents of a single
> partition.) This would prevent clients from calling
> lto_codegen_write_merged_modules() and getting what they want even if
> we did change the function to not take ownership of the module.
> 
> I suppose that I could make this function's signature look like this:
> 
> std::unique_ptr<Module> codegen(std::unique_ptr<Module> M,
> std::vector<ostream> outputs, x foo, y bar, z baz, ...);
> 
> where the result is equal to M if the parallelism level is 1, or
> std::unique_ptr<Module>() otherwise. The code in LTOCodeGenerator
> would then look like this:
> 
> OwnedModule = codegen(std::move(OwnedModule), ...);
> 
> This would mean that clients would need to change to doing something like
> this if they want to save IR before parallel CG:
> 
> lto_codegen_set_module();
> lto_codegen_add_module();
> lto_codegen_add_module();
> lto_codegen_optimize();
> lto_codegen_write_merged_modules();
> lto_codegen_compile_optimized();
> 
> but the semantics for non-parallel CG are preserved.
> 
> >
> > >
> > >    return true;
> > >  }
> > > Index: include/llvm/Linker/Linker.h
> > >
> ==========================================================
> =========
> > > --- include/llvm/Linker/Linker.h
> > > +++ include/llvm/Linker/Linker.h
> > > @@ -62,7 +62,6 @@
> > >
> > >    Linker(Module *M, DiagnosticHandlerFunction DiagnosticHandler);
> > >    Linker(Module *M);
> > > -  ~Linker();
> >
> > This seems unrelated?
> 
> r245672.
> 
> > > --- include/llvm-c/lto.h
> > > +++ include/llvm-c/lto.h
> > > @@ -40,7 +40,7 @@
> > >   * @{
> > >   */
> > >
> > > -#define LTO_API_VERSION 17
> > > +#define LTO_API_VERSION 18
> >
> > There's no new API, so the version shouldn't be changing.
> >
> > >
> > >  /**
> > >   * \since prior to LTO_API_VERSION=3 @@ -374,8 +374,7 @@
> > > lto_codegen_add_module(lto_code_gen_t cg, lto_module_t mod);
> > >
> > >  /**
> > > - * Sets the object module for code generation. This will transfer
> > > the ownship of
> > > - * the module to code generator.
> > > + * Sets the object module for code generation. This will destroy the
> module.
> >
> > Was this poorly documented, or is your patch intended to change the
> > semantics of this call?  The latter would be a problem.
> 
> The latter. How important is it that we preserve these semantics? It looks like
> this function was introduced by Manman in r230290, so presumably this is
> being used by some version of ld64, but I cannot see any uses of this function
> in ld64-242.
> 
> How difficult would it be to change ld64 to not rely on the module staying
> around? There would be no real ownership changes as such, so I think all you
> would need to do is make sure that the linker does not touch the module
> after calling lto_codegen_set_module.
> 
> The only other linker I am aware of that uses libLTO is Sony's. Yunzhong,
> would this change affect you significantly?
> 
> There are other things we can do if we absolutely have to preseve these
> semantics, but it would add complexity to LTOCodeGenerator.
> 
> Thanks,
> --
> Peter



More information about the llvm-commits mailing list