[LTO API] add lto_codegen_set_module to help ld64 debug LTO issues

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Feb 20 12:05:55 PST 2015


> On 2015-Feb-20, at 11:41, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
> Well, the impact of the patch is pretty small. So if you and Duncan
> think it is helpful I am OK with it.

This SGTM.

> Please just add an option to
> llvm-lto and a test using it.

+1; more comments below.

> 
> On 20 February 2015 at 13:50, Manman Ren <mren at apple.com> wrote:
>> HI Rafael,
>> 
>> Yes, ld64 will have an option for this, with this option, ld64 will skip the IR passes as well. "llc combined.bc" will help us debug the backend passes, but we sometimes want to directly debug ld64 source codes (things in ld64 that run after backend passes).
>> It is much faster to debug “ld64 combined.bc” without IR passes than to debug “ld64 a.o b.o …”.
>> 
>> Thanks,
>> Manman
>> 
>>> On Feb 19, 2015, at 6:06 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>>> 
>>> How would this be used? Would ld64 get an option for it? How would
>>> 
>>> ld64 -no_optimize_single_ir combined.bc ...
>>> be different from
>>> llc combined.bc -o foo.o -filetype=obj
>>> ld64 foo.o ....
>>> 
>>> Cheers,
>>> Rafael
>>> 
>>> 
>>> On 19 February 2015 at 20:22, Manman Ren <mren at apple.com> wrote:
>>>> Hi Rafael and others,
>>>> 
>>>> When debugging LTO issues with ld64, I sometimes use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file.
>>>> 
>>>> I found out that llvm linking a single bitcode file via lto_codegen_add_module will generate a different bitcode file from the single input.
>>>> This makes debugging hard. With the newly-added lto_codegen_set_module, we can make sure the destination module is the same as the input.
>>>> 
>>>> The patch is attached. Let me know your thoughts,
>>>> 
>>>> Thanks,
>>>> Manman


> diff --git a/include/llvm-c/lto.h b/include/llvm-c/lto.h
> index 1fe0cd5..403887b 100644
> --- a/include/llvm-c/lto.h
> +++ b/include/llvm-c/lto.h
> @@ -40,7 +40,7 @@ typedef bool lto_bool_t;
>   * @{
>   */
>  
> -#define LTO_API_VERSION 12
> +#define LTO_API_VERSION 13
>  
>  /**
>   * \since prior to LTO_API_VERSION=3
> @@ -396,6 +396,14 @@ extern lto_bool_t
>  lto_codegen_add_module(lto_code_gen_t cg, lto_module_t mod);
>  
>  /**
> + * Sets the object module for code generation.
> + *
> + * \since prior to LTO_API_VERSION=13
> + */
> +extern void
> +lto_codegen_set_module(lto_code_gen_t cg, lto_module_t mod);

Does this cause a problem for runtime detection of the availability
of the function?

(Or will ld64 just use the recently minted `lto_api_version()`?)

Also, is this the right name?  I wonder if
`lto_codegen_replace_merged_modules()` would be more clear.

> +
> +/**
>   * Sets if debug info should be generated.
>   * Returns true on error (check lto_get_error_message() for details).
>   *
> diff --git a/include/llvm/LTO/LTOCodeGenerator.h b/include/llvm/LTO/LTOCodeGenerator.h
> index ab307bf..7015149 100644
> --- a/include/llvm/LTO/LTOCodeGenerator.h
> +++ b/include/llvm/LTO/LTOCodeGenerator.h
> @@ -66,6 +66,8 @@ struct LTOCodeGenerator {
>  
>    // Merge given module, return true on success.
>    bool addModule(struct LTOModule *);
> +  // Set the destination module.
> +  void setModule(struct LTOModule *);
>  
>    void setTargetOptions(TargetOptions options);
>    void setDebugInfo(lto_debug_model);
> diff --git a/include/llvm/Linker/Linker.h b/include/llvm/Linker/Linker.h
> index 9c3ecea..31cdc36 100644
> --- a/include/llvm/Linker/Linker.h
> +++ b/include/llvm/Linker/Linker.h
> @@ -69,6 +69,9 @@ public:
>    /// \brief Link \p Src into the composite. The source is destroyed.
>    /// Returns true on error.
>    bool linkInModule(Module *Src);

A newline here might be nice.

> +  /// \breif Set the composite to the passed-in module.

s/breif/brief/

> +  /// Returns true on error.
> +  void setModule(Module *Dst);
>  
>    static bool LinkModules(Module *Dest, Module *Src,
>                            DiagnosticHandlerFunction DiagnosticHandler);
> diff --git a/lib/LTO/LTOCodeGenerator.cpp b/lib/LTO/LTOCodeGenerator.cpp
> index f37d0a9..cfe18b9 100644
> --- a/lib/LTO/LTOCodeGenerator.cpp
> +++ b/lib/LTO/LTOCodeGenerator.cpp
> @@ -141,6 +141,17 @@ bool LTOCodeGenerator::addModule(LTOModule *mod) {
>    return !ret;
>  }
>  
> +void LTOCodeGenerator::setModule(LTOModule *Mod) {
> +  assert(&Mod->getModule().getContext() == &Context &&
> +         "Expected module in same context");

This should be documented in the header.

Is it required that `lto_codegen_add_module()` hasn't been called?
I'm not sure what the right answer is, but if so you should
`assert()` to that effect here (and update the header docs).

> +
> +  IRLinker.setModule(&Mod->getModule());

What are the ownership semantics here?  In particular,
`LTOCodeGenerator::~LTOCodeGenerator()` calls `IRLinker.deleteModule()`.
Is this still valid?

Should outside callers still call `lto_module_dispose()`?

Assuming the answers are "yes" and "no", then this makes sense to me.
But you should document it better in the header ("transfers ownership of
the module ...").

(BTW, does this leak the old merged module?  Should it be deleted?)

> +
> +  const std::vector<const char*> &Undefs = Mod->getAsmUndefinedRefs();
> +  for (int I = 0, E = Undefs.size(); I != E; ++I)
> +    AsmUndefinedRefs[Undefs[I]] = 1;

Should you call `AsmUndefinedRefs.clear()` first?

> +}
> +
>  void LTOCodeGenerator::setTargetOptions(TargetOptions options) {
>    Options = options;
>  }
> diff --git a/lib/Linker/LinkModules.cpp b/lib/Linker/LinkModules.cpp
> index dc002d8..e9d8daf 100644
> --- a/lib/Linker/LinkModules.cpp
> +++ b/lib/Linker/LinkModules.cpp
> @@ -1726,6 +1726,10 @@ bool Linker::linkInModule(Module *Src) {
>    return RetCode;
>  }
>  
> +void Linker::setModule(Module *Dst) {
> +  Composite = Dst;

In `LTOCodeGenerator::setModule()` you had to reset `AsmUndefinedRefs`.
Is there anything that needs to be reset/updated here?

> +}
> +
>  //===----------------------------------------------------------------------===//
>  // LinkModules entrypoint.
>  //===----------------------------------------------------------------------===//
> diff --git a/tools/lto/lto.cpp b/tools/lto/lto.cpp
> index 3a33652..714b8e0 100644
> --- a/tools/lto/lto.cpp
> +++ b/tools/lto/lto.cpp
> @@ -248,6 +248,10 @@ bool lto_codegen_add_module(lto_code_gen_t cg, lto_module_t mod) {
>    return !unwrap(cg)->addModule(unwrap(mod));
>  }
>  
> +void lto_codegen_set_module(lto_code_gen_t cg, lto_module_t mod) {
> +  unwrap(cg)->setModule(unwrap(mod));
> +}
> +
>  bool lto_codegen_set_debug_model(lto_code_gen_t cg, lto_debug_model debug) {
>    unwrap(cg)->setDebugInfo(debug);
>    return false;
> diff --git a/tools/lto/lto.exports b/tools/lto/lto.exports
> index 9ccf88a..9de2d19 100644
> --- a/tools/lto/lto.exports
> +++ b/tools/lto/lto.exports
> @@ -25,6 +25,7 @@ lto_module_dispose
>  lto_api_version
>  lto_codegen_set_diagnostic_handler
>  lto_codegen_add_module
> +lto_codegen_set_module
>  lto_codegen_add_must_preserve_symbol
>  lto_codegen_compile
>  lto_codegen_create
> 





More information about the llvm-commits mailing list