[LTO API] add lto_codegen_set_module to help ld64 debug LTO issues

Mon Feb 23 15:10:28 PST 2015

Updated patch is attached, it should address the review comments.

Cheers,
Manman

> On Feb 22, 2015, at 8:53 PM, Manman Ren <mren at apple.com> wrote:
> 
>> 
>> On Feb 20, 2015, at 12:05 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote:
>> 
>>> 
>>> On 2015-Feb-20, at 11:41, Rafael Espíndola <rafael.espindola at gmail.com <mailto:rafael.espindola at gmail.com>> wrote:
>>> 
>>> Well, the impact of the patch is pretty small. So if you and Duncan
>>> think it is helpful I am OK with it.
>> 
>> This SGTM.
>> 
>>> Please just add an option to
>>> llvm-lto and a test using it.
> Will do.
>> 
>> +1; more comments below.
>> 
>>> 
>>> On 20 February 2015 at 13:50, Manman Ren <mren at apple.com <mailto:mren at apple.com>> wrote:
>>>> HI Rafael,
>>>> 
>>>> Yes, ld64 will have an option for this, with this option, ld64 will skip the IR passes as well. "llc combined.bc" will help us debug the backend passes, but we sometimes want to directly debug ld64 source codes (things in ld64 that run after backend passes).
>>>> It is much faster to debug “ld64 combined.bc” without IR passes than to debug “ld64 a.o b.o …”.
>>>> 
>>>> Thanks,
>>>> Manman
>>>> 
>>>>> On Feb 19, 2015, at 6:06 PM, Rafael Espíndola <rafael.espindola at gmail.com <mailto:rafael.espindola at gmail.com>> wrote:
>>>>> 
>>>>> How would this be used? Would ld64 get an option for it? How would
>>>>> 
>>>>> ld64 -no_optimize_single_ir combined.bc ...
>>>>> be different from
>>>>> llc combined.bc -o foo.o -filetype=obj
>>>>> ld64 foo.o ....
>>>>> 
>>>>> Cheers,
>>>>> Rafael
>>>>> 
>>>>> 
>>>>> On 19 February 2015 at 20:22, Manman Ren <mren at apple.com <mailto:mren at apple.com>> wrote:
>>>>>> Hi Rafael and others,
>>>>>> 
>>>>>> When debugging LTO issues with ld64, I sometimes use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file.
>>>>>> 
>>>>>> I found out that llvm linking a single bitcode file via lto_codegen_add_module will generate a different bitcode file from the single input.
>>>>>> This makes debugging hard. With the newly-added lto_codegen_set_module, we can make sure the destination module is the same as the input.
>>>>>> 
>>>>>> The patch is attached. Let me know your thoughts,
>>>>>> 
>>>>>> Thanks,
>>>>>> Manman
>> 
>> 
>>> diff --git a/include/llvm-c/lto.h b/include/llvm-c/lto.h
>>> index 1fe0cd5..403887b 100644
>>> --- a/include/llvm-c/lto.h
>>> +++ b/include/llvm-c/lto.h
>>> @@ -40,7 +40,7 @@ typedef bool lto_bool_t;
>>>  * @{
>>>  */
>>> 
>>> -#define LTO_API_VERSION 12
>>> +#define LTO_API_VERSION 13
>>> 
>>> /**
>>>  * \since prior to LTO_API_VERSION=3
>>> @@ -396,6 +396,14 @@ extern lto_bool_t
>>> lto_codegen_add_module(lto_code_gen_t cg, lto_module_t mod);
>>> 
>>> /**
>>> + * Sets the object module for code generation.
>>> + *
>>> + * \since prior to LTO_API_VERSION=13
>>> + */
>>> +extern void
>>> +lto_codegen_set_module(lto_code_gen_t cg, lto_module_t mod);
>> 
>> Does this cause a problem for runtime detection of the availability
>> of the function?
>> 
>> (Or will ld64 just use the recently minted `lto_api_version()`?)
> 
> That is my plan, using lto_api_version().
> 
>> 
>> Also, is this the right name?  I wonder if
>> `lto_codegen_replace_merged_modules()` would be more clear.
> 
> My initial usage for lto_codegen_set_module is to set the initial and the final merged module. But I am not sure if we should relax that.
> Actually we already created a merged module when creating LTOCodeGenerator. So we should dispose the old merged module.
> 
>> 
>>> +
>>> +/**
>>>  * Sets if debug info should be generated.
>>>  * Returns true on error (check lto_get_error_message() for details).
>>>  *
>>> diff --git a/include/llvm/LTO/LTOCodeGenerator.h b/include/llvm/LTO/LTOCodeGenerator.h
>>> index ab307bf..7015149 100644
>>> --- a/include/llvm/LTO/LTOCodeGenerator.h
>>> +++ b/include/llvm/LTO/LTOCodeGenerator.h
>>> @@ -66,6 +66,8 @@ struct LTOCodeGenerator {
>>> 
>>>   // Merge given module, return true on success.
>>>   bool addModule(struct LTOModule *);
>>> +  // Set the destination module.
>>> +  void setModule(struct LTOModule *);
>>> 
>>>   void setTargetOptions(TargetOptions options);
>>>   void setDebugInfo(lto_debug_model);
>>> diff --git a/include/llvm/Linker/Linker.h b/include/llvm/Linker/Linker.h
>>> index 9c3ecea..31cdc36 100644
>>> --- a/include/llvm/Linker/Linker.h
>>> +++ b/include/llvm/Linker/Linker.h
>>> @@ -69,6 +69,9 @@ public:
>>>   /// \brief Link \p Src into the composite. The source is destroyed.
>>>   /// Returns true on error.
>>>   bool linkInModule(Module *Src);
>> 
>> A newline here might be nice.
> Will do.
>> 
>>> +  /// \breif Set the composite to the passed-in module.
>> 
>> s/breif/brief/
> Will do.
>> 
>>> +  /// Returns true on error.
>>> +  void setModule(Module *Dst);
>>> 
>>>   static bool LinkModules(Module *Dest, Module *Src,
>>>                           DiagnosticHandlerFunction DiagnosticHandler);
>>> diff --git a/lib/LTO/LTOCodeGenerator.cpp b/lib/LTO/LTOCodeGenerator.cpp
>>> index f37d0a9..cfe18b9 100644
>>> --- a/lib/LTO/LTOCodeGenerator.cpp
>>> +++ b/lib/LTO/LTOCodeGenerator.cpp
>>> @@ -141,6 +141,17 @@ bool LTOCodeGenerator::addModule(LTOModule *mod) {
>>>   return !ret;
>>> }
>>> 
>>> +void LTOCodeGenerator::setModule(LTOModule *Mod) {
>>> +  assert(&Mod->getModule().getContext() == &Context &&
>>> +         "Expected module in same context");
>> 
>> This should be documented in the header.
>> 
>> Is it required that `lto_codegen_add_module()` hasn't been called?
>> I'm not sure what the right answer is, but if so you should
>> `assert()` to that effect here (and update the header docs).
> 
> My initial usage of lto_codegen_set_module is to call it before any lto_codegen_add_module has been called.
> But I am not sure if we should relax that.
> 
>> 
>>> +
>>> +  IRLinker.setModule(&Mod->getModule());
>> 
>> What are the ownership semantics here?  In particular,
>> `LTOCodeGenerator::~LTOCodeGenerator()` calls `IRLinker.deleteModule()`.
>> Is this still valid?
> 
> lto_codegen_set_module will transfer the ownership to the linker.
> 
>> 
>> Should outside callers still call `lto_module_dispose()`?
> Then outside callers will not call lto_module_dispose.
> 
>> 
>> Assuming the answers are "yes" and "no", then this makes sense to me.
>> But you should document it better in the header ("transfers ownership of
>> the module ...”).
> Yes, I should update the header docs!
>> 
>> (BTW, does this leak the old merged module?  Should it be deleted?)
> Yes, we should dispose the old merged module.
> 
>> 
>>> +
>>> +  const std::vector<const char*> &Undefs = Mod->getAsmUndefinedRefs();
>>> +  for (int I = 0, E = Undefs.size(); I != E; ++I)
>>> +    AsmUndefinedRefs[Undefs[I]] = 1;
>> 
>> Should you call `AsmUndefinedRefs.clear()` first?
> Yes if we have called add_module before set_module.
>> 
>>> +}
>>> +
>>> void LTOCodeGenerator::setTargetOptions(TargetOptions options) {
>>>   Options = options;
>>> }
>>> diff --git a/lib/Linker/LinkModules.cpp b/lib/Linker/LinkModules.cpp
>>> index dc002d8..e9d8daf 100644
>>> --- a/lib/Linker/LinkModules.cpp
>>> +++ b/lib/Linker/LinkModules.cpp
>>> @@ -1726,6 +1726,10 @@ bool Linker::linkInModule(Module *Src) {
>>>   return RetCode;
>>> }
>>> 
>>> +void Linker::setModule(Module *Dst) {
>>> +  Composite = Dst;
>> 
>> In `LTOCodeGenerator::setModule()` you had to reset `AsmUndefinedRefs`.
>> Is there anything that needs to be reset/updated here?
> If we are setting the final merged module with lto_codegen_set_module, then we don’t need to do anything else.
> But if we relax that, we should call part of “Linker::init()”
>   TypeFinder StructTypes;
>   StructTypes.run(*M, true);
>   for (StructType *Ty : StructTypes) {
>     if (Ty->isOpaque())
>       IdentifiedStructTypes.addOpaque(Ty);
>     else
>       IdentifiedStructTypes.addNonOpaque(Ty);
>   }
> 
> Thanks for reviewing!
> Manman
> 
>> 
>>> +}
>>> +
>>> //===----------------------------------------------------------------------===//
>>> // LinkModules entrypoint.
>>> //===----------------------------------------------------------------------===//
>>> diff --git a/tools/lto/lto.cpp b/tools/lto/lto.cpp
>>> index 3a33652..714b8e0 100644
>>> --- a/tools/lto/lto.cpp
>>> +++ b/tools/lto/lto.cpp
>>> @@ -248,6 +248,10 @@ bool lto_codegen_add_module(lto_code_gen_t cg, lto_module_t mod) {
>>>   return !unwrap(cg)->addModule(unwrap(mod));
>>> }
>>> 
>>> +void lto_codegen_set_module(lto_code_gen_t cg, lto_module_t mod) {
>>> +  unwrap(cg)->setModule(unwrap(mod));
>>> +}
>>> +
>>> bool lto_codegen_set_debug_model(lto_code_gen_t cg, lto_debug_model debug) {
>>>   unwrap(cg)->setDebugInfo(debug);
>>>   return false;
>>> diff --git a/tools/lto/lto.exports b/tools/lto/lto.exports
>>> index 9ccf88a..9de2d19 100644
>>> --- a/tools/lto/lto.exports
>>> +++ b/tools/lto/lto.exports
>>> @@ -25,6 +25,7 @@ lto_module_dispose
>>> lto_api_version
>>> lto_codegen_set_diagnostic_handler
>>> lto_codegen_add_module
>>> +lto_codegen_set_module
>>> lto_codegen_add_must_preserve_symbol
>>> lto_codegen_compile
>>> lto_codegen_create

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150223/50d82281/attachment.html>