[llvm] r185882 - Don't run internalize if we're outputing bit-code and not an object file.

Tue Jul 9 00:11:03 PDT 2013

On Jul 8, 2013, at 5:42 PM, Reid Kleckner <rnk at google.com> wrote:

> On Mon, Jul 8, 2013 at 8:10 PM, Bill Wendling <isanbard at gmail.com> wrote:
> On Jul 8, 2013, at 4:54 PM, Chris Lattner <clattner at apple.com> wrote:
> 
> > On Jul 8, 2013, at 4:23 PM, Bill Wendling <isanbard at gmail.com> wrote:
> >
> >> Author: void
> >> Date: Mon Jul  8 18:23:03 2013
> >> New Revision: 185882
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=185882&view=rev
> >> Log:
> >> Don't run internalize if we're outputing bit-code and not an object file.
> >>
> >> The problem with running internalize before we're ready to output an object file
> >> is that it may change a 'weak' symbol into an internal one, but that symbol
> >> could be needed by an external object file --- e.g. with arclite.
> >
> > I don't understand: how does output format affect symbol visibility?
> >
> If it changes a symbol from 'weak' to 'non-weak', then that symbol may be optimized away if the compiler thinks that it's no longer used. But if it's running 'ld -r', it doesn't know if the symbol can be optimized away because it doesn't have all of the source code available for it. Given this, it doesn't really make sense to run internalize before outputting a .bc file. That code hasn't gone through code generation yet, so the linker doesn't really know which of its symbols can be changed from 'weak' to 'non-weak'.
> 
> ld -r seems really, really different from normal LTO. Is there some other flag to detect that?

I'm not sure what you mean by another flag. The LTO library shouldn't know whether the linker is using '-r' or some other flag. If we step back a bit from the output format, this change makes the call conform to its documentation:

/// lto_codegen_write_merged_modules - Writes a new file at the specified path
/// that contains the merged contents of all modules added so far. Returns true
/// on error (check lto_get_error_message() for details).
bool lto_codegen_write_merged_modules(lto_code_gen_t cg, const char *path) {
  return cg->writeMergedModules(path, sLastErrorString);
}

I would argue that running 'internalize' before writing the merged modules is against what the documentation is saying here (or at least against what I would expect given this comment).

> Are there any other passes that LTO uses that assume it has all the code in the final image? 

Not in the same way that the internalize pass assumes that knowledge. The internalize pass is, of course, changing the visibility and sometimes the linkage of objects. Once that's done, then the other passes can rely upon the visibility and linkage to do the "correct thing". In other words, I don't think that we need to restrict other passes because of the 'ld -r' usage...
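To make the weak-to-internal problem concrete, here is a rough toy model of it. This is only a sketch and not LLVM code: the Symbol struct, the internalize() and removeDeadInternals() functions, and the "helper" symbol are all hypothetical names made up for illustration. It assumes a pass that internalizes everything outside a must-preserve set, followed by a cleanup that drops unreferenced internal symbols.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model only: none of these types or functions are LLVM APIs.
enum class Linkage { External, Weak, Internal };

struct Symbol {
  std::string name;
  Linkage linkage;
  bool usedInModule; // true if something inside the merged module references it
};

using Module = std::vector<Symbol>;

// Mark every symbol that is not in `mustPreserve` as internal.
void internalize(Module &m, const std::set<std::string> &mustPreserve) {
  for (Symbol &s : m)
    if (mustPreserve.count(s.name) == 0)
      s.linkage = Linkage::Internal;
}

// Drop internal symbols that nothing in the module references.
void removeDeadInternals(Module &m) {
  m.erase(std::remove_if(m.begin(), m.end(),
                         [](const Symbol &s) {
                           return s.linkage == Linkage::Internal &&
                                  !s.usedInModule;
                         }),
          m.end());
}

bool hasSymbol(const Module &m, const std::string &name) {
  return std::any_of(m.begin(), m.end(),
                     [&](const Symbol &s) { return s.name == name; });
}

// "helper" is a hypothetical weak symbol that only an external object file
// (one the merged module cannot see) refers to.
Module makeExample() {
  return {{"main", Linkage::External, false},
          {"helper", Linkage::Weak, false}};
}

// Returns whether "helper" is still present in the module we would write out.
bool weakHelperSurvives(bool runInternalize) {
  Module m = makeExample();
  assert(hasSymbol(m, "helper")); // present before any pass runs
  if (runInternalize)
    internalize(m, {"main"}); // only "main" is known to be needed
  removeDeadInternals(m);
  return hasSymbol(m, "helper");
}
```

In this model weakHelperSurvives(false) keeps "helper" in the merged module, while weakHelperSurvives(true) loses it, because once the symbol is internal the cleanup has no way of knowing about the external user.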

Does that answer your question? :-)

-bw
