[llvm] r241721 - Start adding support for writing archives in BSD format.

Kevin Enderby enderby at apple.com
Fri Jul 10 11:24:41 PDT 2015


> On Jul 10, 2015, at 8:30 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
>> The thinking some 20 years ago was that we wanted to be sure a “make clean” and a “make all” would generally work in this case.  Cases where ar(1) was used to replace individual .o files in a fat file, possibly with a new .o file that did not have the same number of architectures, were deemed too hard to define semantics for.
>> 
>> So we moved to the darwin libtool(1), which takes any .o files, archives, and fat files and builds the proper static library from scratch every time.  That is, there is no ar(1) functionality, such as replacing an archive member, after the library is built.
> 
> On current build systems it seems common to build the archive from
> scratch. That is what llvm-ar is optimized to do, so when it is time
> to add support for universal archives, llvm-ar should probably take
> care of building a fat file with N complete archives in it.

I agree.  I know of no system that still tries to build .o files and update archives in place without rebuilding the whole archive.  But long ago (like 30+ years) that was common, and even make(1) supported it directly, compiling a .o file and replacing it in the archive.  I think it was to save disk space by not keeping a copy of the .o file both on disk and in the archive.

> A quick benchmark comparing llvm-ar vs. ar+ranlib building
> libclangSema.a on OS X on a laptop:
> 
> llvm-ar: 0.02 seconds
> ar: 0.07 seconds
> ranlib: 0.18 seconds.
> 
>>> If so I guess someday we might see a fat thin archive :-)
>> 
>> Yes, with the definition of a “thin archive” as only containing a symbol table, aka table of contents, where each architecture slice contains a different “thin archive”.  Maybe the term “stub archive” could be used so as not to overload the term “thin archive” in this case?
> 
> Could be. Probably my favourite would be universal thin archive.
> 
>>> Another question, why is "__.SYMDEF SORTED" not supported when multiple object files define the same symbol?
>> 
>> Because of the old UNIX ld(1) semantics of searching each archive for symbols, when it is encountered on the link line, in the order of its table of contents.  If there are no duplicate symbols in the objects in an archive then the order of the table of contents does not matter, and we can sort the symbols.  If there are, to get the exact old semantics one must produce the table of contents in archive member order.  These semantics go back to the version 7 UNIX and BSD 4.2 days when I started writing linkers.  Today lots of these static archive semantics don’t apply in the world of dynamic libraries.
> 
> I see. I was then going to ask if newer versions of ld were able to
> maintain the old semantics if given a "__.SYMDEF SORTED". That seems
> unlikely since the original order is lost and there is no way to figure
> out which member should be included.

Yes, it is not possible to maintain the old semantics if given a "__.SYMDEF SORTED" table of contents when there are multiple archive members that define the same symbol and you need to have the original order.  That is why the darwin ranlib(1) would fall back to the old "__.SYMDEF" table of contents when it found duplicates.
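
To make the fallback concrete, here is a rough Python sketch of the idea, not the actual ranlib(1) or ld(1) code.  The names Member, build_toc and resolve are made up for illustration, and the tie-break used when sorting duplicate symbols is an assumption of this sketch.

```python
from collections import namedtuple

# Hypothetical model of an archive member: a file name plus the
# symbols it defines, in the order they appear in the object.
Member = namedtuple("Member", ["name", "defined_symbols"])


def build_toc(members):
    """Return (toc_name, entries) where entries are (symbol, member_name).

    Sort the table of contents only when no symbol is defined by more
    than one member; otherwise keep archive member order.
    """
    entries = [(sym, m.name) for m in members for sym in m.defined_symbols]
    symbols = [sym for sym, _ in entries]
    if len(set(symbols)) == len(symbols):
        return "__.SYMDEF SORTED", sorted(entries)
    return "__.SYMDEF", entries


def resolve(entries, symbol):
    """Old-style lookup: the first table-of-contents entry for a symbol wins."""
    for sym, member_name in entries:
        if sym == symbol:
            return member_name
    return None


# Two members both define _dup; b.o comes first in the archive.
members = [
    Member("b.o", ["_zeta", "_dup"]),
    Member("a.o", ["_dup", "_alpha"]),
]
toc_name, entries = build_toc(members)
```

With these two members build_toc falls back to "__.SYMDEF", and resolve(entries, "_dup") picks b.o, the first member in archive order.  Calling resolve(sorted(entries), "_dup") picks a.o instead, which is the loss of the original order discussed above.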

All of this is extremely old behavior and even use of static libraries is old behavior as we have moved on to dynamic libraries and different semantics.  I don’t think anyone would care if the old behavior was dropped in the llvm tools.

> 
> I decided to benchmark it a bit. I did two builds of clang. The only
> difference is that I used llvm-ar in one and ar+ranlib on the other.
> Of the 154 libraries, 74 have "__.SYMDEF SORTED" when using ar+ranlib.
> 
> Linking clang with ld64 (ld64-242.2) took 1.1865 seconds with ar+ranlib
> created libraries and 1.199 seconds with llvm-ar created libraries.
> That seems to be in the noise. Is there a case where "__.SYMDEF
> SORTED" is still known to make a difference?

I don’t think so.  Way back when, it did matter, but machines and disks have gotten so much faster that I doubt any difference can be measured.  So again, if you feel things in the llvm world can be simplified by dropping the "__.SYMDEF SORTED" table of contents, I don’t think anyone would have a problem with it.

> 
> Cheers,
> Rafael





More information about the llvm-commits mailing list