[PATCH] [ELF] LLD does not create a record in the .dynsym if a strong symbol defined in the executable file replaces a weak symbol from a shared library.

Nick Kledzik kledzik at apple.com
Thu Aug 14 12:08:13 PDT 2014


On Aug 14, 2014, at 6:18 AM, Simon Atanasyan <simon at atanasyan.com> wrote:

> On Thu, Aug 14, 2014 at 4:51 AM, kledzik at apple.com <kledzik at apple.com> wrote:
>> From the description of the problem, I thought that libfoo would have a weak definition of "flag" and main had a strong definition of "flag" and the problem was that the binaries were not set up properly so that ld.so would pick main's flag to override libfoo's flag.
>> 
>> But looking at the example code,  "flag" is extern.  So this seems like the other meaning of weak - that the symbol can be missing at runtime.  But the use of "flag" in foo() is not checking if its address is NULL before accessing. But in this case it should not be NULL because ld.so should find it in main.
>> 
>> Which case is this?
> 
> The original problem is related to the _IO_stdin_used symbol. It is
> defined as a weak symbol in the libc.so and as a strong symbol in the
> crt1.o. The code in libc checks its address and if the address is not
> null uses an appropriate variant of IO routines.
> 
> Everything works correctly if the program linked by GNU linker. This
> linker understand that a weak symbol from a shared library is
> coalesced away by a strong symbol defined in an object file and put a
> corresponding record in a dynamic symbol table. If you use LLD it does
> not put such symbol in the dynamic symbol table so at runtime from the
> shared library point of view this symbol remains weak.

Your description here is still ambiguous.  Is flag/_IO_stdin_ defined or undefined in libc.so?

If undefined, then yes, libc.so could check if its address is NULL or not. But the coalescing in the linker would be between a DefinedAtom and an UndefinedAtom - not strong overriding a weak. 

If defined, then libc.so would need to check if the address is of the (weak copy) inside libc.so or is the address of the one in the main program (from crt1.o).  But then the Resolver would be faced with a DefinedAtom and a SharedLibraryAtom, and the SharedLibraryAtom does not have a way to express that the symbol was a weak definition. And we need that info before we can mark the DefinedAtom as having overridden something and therefore needing to be in the dynsym.

> 
>> What info is missing after resolve() is done?
> 
> When resolve() is done we have a "defined" (strong) symbol and forget
> that a weak symbol from a shared library was coalesced away. We
> consider the strong symbols as a common "defined" symbol which does
> not require any records in the dynamic symbol table.


Mach-O also needs support for the defined case.  I think we should enhance the SharedLibraryAtom to have a mergeAsWeak() method to reflect if the symbol is a weak definition in the shared library.

But in either case, we need a way for the Resolver to mark that the DefinedAtom collided with a canBeNullAtRuntime UndefinedAtom or a mergeAsWeak SharedLibraryAtom. There is already a dynamicExport() method on DefinedAtom.  It would be nice to tie into that.  The problem is that the Resolver always works with "const DefinedAtom”, so there are no “set” methods.

Shankar’s idea of adding a new Reference which the Resolver can tweak is way to side step the const problem.  But it is a little clunky in that the Resolver needs to know how to find this magic Reference and change it (which still needs a const_cast), and  dynamicExport() needs to be reworked to search for the magic Reference.

My suggestion is to add a new method to LinkingContext like:

	void validatedCoalesce(const Atom &existingAtom, const Atom &newAtom, bool &useNew);

Which is called at the end of SymbolTable::addByName() to give the platform specific code a chance to either change which atom will be kept, or record side information.  In your case ELFLinkingContext could use this to maintain a set<const DefinedAtom*> which need to be added to the dynsym table.


-Nick









More information about the llvm-commits mailing list