[PATCH] [ELF] LLD does not create a record in the .dynsym if a strong symbol defined in the executable file replaces a weak symbol from a shared library.

Fri Aug 15 10:56:54 PDT 2014

On Thu, Aug 14, 2014 at 11:08 PM, Nick Kledzik <kledzik at apple.com> wrote:
>
> On Aug 14, 2014, at 6:18 AM, Simon Atanasyan <simon at atanasyan.com> wrote:
>> The original problem is related to the _IO_stdin_used symbol. It is
>> defined as a weak symbol in the libc.so and as a strong symbol in the
>> crt1.o. The code in libc checks its address and if the address is not
>> null uses an appropriate variant of IO routines.
>>
>> Everything works correctly if the program linked by GNU linker. This
>> linker understand that a weak symbol from a shared library is
>> coalesced away by a strong symbol defined in an object file and put a
>> corresponding record in a dynamic symbol table. If you use LLD it does
>> not put such symbol in the dynamic symbol table so at runtime from the
>> shared library point of view this symbol remains weak.
>
> Your description here is still ambiguous. Is flag/_IO_stdin_ defined or
> undefined in libc.so?
>
> If undefined, then yes, libc.so could check if its address is NULL or not.
> But the coalescing in the linker would be between a DefinedAtom and
> an UndefinedAtom - not strong overriding a weak.

The _IO_stdin_ is undefined in the libc "NOTYPE WEAK DEFAULT UND
_IO_stdin_used".
And libc checks its address but it is always null and libc selects the
wrong way.

>>> What info is missing after resolve() is done?
>>
>> When resolve() is done we have a "defined" (strong) symbol and forget
>> that a weak symbol from a shared library was coalesced away. We
>> consider the strong symbols as a common "defined" symbol which does
>> not require any records in the dynamic symbol table.
>
> Mach-O also needs support for the defined case. I think we should enhance
> the SharedLibraryAtom to have a mergeAsWeak() method to reflect if the symbol
> is a weak definition in the shared library.
>
> But in either case, we need a way for the Resolver to mark that the DefinedAtom
> collided with a canBeNullAtRuntime UndefinedAtom or a mergeAsWeak SharedLibraryAtom.
> There is already a dynamicExport() method on DefinedAtom. It would be nice to tie
> into that. The problem is that the Resolver always works with "const DefinedAtom”,
> so there are no “set” methods.

That was my original idea but I bumped into the "const" problem too.

> Shankar’s idea of adding a new Reference which the Resolver can tweak is way to side
> step the const problem.  But it is a little clunky in that the Resolver needs to know
> how to find this magic Reference and change it (which still needs a const_cast), and
> dynamicExport() needs to be reworked to search for the magic Reference.
>
> My suggestion is to add a new method to LinkingContext like:
>
>   void validatedCoalesce(const Atom &existingAtom, const Atom &newAtom, bool &useNew);
>
> Which is called at the end of SymbolTable::addByName() to give the platform specific
> code a chance to either change which atom will be kept, or record side information. In your case
> ELFLinkingContext could use this to maintain a set<const DefinedAtom*> which need
> to be added to the dynsym table.

I like this idea but I am not sure that this approach can be used with
RoundTripYAMLPass
and RoundTripNativePass. As far as I understand references (suggested
by Shankar) correctly
survive YAML/Native conversions. But if we somehow store pointers to atoms in
the validatedCoalesce() method these pointers will not be valid after
the RoundTripNativePass finished.

-- 
Simon Atanasyan