[llvm-dev] Weak undefined symbols and dynamic libraries

Peter Smith via llvm-dev llvm-dev at lists.llvm.org
Sun Oct 15 23:03:10 PDT 2017


Unless my jet-lagged brain has misunderstood, I think the libc crti.o
module [*] that is included in non-PIC Linux ELF executables has a
platform-specific mechanism for checking whether a weak reference to
the library function "__gmon_start__" has been resolved before calling
it. In essence it checks the GOT entry for the function and not the
address of the function, which as you point out could be the PLT entry
for the function, which will be nonzero.
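
To make the difference between the two checks concrete, here is a toy
Python model (not real linker or loader code). The addresses and the
function names are made up for illustration. The only assumptions taken
from the text are that the PLT entry address is never zero, and that the
GOT entry stays zero when the loader cannot find the weak symbol.

```python
# Toy model: addresses are plain integers, 0 means "not resolved".
PLT_STUB_ADDRESS = 0x401020  # hypothetical PLT entry in the executable


def got_entry_after_loading(symbol_found_at_runtime: bool) -> int:
    """Value assumed to be in the GOT slot once the loader has run."""
    # 0x7F0000001000 is an arbitrary stand-in for a DSO address.
    return 0x7F0000001000 if symbol_found_at_runtime else 0


def check_via_function_address() -> bool:
    """Non-PIC code compares the PLT entry address, which is never 0."""
    return PLT_STUB_ADDRESS != 0


def check_via_got_entry(symbol_found_at_runtime: bool) -> bool:
    """crti-style check: test the GOT slot filled in by the loader."""
    return got_entry_after_loading(symbol_found_at_runtime) != 0
```

In this model the address-based check always says "present", even when
the symbol is missing at run time, while the GOT-based check follows
what the loader actually found.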

Ideally, I think we would want the following for each undefined weak
reference:
- If we are dynamic linking, create a PLT and GOT entry for each
PLTGOT-generating relocation.
- The dynamic symbol of the weak undefined symbol has binding STB_WEAK.
I think the programmer is responsible for writing their weak call so
that it can handle the dynamic loader not being able to find the
symbol, such as the call_weak_fn in crti.S [**].
- If there is no dynamic linking, set the value of the undefined weak
reference to 0, or apply any special case such as those for Arm or
AArch64.
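
The three points above can be sketched as a small decision function.
This is only an illustration in Python, not lld code: the names
WeakUndefinedPlan and plan_weak_undefined are hypothetical, and the Arm
and AArch64 special cases are deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WeakUndefinedPlan:
    create_plt_got: bool           # emit PLT and GOT entries
    dynsym_binding: Optional[str]  # binding in .dynsym, None if absent
    static_value: Optional[int]    # value fixed at link time, if any


def plan_weak_undefined(dynamic_linking: bool,
                        has_pltgot_reloc: bool) -> WeakUndefinedPlan:
    """Plan for one undefined weak reference (hypothetical helper)."""
    if dynamic_linking:
        # Keep the symbol weak in the dynamic symbol table; the caller
        # must cope with the loader not finding it.
        return WeakUndefinedPlan(
            create_plt_got=has_pltgot_reloc,
            dynsym_binding="STB_WEAK",
            static_value=None,
        )
    # No dynamic linking: the reference simply resolves to 0.
    return WeakUndefinedPlan(
        create_plt_got=False,
        dynsym_binding=None,
        static_value=0,
    )
```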

I'm deliberately glossing over implementation problems, such as how we
know there is no dynamic linking at the point where we have to make a
PLT/GOT entry. I'll try to think about this a bit more tomorrow.
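
For comparison, the flag-based proposal quoted further down avoids this
ordering problem, because the output kind is known before any
relocation is scanned. A rough Python sketch of the two quoted rules
follows. The names Decision and decide are hypothetical, "strong" is
written as STB_GLOBAL, and outputs built with -shared or -pie return
None because the quoted rules do not say what happens to them.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    in_dynsym: bool         # is foo added to the dynamic symbol table?
    binding: Optional[str]  # binding used there, None if not added
    value: Optional[int]    # link-time value, None if left to the loader


def decide(shared: bool, pie: bool,
           found_in_dso: bool) -> Optional[Decision]:
    """Apply the two quoted rules to an undefined weak symbol foo."""
    if shared or pie:
        # PIC output: not covered by the two quoted rules.
        return None
    if found_in_dso:
        # Rule 1: export foo as a strong undefined symbol, so a missing
        # foo is reported at load time.
        return Decision(in_dynsym=True, binding="STB_GLOBAL", value=None)
    # Rule 2: do not export foo and fix its value at zero.
    return Decision(in_dynsym=False, binding=None, value=0)
```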

[*] References (__gmon_start__ is the PREINIT_FUNCTION):
AArch64 https://code.woboq.org/userspace/glibc/sysdeps/aarch64/crti.S.html
Arm https://code.woboq.org/userspace/glibc/sysdeps/arm/crti.S.html
X86_64 https://code.woboq.org/userspace/glibc/sysdeps/x86_64/crti.S.html

[**] The ELF spec says "The behavior of weak symbols in areas not
specified by this document is implementation defined. Weak symbols are
intended primarily for use in system software. Applications using weak
symbols are unreliable since changes in the runtime environment might
cause the execution to fail."

Peter

On 15 October 2017 at 14:55, Rui Ueyama via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> On Fri, Oct 13, 2017 at 11:27 PM, Rafael Avila de Espindola
> <rafael.espindola at gmail.com> wrote:
>>
>> Rui Ueyama <ruiu at google.com> writes:
>>
>> > I think the current behavior is bad. I'd like to propose the following
>> > changes:
>> >
>> > 1. If a linker is creating a non-PIC ELF binary, and if it finds a DSO
>> > symbol foo for an undefined weak symbol foo, then it adds foo as a
>> > *strong* undefined symbol to the dynamic symbol table. This prevents
>> > the above crash because the program fails to start if foo is not found
>> > at load-time, instead of crashing at run-time.
>> >
>> > 2. If a linker is creating a non-PIC ELF binary, and if it *cannot* find
>> > a DSO symbol foo for an undefined weak symbol foo, then it *does not*
>> > add foo to the dynamic symbol table, and it sets foo's value to zero.
>>
>> I would not phrase this as pic/non-pic. From the linker point of view
>> there are just relocations. I assume then that the intention is:
>
>
> We have -shared/-pie options, so my intention was to use these flags. We
> could use relocations to decide whether we should export a weak
> undefined symbol or not, but I think there are a few issues with that:
>
> 1. We cannot make a decision until we visit all relocations, but we need a
> decision beforehand in order to create GOT entries or report errors.
>
> 2. Sometimes we could get mixed signals -- for example, if one object file
> contains a direct reference to a weak symbol, and another object file
> contains a GOTPCREL reference to the same symbol, they somewhat conflict.
>
> So, just using -pie/-shared flags is simple, I guess?
>
>> -----------------------------------------------------------------
>> Sometimes a linker has to create a symbol in the main binary so that it
>> is preempted from a shared library at runtime. That symbol is then used
>> with a copy relocation if it is an object or a special plt entry if it
>> is a function.
>>
>> If the symbol in question was a weak undefined:
>>
>> * If the symbol was found in a .so the resulting undefined reference
>>   will be strong.
>> * If the symbol was not found in a .so, it is resolved to 0 and there is
>>   no undefined reference.
>>
>> If no relocation requires the symbol to be preempted to the main
>> executable (all relocations use a GOT, for example) then there will still
>> be a weak undefined reference, since the dynamic linker will be able to
>> handle the symbol existing or not.
>> -----------------------------------------------------------------
>>
>> I agree that that is probably a good change.
>>
>> Cheers,
>> Rafael