[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO

Rafael Espíndola via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 8 15:39:31 PST 2016


No, it was not implemented, I just added a note:

    // The remaining (unimplemented) problem is making sure pointer
equality
    // still works. We need the help of the dynamic linker for that.
We
    // let it know that we have a direct reference to a so symbol by
creating
    // an undefined symbol with a non zero st_value. Seeing that, the
    // dynamic linker resolves the symbol to the value of the symbol
we created.
    // This is true even for got entries, so pointer equality is
maintained.
    // To avoid an infinite loop, the only entry that points to the
    // real function is a dedicated got entry used by the plt. That is
    // identified by special relocation types (R_X86_64_JUMP_SLOT,
    // R_386_JMP_SLOT, etc).

Taking a look.



On 8 February 2016 at 18:14, Rafael Espíndola
<rafael.espindola at gmail.com> wrote:
> Yes, I had just reduced it too.
>
> I am pretty sure this was implemented at some point. Taking a look.
>
> Cheers,
> Rafael
>
>
> On 8 February 2016 at 17:52, Sean Silva <chisophugis at gmail.com> wrote:
>> Sounds like it is related to this:
>>
>> http://www.airs.com/blog/archives/42
>> """
>>
>> The fact that C permits taking the address of a function introduces an
>> interesting wrinkle. In C you are permitted to take the address of a
>> function, and you are permitted to compare that address to another function
>> address. The problem is that if you take the address of a function in a
>> shared library, the natural result would be to get the address of the PLT
>> entry. After all, that is address to which a call to the function will jump.
>> However, each shared library has its own PLT, and thus the address of a
>> particular function would differ in each shared library. That means that
>> comparisons of function pointers generated in different shraed libraries may
>> be different when they should be the same. This is not a purely hypothetical
>> problem; when I did a port which got it wrong, before I fixed the bug I saw
>> failures in the Tcl shared library when it compared function pointers.
>>
>> The fix for this bug on most processors is a special marking for a symbol
>> which has a PLT entry but is not defined. Typically the symbol will be
>> marked as undefined, but with a non-zero value–the value will be set to the
>> address of the PLT entry. When the dynamic linker is searching for the value
>> of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such
>> a specially marked symbol, it will use the non-zero value. This will ensure
>> that all references to the symbol which are not function calls will use the
>> same value. To make this work, the compiler and assembler must make sure
>> that any reference to a function which does not involve calling it will not
>> carry a standard PLT reloc. This special handling of function addresses
>> needs to be implemented in both the program linker and the dynamic linker.
>>
>> """
>>
>>
>> Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables
>> from gold and lld, I see:
>>
>>
>> --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800
>> +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800
>>
>>    Symbol {
>> -    Name: set_data@ (142)
>> -    Value: 0x400560
>> -    Size: 0
>> +    Name: set_data@ (46)
>> +    Value: 0x0
>> +    Size: 10
>>      Binding: Global (0x1)
>>      Type: Function (0x2)
>>      Other: 0
>>      Section: Undefined (0x0)
>>    }
>>
>> You can also see this in LD_DEBUG=all when running the executables (to avoid
>> extraneous diffs, both executables are called "./a.out.lld"; look at the
>> diff header to know which is output from the gold executable vs lld
>> executable):
>>
>> --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800
>> +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800
>>        relocation processing: ./libdump.so (lazy)
>>       symbol=set_data;  lookup in file=./a.out.lld [0]
>> -     binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol
>> `set_data'
>> +     symbol=set_data;  lookup in file=./libdump.so [0]
>> +     binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol
>> `set_data'
>>
>> For gold, the symbol is bound to the one in a.out (the PLT entry), while for
>> lld it is bound to the one in libdump.so.
>>
>> -- Sean Silva
>>
>> On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hi,
>>>
>>> It looks like I have found a bug in LLD. Suppose DSO defines a global
>>> variable 'data' and initializes it by the address of function
>>> 'set_data' defined in the same DSO. If an executable file (linked by
>>> LLD) gets address of the '&set_data' function and compares it with a
>>> value stored in the 'data' variable it gets different result. If the
>>> executable is linked by BFD or Gold linker it gets the same result.
>>>
>>> Right now I do not have a time to investigate this problem further. I
>>> will plan to do that later. But maybe the reason of this problem is
>>> obvious to somebody?
>>>
>>> The reproduction script:
>>>
>>> % cat so.c
>>> void set_data(void *v) {}
>>> void *data = &set_data;
>>>
>>> % cat main.c
>>> int printf(const char *, ...);
>>>
>>> extern void *data;
>>> void set_data(void *v);
>>>
>>> int main(void)
>>> {
>>>   printf("%p = %p\n", &set_data, data);
>>> }
>>>
>>> % clang -fPIC -shared so.c -o libdump.so
>>> % clang -c main.c
>>>
>>> % clang main.o -Wl,-rpath -Wl,. -L. -ldump
>>> % ./a.out
>>> 0x400600 = 0x400600  # The same addresses
>>>
>>> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \
>>>     -m elf_x86_64 --hash-style=both \
>>>     -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
>>>     /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
>>>     /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \
>>>     -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \
>>>     -L/usr/lib/x86_64-linux-gnu \
>>>     -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \
>>>     -L/usr/lib/x86_64-linux-gnu -L/usr/lib \
>>>     main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \
>>>     -lc -lgcc --as-needed -lgcc_s --no-as-needed \
>>>     /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \
>>>     /usr/lib/x86_64-linux-gnu/crtn.o
>>> % ./a.out
>>> 0x11250 = 0x7f02915bd6b0 # garbage in 'data'
>>>
>>> --
>>> Simon Atanasyan
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>


More information about the llvm-dev mailing list