[llvm-dev] [LLD] Incorrect comparison of pointers to function defined in DSO

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 8 14:52:15 PST 2016


Sounds like it is related to this:

http://www.airs.com/blog/archives/42
"""

The fact that C permits taking the address of a function introduces an
interesting wrinkle. In C you are permitted to take the address of a
function, and you are permitted to compare that address to another function
address. The problem is that if you take the address of a function in a
shared library, the natural result would be to get the address of the PLT
entry. After all, that is the address to which a call to the function will
jump. However, each shared library has its own PLT, and thus the address of
a particular function would differ in each shared library. That means that
comparisons of function pointers generated in different shared libraries
may be different when they should be the same. This is not a purely
hypothetical problem; when I did a port which got it wrong, before I fixed
the bug I saw failures in the Tcl shared library when it compared function
pointers.

The fix for this bug on most processors is a special marking for a symbol
which has a PLT entry but is not defined. Typically the symbol will be
marked as undefined, but with a non-zero value: the value will be set to the
address of the PLT entry. When the dynamic linker is searching for the
value of a symbol to use for a reloc other than a JMP_SLOT reloc, if it
finds such a specially marked symbol, it will use the non-zero value. This
will ensure that all references to the symbol which are not function calls
will use the same value. To make this work, the compiler and assembler must
make sure that any reference to a function which does not involve calling
it will not carry a standard PLT reloc. This special handling of function
addresses needs to be implemented in both the program linker and the
dynamic linker.
"""
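
To make the lookup rule in the quoted passage concrete, here is a minimal
sketch in C. It is only a toy model, not the code of any real dynamic linker:
the names model_sym, model_obj and model_resolve are hypothetical, and the
search order (executable first, then the DSO) is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a dynamic symbol table entry. */
struct model_sym {
    const char *name;
    bool defined;   /* false models "Section: Undefined" */
    uint64_t value; /* models the symbol's Value field */
};

/* Hypothetical stand-in for a loaded object (executable or DSO). */
struct model_obj {
    const char *path;
    const struct model_sym *syms;
    size_t nsyms;
};

/*
 * Resolve `name` by scanning objects in search order.
 * for_jump_slot == true models a JMP_SLOT reloc (a call through the PLT),
 * which must skip undefined symbols and find the real definition.
 * Any other reloc accepts an undefined symbol with a non-zero value,
 * i.e. the specially marked PLT entry described in the quoted text.
 * Returns 0 if nothing matches.
 */
uint64_t model_resolve(const struct model_obj *objs, size_t nobjs,
                       const char *name, bool for_jump_slot)
{
    assert(name != NULL);
    for (size_t i = 0; i < nobjs; i++) {
        for (size_t j = 0; j < objs[i].nsyms; j++) {
            const struct model_sym *s = &objs[i].syms[j];
            if (strcmp(s->name, name) != 0)
                continue;
            if (s->defined)
                return s->value;          /* ordinary definition */
            if (!for_jump_slot && s->value != 0)
                return s->value;          /* specially marked PLT entry */
        }
    }
    return 0;
}
```

In this model, an executable entry for set_data that is undefined but has a
non-zero value wins every address-taking lookup, so the executable and the
DSO agree on one address. If the executable's entry has value 0, the lookup
falls through to the definition in the DSO.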


Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables
from gold and lld, I see:


--- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800
+++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800

   Symbol {
-    Name: set_data@ (142)
-    Value: 0x400560
-    Size: 0
+    Name: set_data@ (46)
+    Value: 0x0
+    Size: 10
     Binding: Global (0x1)
     Type: Function (0x2)
     Other: 0
     Section: Undefined (0x0)
   }
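
The difference in the diff above boils down to one predicate on the symbol
table entry. As a rough sketch (assuming a Linux system that provides
<elf.h>; the helper name is_canonical_plt_symbol is made up for this
example), the check could look like this:

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if `sym` looks like the special marking discussed above:
 * a function symbol that is undefined in this file (SHN_UNDEF) but still
 * carries a non-zero value, which would be the address of its PLT entry.
 * Hypothetical helper, for illustration only.
 */
bool is_canonical_plt_symbol(const Elf64_Sym *sym)
{
    assert(sym != NULL);
    return sym->st_shndx == SHN_UNDEF &&
           ELF64_ST_TYPE(sym->st_info) == STT_FUNC &&
           sym->st_value != 0;
}
```

Applied to the two entries in the diff, the gold entry (Value: 0x400560)
would satisfy this predicate and the lld entry (Value: 0x0) would not.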

You can also see this in LD_DEBUG=all when running the executables (to
avoid extraneous diffs, both executables are called "./a.out.lld"; look at
the diff header to know which is output from the gold executable vs lld
executable):

--- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800
+++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800
       relocation processing: ./libdump.so (lazy)
      symbol=set_data;  lookup in file=./a.out.lld [0]
-     binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol `set_data'
+     symbol=set_data;  lookup in file=./libdump.so [0]
+     binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol `set_data'

For gold, the symbol is bound to the one in a.out (the PLT entry), while
for lld it is bound to the one in libdump.so.

-- Sean Silva

On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> It looks like I have found a bug in LLD. Suppose a DSO defines a global
> variable 'data' and initializes it with the address of the function
> 'set_data' defined in the same DSO. If an executable (linked by LLD)
> takes the address '&set_data' and compares it with the value stored in
> 'data', the two values differ. If the executable is linked by the BFD
> or gold linker, the two values are equal.
>
> Right now I do not have time to investigate this problem further; I
> plan to do that later. But maybe the cause of this problem is obvious
> to somebody?
>
> The reproduction script:
>
> % cat so.c
> void set_data(void *v) {}
> void *data = &set_data;
>
> % cat main.c
> int printf(const char *, ...);
>
> extern void *data;
> void set_data(void *v);
>
> int main(void)
> {
>   printf("%p = %p\n", &set_data, data);
> }
>
> % clang -fPIC -shared so.c -o libdump.so
> % clang -c main.c
>
> % clang main.o -Wl,-rpath -Wl,. -L. -ldump
> % ./a.out
> 0x400600 = 0x400600  # The same addresses
>
> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \
>     -m elf_x86_64 --hash-style=both \
>     -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
>     /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
>     /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \
>     -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \
>     -L/usr/lib/x86_64-linux-gnu \
>     -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \
>     -L/usr/lib/x86_64-linux-gnu -L/usr/lib \
>     main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \
>     -lc -lgcc --as-needed -lgcc_s --no-as-needed \
>     /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \
>     /usr/lib/x86_64-linux-gnu/crtn.o
> % ./a.out
> 0x11250 = 0x7f02915bd6b0 # garbage in 'data'
>
> --
> Simon Atanasyan