[llvm-dev] Potential bug in LLD when wrapping symbols
Martin Ramsdale via llvm-dev
llvm-dev at lists.llvm.org
Mon May 17 03:53:18 PDT 2021
Hi
There appear to be some bugs in LLD's handling of the '--wrap' option that
result in unexpected "undefined symbols" in the output.
I've broken this report into { 1. Issues; 2. Reproduction; 3. Reproduction
Tests; 4. Side Observations }.
Please can somebody take a look at {1,2,3} to confirm whether these are genuine
issues?
Many Thanks,
Martin
1) Issues:
==========
1.1) Using '--wrap x' results in undefined references to 'x' even when
'__real_x' isn't used in
the source code.
- i.e. The symbol table implies there is a dependency where one doesn't
exist
- Present in at least LLD-9, LLD-11
1.2) Using '--wrap x' where 'x' is in
"llvm-project/compiler-rt/lib/dfsan/done_abilist.txt"*
results in undefined references even when 'x' is never used
- i.e. The symbol table implies there is a dependency where one doesn't
exist, and exposes
internal details of the linker implementation
- *There is a strong correlation with the symbols on this list, but this has
NOT been proven to be the root cause(!)
- Present in at least LLD-9, LLD-11
Whilst investigating the above I also came across the following issue that
appears to be resolved
in LLD-11, but am recording here in case it is relevant, or helps anybody
else who encounters the
issue:
1.3) Using '--wrap x' results in undefined references to '__real_x'
- i.e. The symbol table implies there is a dependency where one doesn't
exist
- Present in at least LLD-9, and resolved via
https://reviews.llvm.org/D34993
2) Reproduction
===============
2.1) Source setup
-----------------
Source file, foo.c, which has function calls to:
a) bar_fn_not_wrapped
- A regular function, no wrapping
- Expected undefined symbols: bar_fn_not_wrapped
- Unexpected undefined symbols: __wrap_bar_fn_not_wrapped, __real_bar_fn_not_wrapped
b) bar_fn_wrapped
- In the tests we’ll wrap this call
- Expected undefined symbols: __wrap_bar_fn_wrapped
- Unexpected undefined symbols: bar_fn_wrapped, __real_bar_fn_wrapped
c) gettimeofday
- In the tests we’ll wrap this call
- Expected undefined symbols: __wrap_gettimeofday
- Unexpected undefined symbols: gettimeofday, __real_gettimeofday
Also note foo.c does NOT call the following functions:
d) sigaction
- In the tests we’ll wrap this call
- Expected undefined symbols: <none>
- Unexpected undefined symbols: sigaction, __wrap_sigaction, __real_sigaction
e) bar_fn_other
- In the tests we’ll wrap this call
- Expected undefined symbols: <none>
- Unexpected undefined symbols: bar_fn_other, __wrap_bar_fn_other, __real_bar_fn_other
This is all summarized in the table below, and we’ll use this to compare
against in the tests:
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Expect undefined symbol to ... ?  |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
| gettimeofday       | Y               | Y                 | N         | Y         | N         |
| sigaction          | N               | Y                 | N         | N         | N         |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
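The rule behind the table above can be written down compactly. The sketch below is only an illustration of the expectations in this report (which match the GNU ld results in section 3); the struct and function names are hypothetical and do not come from any linker.

```c
#include <assert.h>
#include <stdbool.h>

/* Which undefined symbols this report expects for one symbol x.
 * Hypothetical helper type, used for illustration only. */
struct expected_undefs {
    bool x;       /* undefined reference to x        */
    bool wrap_x;  /* undefined reference to __wrap_x */
    bool real_x;  /* undefined reference to __real_x */
};

/* used:    foo.c references x
 * wrapped: the link line passes --wrap=x */
struct expected_undefs
expect_undefs(bool used, bool wrapped)
{
    struct expected_undefs e;

    e.x = used && !wrapped;      /* an unwrapped reference stays as x       */
    e.wrap_x = used && wrapped;  /* a wrapped reference becomes __wrap_x    */
    e.real_x = false;            /* foo.c never references __real_x         */
    return e;
}
```

For example, expect_undefs(true, true) describes the bar_fn_wrapped and gettimeofday rows, and expect_undefs(false, true) describes the sigaction and bar_fn_other rows, where nothing should be undefined.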
2.2) foo.c source code
----------------------
/* --- foo.c start --- */
#include <sys/time.h>
void bar_fn_not_wrapped(void);
void bar_fn_wrapped(void);
void
foo_fn (void)
{
    struct timeval tv;
    struct timezone tz;
    bar_fn_wrapped();
    bar_fn_not_wrapped();
    (void)gettimeofday(&tv, &tz);
}
/* --- foo.c end --- */
2.3) Test setup
---------------
For each compiler/linker combination below we’ll run the following command:
$ <compiler> [ -fuse-ld=<optional-linker-choice> ] -fPIC -shared foo.c \
    -Wl,--wrap=sigaction -Wl,--wrap=gettimeofday \
    -Wl,--wrap=bar_fn_wrapped -Wl,--wrap=bar_fn_other -o libfoo.so
And search for the interesting symbols using:
$ nm -D libfoo.so --undefined-only | grep -E "(sig|get|bar)" | tr -s ' ' | sed 's/^/ /'
The compiler/linkers used are:
a) gcc 4.7.0, gnu-ld
b) clang-9, gnu-ld
c) clang-9, llvm-lld-9
d) clang-11, llvm-lld-11
3) Reproduction Tests:
======================
NB: Bad results are highlighted with *Y* or *N*
a) gcc 4.7.0, gnu-ld
U bar_fn_not_wrapped
U __wrap_bar_fn_wrapped
U __wrap_gettimeofday
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
| gettimeofday       | Y               | Y                 | N         | Y         | N         |
| sigaction          | N               | Y                 | N         | N         | N         |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
b) clang-9, gnu-ld
U bar_fn_not_wrapped
U __wrap_bar_fn_wrapped
U __wrap_gettimeofday
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
| gettimeofday       | Y               | Y                 | N         | Y         | N         |
| sigaction          | N               | Y                 | N         | N         | N         |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
c) clang-9, llvm-lld-9
U bar_fn_not_wrapped
U bar_fn_wrapped
U gettimeofday
U __real_bar_fn_wrapped
U __real_gettimeofday
w __real_sigaction
w sigaction
U __wrap_bar_fn_wrapped
U __wrap_gettimeofday
U __wrap_sigaction
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | *Y*       | Y         | *Y*       |
| gettimeofday       | Y               | Y                 | *Y*       | Y         | *Y*       |
| sigaction          | N               | Y                 | *Y*       | *Y*       | *Y*       |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
d) clang-11, llvm-lld-11
U __wrap_bar_fn_wrapped
U __wrap_gettimeofday
U __wrap_sigaction
U bar_fn_not_wrapped
U bar_fn_wrapped
U gettimeofday
w sigaction
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | *Y*       | Y         | N         |
| gettimeofday       | Y               | Y                 | *Y*       | Y         | N         |
| sigaction          | N               | Y                 | *Y*       | *Y*       | N         |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
4) Side observations:
=====================
A few observations made whilst investigating the main issues. It's likely
that these won't be
critical to the main report, but are left here in case it aids discussion
on this topic:
a) I can’t find much online regarding this behaviour. One interesting
reference, although it doesn’t explain the above, is at
http://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities:
- """
Semantics of --wrap:
GNU ld and LLD have slightly different --wrap semantics. I use
"slightly" because in most
use cases users will not observe a difference.
In GNU ld, --wrap only applies to undefined symbols. In LLD, --wrap
happens after all other
symbol resolution steps. The implementation is to mangle the symbol
table of each object
file (foo -> __wrap_foo; __real_foo -> foo) so that all relocations
to foo or __real_foo
will be redirected.
The LLD semantics have the advantage that non-LTO, LTO and
relocatable link behaviors are
consistent. I filed
https://sourceware.org/bugzilla/show_bug.cgi?id=26358 for GNU ld.
"""
- Looking at the corresponding bug
https://sourceware.org/bugzilla/show_bug.cgi?id=26358, the
suggestion is that LLD is more consistent, but when I’ve tried the
steps for their partial
linking example it demonstrates different behaviour.
b) I wonder if the run-time behaviour with these unexpected undefined
symbols is
impacted? e.g. what happens with -z,now? (Not investigated)
c) I wonder if the build-link-time behaviour is impacted. e.g. what
happens when linking with a
library dependency that has one of these missing symbol dependencies?
(Not investigated)
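The renaming described in the quote under a) (foo -> __wrap_foo; __real_foo -> foo) can be modelled with a small C helper. This is only a toy sketch of the rule as quoted, for a single --wrap option, and is not LLD's implementation; the function name wrap_rename and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model of the quoted renaming for one --wrap=<wrapped> option:
 *   <wrapped>        -> __wrap_<wrapped>
 *   __real_<wrapped> -> <wrapped>
 * Any other name is copied unchanged. Returns 0 on success, or -1 if the
 * result does not fit in out. Hypothetical helper, not LLD code. */
int
wrap_rename(const char *sym, const char *wrapped, char *out, size_t out_size)
{
    static const char real_prefix[] = "__real_";
    const size_t real_len = sizeof(real_prefix) - 1;
    int n;

    assert(sym != NULL && wrapped != NULL && out != NULL);

    if (strcmp(sym, wrapped) == 0)
        n = snprintf(out, out_size, "__wrap_%s", wrapped);
    else if (strncmp(sym, real_prefix, real_len) == 0 &&
             strcmp(sym + real_len, wrapped) == 0)
        n = snprintf(out, out_size, "%s", wrapped);
    else
        n = snprintf(out, out_size, "%s", sym);

    /* snprintf reports the length it wanted to write; treat truncation
     * as failure. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

If this renaming were the whole story, the gettimeofday reference in foo.c would become __wrap_gettimeofday and no reference to gettimeofday or __real_gettimeofday would remain, which is what the expectations table in section 2.1 assumes.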