[llvm-dev] Potential bug in LLD when wrapping symbols

Martin Ramsdale via llvm-dev llvm-dev at lists.llvm.org
Mon May 17 04:49:47 PDT 2021


[ Self-reply to improve formatting due to line-length wrapping ]

Hi

There appear to be some bugs in LLD's handling of the '--wrap' option that
result in unexpected "undefined symbols" in the output.

I've broken this report into { 1. Issues; 2. Reproduction; 3. Reproduction
Tests; 4. Side Observations }.
Could somebody please take a look at {1,2,3} to confirm whether these are
genuine issues?

Many Thanks,
Martin

1) Issues:
==========
  1.1) Using '--wrap x' results in undefined references to 'x' even when
       '__real_x' isn't used in the source code.
    - i.e. The symbol table implies there is a dependency where one doesn't
      exist
    - Present in at least LLD-9, LLD-11
  1.2) Using '--wrap x' where 'x' is in
       "llvm-project/compiler-rt/lib/dfsan/done_abilist.txt"*
       results in undefined references even when 'x' is never used
    - i.e. The symbol table implies there is a dependency where one doesn't
      exist, and exposes internal details of the linker implementation
    - *There is a strong correlation with the symbols on this list, but this
      has NOT been proven as the root cause(!)
    - Present in at least LLD-9, LLD-11

Whilst investigating the above I also came across the following issue, which
appears to be resolved in LLD-11. I am recording it here in case it is
relevant, or helps anybody else who encounters it:
  1.3) Using '--wrap x' results in undefined references to '__real_x'
    - i.e. The symbol table implies there is a dependency where one doesn't
      exist
    - Present in at least LLD-9, and resolved via
      https://reviews.llvm.org/D34993

2) Reproduction
===============

2.1) Source setup
-----------------

The source file, foo.c, makes function calls to:
  a) bar_fn_not_wrapped
    - A regular function, no wrapping
    - Expected undefined symbols: bar_fn_not_wrapped
    - Unexpected undefined symbols: __wrap_bar_fn_not_wrapped,
      __real_bar_fn_not_wrapped
  b) bar_fn_wrapped
    - In the tests we’ll wrap this call
    - Expected undefined symbols: __wrap_bar_fn_wrapped
    - Unexpected undefined symbols: bar_fn_wrapped, __real_bar_fn_wrapped
  c) gettimeofday
    - In the tests we’ll wrap this call
    - Expected undefined symbols: __wrap_gettimeofday
    - Unexpected undefined symbols: gettimeofday, __real_gettimeofday
Also note that foo.c does NOT call the following functions:
  d) sigaction
    - In the tests we’ll wrap this symbol
    - Expected undefined symbols: <none>
    - Unexpected undefined symbols: sigaction, __wrap_sigaction,
      __real_sigaction
  e) bar_fn_other
    - In the tests we’ll wrap this symbol
    - Expected undefined symbols: <none>
    - Unexpected undefined symbols: bar_fn_other, __wrap_bar_fn_other,
      __real_bar_fn_other

This is all summarized in the table below, and we’ll use this to compare
against in the tests:
+--------------------+------+----------+--------------------------------+
| Symbol x           | Used | Wrapped? | Expect undefined symbol ...?   |
|                    |      |          | x        | __wrap_x | __real_x |
+--------------------+------+----------+----------+----------+----------+
| bar_fn_not_wrapped | Y    | N        | Y        | N        | N        |
| bar_fn_wrapped     | Y    | Y        | N        | Y        | N        |
| gettimeofday       | Y    | Y        | N        | Y        | N        |
| sigaction          | N    | Y        | N        | N        | N        |
| bar_fn_other       | N    | Y        | N        | N        | N        |
+--------------------+------+----------+----------+----------+----------+
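
The rule behind the table can be written down as a small C helper. This is
only a sketch of my expectation, not of any linker's implementation; the
names expected_undef and expected_undefined are made up for illustration,
and it assumes (as in foo.c) that the source never references __real_x
directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which undefined symbols are expected for a symbol x (hypothetical type). */
struct expected_undef {
    bool x;       /* undefined reference to x itself   */
    bool wrap_x;  /* undefined reference to __wrap_x   */
    bool real_x;  /* undefined reference to __real_x   */
};

/*
 * used:    the source calls x
 * wrapped: x is passed to the linker as --wrap=x
 *
 * Assumption: the source never refers to __real_x, so it is never expected.
 */
void
expected_undefined(bool used, bool wrapped, struct expected_undef *out)
{
    assert(out != NULL);

    out->x      = used && !wrapped;  /* call stays a reference to x       */
    out->wrap_x = used && wrapped;   /* call is redirected to __wrap_x    */
    out->real_x = false;             /* nothing refers to __real_x        */
}
```

Calling it with (used, wrapped) = (Y, N), (Y, Y) and (N, Y) gives the three
distinct rows of the table.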

2.2) foo.c source code
----------------------
  /* --- foo.c start --- */
  #include <sys/time.h>

  void bar_fn_not_wrapped(void);
  void bar_fn_wrapped(void);

  void
  foo_fn (void)
  {
      struct timeval  tv;
      struct timezone tz;

      bar_fn_wrapped();
      bar_fn_not_wrapped();
      (void)gettimeofday(&tv, &tz);
  }
  /* --- foo.c end --- */


2.3) Test setup
---------------

For each compiler/linker combination below we’ll run the following command:
  $ <compiler> [ -fuse-ld=<optional-linker-choice> ] -fPIC -shared foo.c \
      -Wl,--wrap=sigaction -Wl,--wrap=gettimeofday \
      -Wl,--wrap=bar_fn_wrapped -Wl,--wrap=bar_fn_other -o libfoo.so
And search for the interesting symbols using:
  $ nm -D libfoo.so --undefined-only | grep -E "(sig|get|bar)" \
      | tr -s ' ' | sed 's/^/  /'

The compiler/linker combinations used are:
 a) gcc 4.7.0, gnu-ld
 b) clang-9, gnu-ld
 c) clang-9, llvm-lld-9
 d) clang-11, llvm-lld-11

3) Reproduction Tests:
======================
NB: Bad results are highlighted with *Y* or *N*

a) gcc 4.7.0, gnu-ld
  U bar_fn_not_wrapped
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday

+--------------------+------+----------+--------------------------------+
| Symbol x           | Used | Wrapped? | Undefined symbol ...?          |
|                    |      |          | x        | __wrap_x | __real_x |
+--------------------+------+----------+----------+----------+----------+
| bar_fn_not_wrapped | Y    | N        | Y        | N        | N        |
| bar_fn_wrapped     | Y    | Y        | N        | Y        | N        |
| gettimeofday       | Y    | Y        | N        | Y        | N        |
| sigaction          | N    | Y        | N        | N        | N        |
| bar_fn_other       | N    | Y        | N        | N        | N        |
+--------------------+------+----------+----------+----------+----------+

b) clang-9, gnu-ld
  U bar_fn_not_wrapped
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday

+--------------------+------+----------+--------------------------------+
| Symbol x           | Used | Wrapped? | Undefined symbol ...?          |
|                    |      |          | x        | __wrap_x | __real_x |
+--------------------+------+----------+----------+----------+----------+
| bar_fn_not_wrapped | Y    | N        | Y        | N        | N        |
| bar_fn_wrapped     | Y    | Y        | N        | Y        | N        |
| gettimeofday       | Y    | Y        | N        | Y        | N        |
| sigaction          | N    | Y        | N        | N        | N        |
| bar_fn_other       | N    | Y        | N        | N        | N        |
+--------------------+------+----------+----------+----------+----------+

c) clang-9, llvm-lld-9
  U bar_fn_not_wrapped
  U bar_fn_wrapped
  U gettimeofday
  U __real_bar_fn_wrapped
  U __real_gettimeofday
  w __real_sigaction
  w sigaction
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday
  U __wrap_sigaction

+--------------------+------+----------+--------------------------------+
| Symbol x           | Used | Wrapped? | Undefined symbol ...?          |
|                    |      |          | x        | __wrap_x | __real_x |
+--------------------+------+----------+----------+----------+----------+
| bar_fn_not_wrapped | Y    | N        | Y        | N        | N        |
| bar_fn_wrapped     | Y    | Y        | *Y*      | Y        | *Y*      |
| gettimeofday       | Y    | Y        | *Y*      | Y        | *Y*      |
| sigaction          | N    | Y        | *Y*      | *Y*      | *Y*      |
| bar_fn_other       | N    | Y        | N        | N        | N        |
+--------------------+------+----------+----------+----------+----------+

d) clang-11, llvm-lld-11
   U __wrap_bar_fn_wrapped
   U __wrap_gettimeofday
   U __wrap_sigaction
   U bar_fn_not_wrapped
   U bar_fn_wrapped
   U gettimeofday
   w sigaction

+--------------------+------+----------+--------------------------------+
| Symbol x           | Used | Wrapped? | Undefined symbol ...?          |
|                    |      |          | x        | __wrap_x | __real_x |
+--------------------+------+----------+----------+----------+----------+
| bar_fn_not_wrapped | Y    | N        | Y        | N        | N        |
| bar_fn_wrapped     | Y    | Y        | *Y*      | Y        | N        |
| gettimeofday       | Y    | Y        | *Y*      | Y        | N        |
| sigaction          | N    | Y        | *Y*      | *Y*      | N        |
| bar_fn_other       | N    | Y        | N        | N        | N        |
+--------------------+------+----------+----------+----------+----------+
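
For reference, the way each nm line maps onto a table column in the results
above can be sketched in C. classify_nm_line and the enum names are
hypothetical helpers, not part of any tool. The sketch assumes each line has
the "<type letter> <name>" layout produced by the nm pipeline in the test
setup, with no trailing newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Table column that an undefined symbol belongs to (hypothetical names). */
enum sym_column { COL_X, COL_WRAP_X, COL_REAL_X };

/*
 * Map one line of nm output, e.g. "  U __wrap_gettimeofday" or
 * "  w sigaction", to its table column. On return, *base points at the
 * name of x inside `line` (the symbol name with any prefix skipped).
 */
enum sym_column
classify_nm_line(const char *line, const char **base)
{
    static const char wrap_prefix[] = "__wrap_";
    static const char real_prefix[] = "__real_";
    const char *name;

    assert(line != NULL && base != NULL);

    /* Skip leading blanks, the one-letter symbol type, then more blanks. */
    name = line + strspn(line, " \t");
    if (*name != '\0')
        name++;
    name += strspn(name, " \t");

    if (strncmp(name, wrap_prefix, sizeof wrap_prefix - 1) == 0) {
        *base = name + (sizeof wrap_prefix - 1);
        return COL_WRAP_X;
    }
    if (strncmp(name, real_prefix, sizeof real_prefix - 1) == 0) {
        *base = name + (sizeof real_prefix - 1);
        return COL_REAL_X;
    }
    *base = name;
    return COL_X;
}
```

Both 'U' and 'w' entries are treated as "undefined" here, which is how the
sigaction rows in c) and d) were filled in.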

4) Side observations:
=====================
A few observations made whilst investigating the main issues. It's likely
that these won't be critical to the main report, but they are left here in
case they aid discussion on this topic:

  a) I can’t find much online regarding this behaviour. One interesting
     reference, although it doesn’t explain the above, is
     http://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities:
    - """
      Semantics of --wrap:
      GNU ld and LLD have slightly different --wrap semantics. I use
      "slightly" because in most use cases users will not observe a
      difference. In GNU ld, --wrap only applies to undefined symbols.
      In LLD, --wrap happens after all other symbol resolution steps.
      The implementation is to mangle the symbol table of each object
      file (foo -> __wrap_foo; __real_foo -> foo) so that all relocations
      to foo or __real_foo will be redirected. The LLD semantics have
      the advantage that non-LTO, LTO and relocatable link behaviors are
      consistent. I filed
      https://sourceware.org/bugzilla/show_bug.cgi?id=26358 for GNU ld.
      """
    - Looking at the corresponding bug
      https://sourceware.org/bugzilla/show_bug.cgi?id=26358, the
      suggestion is that LLD is more consistent, but when I tried the
      steps for their partial linking example I observed different
      behaviour.
  b) I wonder if the run-time behaviour is impacted by these unexpected
     undefined symbols, e.g. what happens with -z,now?
     (Not investigated)
  c) I wonder if the build/link-time behaviour is impacted, e.g. what
     happens when linking against a library dependency that has one of
     these missing symbol dependencies?
     (Not investigated)
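
As an aid to reading the quote in a), the renaming it describes
(foo -> __wrap_foo; __real_foo -> foo) can be modelled in C as below. This
is only a model of that prose description, not LLD's actual code, and
wrap_rename is a made-up name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Apply the --wrap renaming from the quote to one symbol reference:
 *   x        -> __wrap_x
 *   __real_x -> x
 * where x is `wrapped`. Any other name is copied unchanged.
 * Returns 0 on success, -1 if `out` is too small for the result.
 */
int
wrap_rename(const char *sym, const char *wrapped, char *out, size_t out_size)
{
    static const char real_prefix[] = "__real_";
    const size_t real_len = sizeof real_prefix - 1;
    int n;

    assert(sym != NULL && wrapped != NULL && out != NULL);

    if (strcmp(sym, wrapped) == 0)
        n = snprintf(out, out_size, "__wrap_%s", wrapped);
    else if (strncmp(sym, real_prefix, real_len) == 0 &&
             strcmp(sym + real_len, wrapped) == 0)
        n = snprintf(out, out_size, "%s", wrapped);
    else
        n = snprintf(out, out_size, "%s", sym);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Under this model, the single gettimeofday reference in foo.c would become
__wrap_gettimeofday and nothing would still refer to gettimeofday, which is
the expectation recorded in the table in section 2.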

