[llvm-dev] Potential bug in LLD when wrapping symbols

Martin Ramsdale via llvm-dev llvm-dev at lists.llvm.org
Mon May 17 03:53:18 PDT 2021


Hi

There appear to be some bugs in LLD's handling of the '--wrap' option that
result in unexpected "undefined symbols" in the output.

I've broken this report into { 1. Issues; 2. Reproduction; 3. Reproduction
Tests; 4. Side Observations }.
Please can somebody take a look at {1,2,3} to confirm whether these are
genuine issues?

Many Thanks,
Martin

1) Issues:
==========
  1.1) Using '--wrap x' results in undefined references to 'x' even when
       '__real_x' isn't used in the source code.
    - i.e. the symbol table implies there is a dependency where one doesn't
      exist
    - Present in at least LLD-9, LLD-11
  1.2) Using '--wrap x' where 'x' is in
       "llvm-project/compiler-rt/lib/dfsan/done_abilist.txt"* results in
       undefined references even when 'x' is never used
    - i.e. the symbol table implies there is a dependency where one doesn't
      exist, and exposes internal details of the linker implementation
    - *There is a strong correlation with the symbols on this list, but this
      has NOT been proven to be the root cause(!)
    - Present in at least LLD-9, LLD-11

Whilst investigating the above I also came across the following issue, which
appears to be resolved in LLD-11. I am recording it here in case it is
relevant, or helps anybody else who encounters it:
  1.3) Using '--wrap x' results in undefined references to '__real_x'
    - i.e. the symbol table implies there is a dependency where one doesn't
      exist
    - Present in at least LLD-9, and resolved via
      https://reviews.llvm.org/D34993

2) Reproduction
===============

2.1) Source setup
-----------------

The source file, foo.c, has function calls to:
  a) bar_fn_not_wrapped
    - A regular function, no wrapping
    - Expected undefined symbols: bar_fn_not_wrapped
    - Unexpected undefined symbols: __wrap_bar_fn_not_wrapped,
      __real_bar_fn_not_wrapped
  b) bar_fn_wrapped
    - In the tests we’ll wrap this call
    - Expected undefined symbols: __wrap_bar_fn_wrapped
    - Unexpected undefined symbols: bar_fn_wrapped, __real_bar_fn_wrapped
  c) gettimeofday
    - In the tests we’ll wrap this call
    - Expected undefined symbols: __wrap_gettimeofday
    - Unexpected undefined symbols: gettimeofday, __real_gettimeofday
Also note foo.c does NOT call the following functions:
  d) sigaction
    - In the tests we’ll wrap this call
    - Expected undefined symbols: <none>
    - Unexpected undefined symbols: sigaction, __wrap_sigaction,
      __real_sigaction
  e) bar_fn_other
    - In the tests we’ll wrap this call
    - Expected undefined symbols: <none>
    - Unexpected undefined symbols: bar_fn_other, __wrap_bar_fn_other,
      __real_bar_fn_other

This is all summarized in the table below, and we’ll use this to compare
against in the tests:
+--------------------+-----------------+-------------------+-----------------------------------+
| Symbol x           | x used in foo.c | Wrapped in tests? | Expect undefined symbol to ... ?  |
|                    |                 |                   | x         | __wrap_x  | __real_x  |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
| bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
| bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
| gettimeofday       | Y               | Y                 | N         | Y         | N         |
| sigaction          | N               | Y                 | N         | N         | N         |
| bar_fn_other       | N               | Y                 | N         | N         | N         |
+--------------------+-----------------+-------------------+-----------+-----------+-----------+
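
The rule behind the table can be written down compactly. Below is a minimal
sketch in C, assuming the only inputs are whether foo.c references x and
whether the link passes --wrap=x. The struct and function names are
hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which of x, __wrap_x and __real_x the report expects to be undefined. */
struct expected_undef {
    bool x;
    bool wrap_x;
    bool real_x;
};

/*
 * used:    foo.c references x
 * wrapped: the link passes --wrap=x
 * foo.c never names __real_x, so that column is always false.
 */
void expect_undefined(bool used, bool wrapped, struct expected_undef *out)
{
    assert(out != NULL);
    out->x      = used && !wrapped;  /* plain reference survives only if not wrapped */
    out->wrap_x = used && wrapped;   /* a wrapped reference is redirected to __wrap_x */
    out->real_x = false;
}
```

For example, expect_undefined(true, true, &e) sets only e.wrap_x, which is
the gettimeofday row, and expect_undefined(false, true, &e) sets nothing,
which is the sigaction row.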

2.2) foo.c source code
----------------------
  /* --- foo.c start --- */
  #include <sys/time.h>

  void bar_fn_not_wrapped(void);
  void bar_fn_wrapped(void);

  void
  foo_fn (void)
  {
      struct timeval  tv;
      struct timezone tz;

      bar_fn_wrapped();
      bar_fn_not_wrapped();
      (void)gettimeofday(&tv, &tz);
  }
  /* --- foo.c end --- */


2.3) Test setup
---------------

For each compiler/linker combination below we’ll run the following command:
  $ <compiler> [ -fuse-ld=<optional-linker-choice> ] -fPIC -shared foo.c \
    -Wl,--wrap=sigaction -Wl,--wrap=gettimeofday \
    -Wl,--wrap=bar_fn_wrapped -Wl,--wrap=bar_fn_other -o libfoo.so
And search for the interesting symbols using:
  $ nm -D libfoo.so --undefined-only | grep -E "(sig|get|bar)" | tr -s ' ' \
    | sed 's/^/  /'
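
Each name printed by the nm command falls into one of the three result
columns (x, __wrap_x, __real_x) for a given wrapped symbol x. A minimal
sketch of that mapping in C follows. The enum and function names are
hypothetical and not part of any tool:

```c
#include <assert.h>
#include <string.h>

enum column { COL_OTHER, COL_X, COL_WRAP_X, COL_REAL_X };

/* Return the table column that `name` belongs to for wrapped symbol `x`. */
enum column classify(const char *name, const char *x)
{
    static const char wrap_prefix[] = "__wrap_";
    static const char real_prefix[] = "__real_";
    const size_t n = sizeof wrap_prefix - 1;  /* both prefixes have this length */

    assert(name != NULL && x != NULL);

    if (strcmp(name, x) == 0)
        return COL_X;
    if (strncmp(name, wrap_prefix, n) == 0 && strcmp(name + n, x) == 0)
        return COL_WRAP_X;
    if (strncmp(name, real_prefix, n) == 0 && strcmp(name + n, x) == 0)
        return COL_REAL_X;
    return COL_OTHER;
}
```

With x = "sigaction", the names sigaction, __wrap_sigaction and
__real_sigaction map to the x, __wrap_x and __real_x columns respectively.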

The compiler/linkers used are:
 a) gcc 4.7.0, gnu-ld
 b) clang-9, gnu-ld
 c) clang-9, llvm-lld-9
 d) clang-11, llvm-lld-11

3) Reproduction Tests:
======================
NB: Bad results are highlighted with *Y* or *N*

a) gcc 4.7.0, gnu-ld
  U bar_fn_not_wrapped
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday


  +--------------------+-----------------+-------------------+-----------------------------------+
  | Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
  |                    |                 |                   | x         | __wrap_x  | __real_x  |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+
  | bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
  | bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
  | gettimeofday       | Y               | Y                 | N         | Y         | N         |
  | sigaction          | N               | Y                 | N         | N         | N         |
  | bar_fn_other       | N               | Y                 | N         | N         | N         |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+

b) clang-9, gnu-ld
  U bar_fn_not_wrapped
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday


  +--------------------+-----------------+-------------------+-----------------------------------+
  | Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
  |                    |                 |                   | x         | __wrap_x  | __real_x  |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+
  | bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
  | bar_fn_wrapped     | Y               | Y                 | N         | Y         | N         |
  | gettimeofday       | Y               | Y                 | N         | Y         | N         |
  | sigaction          | N               | Y                 | N         | N         | N         |
  | bar_fn_other       | N               | Y                 | N         | N         | N         |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+

c) clang-9, llvm-lld-9
  U bar_fn_not_wrapped
  U bar_fn_wrapped
  U gettimeofday
  U __real_bar_fn_wrapped
  U __real_gettimeofday
  w __real_sigaction
  w sigaction
  U __wrap_bar_fn_wrapped
  U __wrap_gettimeofday
  U __wrap_sigaction


  +--------------------+-----------------+-------------------+-----------------------------------+
  | Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
  |                    |                 |                   | x         | __wrap_x  | __real_x  |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+
  | bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
  | bar_fn_wrapped     | Y               | Y                 | *Y*       | Y         | *Y*       |
  | gettimeofday       | Y               | Y                 | *Y*       | Y         | *Y*       |
  | sigaction          | N               | Y                 | *Y*       | *Y*       | *Y*       |
  | bar_fn_other       | N               | Y                 | N         | N         | N         |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+

d) clang-11, llvm-lld-11
   U __wrap_bar_fn_wrapped
   U __wrap_gettimeofday
   U __wrap_sigaction
   U bar_fn_not_wrapped
   U bar_fn_wrapped
   U gettimeofday
   w sigaction


  +--------------------+-----------------+-------------------+-----------------------------------+
  | Symbol x           | x used in foo.c | Wrapped in tests? | Undefined symbol to ... ?         |
  |                    |                 |                   | x         | __wrap_x  | __real_x  |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+
  | bar_fn_not_wrapped | Y               | N                 | Y         | N         | N         |
  | bar_fn_wrapped     | Y               | Y                 | *Y*       | Y         | N         |
  | gettimeofday       | Y               | Y                 | *Y*       | Y         | N         |
  | sigaction          | N               | Y                 | *Y*       | *Y*       | N         |
  | bar_fn_other       | N               | Y                 | N         | N         | N         |
  +--------------------+-----------------+-------------------+-----------+-----------+-----------+

4) Side observations:
=====================
A few observations made whilst investigating the main issues. These are
unlikely to be critical to the main report, but are left here in case they
aid discussion on this topic:

  a) I can’t find much online regarding this behaviour. One interesting
     reference, although it doesn’t explain the above, is at
     http://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities:
    - """
      Semantics of --wrap:
      GNU ld and LLD have slightly different --wrap semantics. I use
      "slightly" because in most use cases users will not observe a
      difference.
      In GNU ld, --wrap only applies to undefined symbols. In LLD, --wrap
      happens after all other symbol resolution steps. The implementation is
      to mangle the symbol table of each object file (foo -> __wrap_foo;
      __real_foo -> foo) so that all relocations to foo or __real_foo will
      be redirected.
      The LLD semantics have the advantage that non-LTO, LTO and relocatable
      link behaviors are consistent. I filed
      https://sourceware.org/bugzilla/show_bug.cgi?id=26358 for GNU ld.
      """
    - Looking at the corresponding bug
      https://sourceware.org/bugzilla/show_bug.cgi?id=26358, the suggestion
      is that LLD is more consistent, but when I tried the steps for their
      partial linking example it demonstrated different behaviour.
  b) I wonder if the run-time behaviour with these unexpected undefined
     symbols is impacted? e.g. what happens with -z,now? (Not investigated)
  c) I wonder if the build-link-time behaviour is impacted. e.g. what
     happens when linking with a library dependency that has one of these
     missing symbol dependencies? (Not investigated)
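
The renaming rule quoted in (a) (foo -> __wrap_foo; __real_foo -> foo) can
be modelled in a few lines of C. This is only a sketch of the rule as the
blog post states it, not LLD's actual implementation. apply_wrap and its
parameters are hypothetical names:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite one symbol reference as the quoted text describes for
 * --wrap=<wrapped>:
 *   <wrapped>        -> __wrap_<wrapped>
 *   __real_<wrapped> -> <wrapped>
 * Any other name is copied through unchanged. Returns `out`.
 */
const char *apply_wrap(const char *ref, const char *wrapped,
                       char *out, size_t out_size)
{
    static const char real_prefix[] = "__real_";
    const size_t real_len = sizeof real_prefix - 1;

    assert(ref != NULL && wrapped != NULL && out != NULL && out_size > 0);

    if (strcmp(ref, wrapped) == 0) {
        snprintf(out, out_size, "__wrap_%s", wrapped);
    } else if (strncmp(ref, real_prefix, real_len) == 0 &&
               strcmp(ref + real_len, wrapped) == 0) {
        snprintf(out, out_size, "%s", wrapped);
    } else {
        snprintf(out, out_size, "%s", ref);
    }
    return out;
}
```

Under this model a reference to gettimeofday becomes a reference to
__wrap_gettimeofday and no reference to plain gettimeofday remains, which
is the outcome the table in section 2.1 expects.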