[LLVMbugs] [Bug 15086] New: Tailcall optimization in PIC mode on x86 uses GOT, making lazy dynamic linking impossible

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Sun Jan 27 14:20:15 PST 2013


http://llvm.org/bugs/show_bug.cgi?id=15086

             Bug #: 15086
           Summary: Tailcall optimization in PIC mode on x86 uses GOT,
                    making lazy dynamic linking impossible
           Product: clang
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: -New Bugs
        AssignedTo: unassignedclangbugs at nondot.org
        ReportedBy: dimitry at andric.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


Created attachment 9933
  --> http://llvm.org/bugs/attachment.cgi?id=9933
Testcase for tailcall optimization problem

I had several reports about certain X.org drivers not loading properly
in FreeBSD 10.0-CURRENT, after the default compiler was switched to
clang 3.2.

This occurred only on i386, and after some debugging, it turned out to
be due to the way clang optimizes tail calls in PIC mode, in combination
with the typical way X.org loads its driver modules.

As an example, X.org loads the VMware display driver, vmware_drv.so,
which is a perfectly normal shared library.  This driver needs the
function vgaHWSaveScreen(), from the support library libvgahw.so, which
is a default component of the X.org server itself.

The loading should proceed as follows:
1) xorg-server calls dlopen(..., RTLD_LAZY) to open vmware_drv.so.  Note
   that vmware_drv.so does *not* list libvgahw.so in its dynamic section,
   so libvgahw.so is not automatically loaded here.
2) xorg-server resolves and calls the driver initialization function in
   vmware_drv.so.
3) The driver initialization function calls dlopen(..., RTLD_LAZY) to
   open libvgahw.so.  Then it resolves several functions, one of which
   is the above mentioned vgaHWSaveScreen().
4) The driver initialization function calls vgaHWSaveScreen().
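
Steps 3) and 4) above can be sketched in C roughly as follows.  This is
only an illustration, not the actual X.org loader code: the function
name resolve_save_screen(), the save_screen_fn typedef and its
simplified signature are assumptions made for this example.  Only
dlopen(), dlsym(), dlerror() and dlclose() are real APIs.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Simplified, hypothetical signature; the real function takes a ScreenPtr. */
typedef int (*save_screen_fn)(void *screen, int mode);

/*
 * Open the support library lazily and look up vgaHWSaveScreen in it.
 * Returns NULL if the library or the symbol cannot be found.  On success
 * the handle is deliberately kept open, since the returned function
 * pointer is only valid while the library stays loaded.
 */
save_screen_fn resolve_save_screen(const char *libpath)
{
    void *handle = dlopen(libpath, RTLD_LAZY);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }

    /* Converting through void ** is the usual idiom for dlsym() results. */
    save_screen_fn fn;
    *(void **)(&fn) = dlsym(handle, "vgaHWSaveScreen");
    if (fn == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "unknown error");
        dlclose(handle);
    }
    return fn;
}
```

A driver written this way could call the returned pointer directly.  The
failure described in this report arises when the driver calls
vgaHWSaveScreen() by name instead, relying on lazy binding.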

However, when the driver is compiled by clang at -O2, with -fPIC, the
driver will fail to load at step 1), with:

  [  9820.637] (II) Loading /usr/local/lib/xorg/modules/drivers/vmware_drv.so
  [  9820.644] (EE) Failed to load
                    /usr/local/lib/xorg/modules/drivers/vmware_drv.so:
                    /usr/local/lib/xorg/modules/drivers/vmware_drv.so:
                    Undefined symbol "vgaHWSaveScreen"

In this particular case, the VMware driver contains a function which
tail calls vgaHWSaveScreen():

  static Bool
  VMWARESaveScreen(ScreenPtr pScreen, int mode)
  {
      VmwareLog(("VMWareSaveScreen() mode = %d\n", mode));

      /*
       * This thoroughly fails to do anything useful to svga mode.  I doubt
       * we care; who wants to idle-blank their VM's screen anyway?
       */
      return vgaHWSaveScreen(pScreen, mode);
  }

clang with -fPIC and -O2 turns this into the following:

  VMWARESaveScreen:                       # @VMWARESaveScreen
  # BB#0:                                 # %entry
          calll   .L26$pb
  .L26$pb:
          popl    %eax
  .Ltmp49:
          addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp49-.L26$pb), %eax
          movl    vgaHWSaveScreen@GOT(%eax), %eax
          jmpl    *%eax                   # TAILCALL

So it loads the address of vgaHWSaveScreen() via the GOT.  In contrast,
gcc still uses the PLT, even for a tail call:

  VMWARESaveScreen:
          pushl   %ebp
          movl    %esp, %ebp
          pushl   %ebx
          call    __i686.get_pc_thunk.bx
          addl    $_GLOBAL_OFFSET_TABLE_, %ebx
          subl    $20, %esp
          movl    12(%ebp), %eax
          movl    %eax, 4(%esp)
          movl    8(%ebp), %eax
          movl    %eax, (%esp)
          call    vgaHWSaveScreen@PLT
          addl    $20, %esp
          popl    %ebx
          leave
          ret

The problem here is that GOT relocations must be resolved immediately at
dlopen() time, even with RTLD_LAZY, whereas PLT relocations can be
resolved lazily.  Usually, the PLT entries point to stubs that check
whether the actual function is already resolved and, if not, call into
the dynamic linker to resolve it just in time.
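
The difference between the two binding strategies can be modelled with
a small toy program.  This is a sketch under stated assumptions, not how
any real dynamic linker is implemented: all toy_* names, the symbol
table, and the doubling behaviour of the stand-in function are invented
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*fn_t)(int);

/* Toy stand-in for the symbols exported by the libraries loaded so far. */
struct toy_symbol {
    const char *name;
    fn_t addr;
};

static struct toy_symbol toy_loaded[4];
static size_t toy_count;

/* Hypothetical library function; doubling its argument is arbitrary. */
static int toy_vgaHWSaveScreen(int mode)
{
    return 2 * mode;
}

/* Models dlopen() of the support library: its symbol becomes visible. */
void toy_load_support_library(void)
{
    assert(toy_count < sizeof toy_loaded / sizeof toy_loaded[0]);
    toy_loaded[toy_count].name = "vgaHWSaveScreen";
    toy_loaded[toy_count].addr = toy_vgaHWSaveScreen;
    toy_count++;
}

static fn_t toy_lookup(const char *name)
{
    for (size_t i = 0; i < toy_count; i++)
        if (strcmp(toy_loaded[i].name, name) == 0)
            return toy_loaded[i].addr;
    return NULL;
}

/*
 * GOT-style binding: the slot has to be filled while the driver is being
 * loaded, so a symbol that is not yet available makes the load fail (-1).
 */
int toy_load_driver_eager(const char *name)
{
    fn_t got_slot = toy_lookup(name);
    return got_slot != NULL ? 0 : -1;
}

/*
 * PLT-style binding: the lookup is deferred until the first call.  This
 * toy keeps a single slot, so it models one imported function only.
 * Returns -1 if the symbol is still unresolved at call time.
 */
int toy_call_lazy(const char *name, int arg)
{
    static fn_t plt_slot; /* NULL until the first successful lookup */
    if (plt_slot == NULL)
        plt_slot = toy_lookup(name);
    return plt_slot != NULL ? plt_slot(arg) : -1;
}
```

In this model, toy_load_driver_eager() fails if it runs before
toy_load_support_library().  toy_call_lazy() succeeds as long as the
library has been loaded by the time of the first call, which is the
ordering the X.org module loader relies on.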

I am convinced clang should not use GOT relocations here, but should
mimic gcc's behaviour and use PLT relocations instead, although I am not
sure whether a real tail call (using a jump) is still possible then.

In any case, this affects any X.org driver or module that uses a tail
call to reach one of these support library functions.  For example, a
similar error is generated when X.org tries to load the vesa driver:

  [  9820.645] (II) Loading /usr/local/lib/xorg/modules/drivers/vesa_drv.so
  [  9820.660] (EE) Failed to load
                    /usr/local/lib/xorg/modules/drivers/vesa_drv.so:
                    /usr/local/lib/xorg/modules/drivers/vesa_drv.so:
                    Undefined symbol "shadowUpdatePacked"

Furthermore, these errors at loading time disappear when you add
-fno-optimize-sibling-calls to the optimization flags.  Then clang
generates the following for the VMWARESaveScreen() function:

  VMWARESaveScreen:                       # @VMWARESaveScreen
  # BB#0:                                 # %entry
          pushl   %ebp
          movl    %esp, %ebp
          pushl   %ebx
          subl    $8, %esp
          calll   .L26$pb
  .L26$pb:
          popl    %ebx
  .Ltmp49:
          addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp49-.L26$pb), %ebx
          movl    12(%ebp), %eax
          movl    %eax, 4(%esp)
          movl    8(%ebp), %eax
          movl    %eax, (%esp)
          calll   vgaHWSaveScreen@PLT
          addl    $8, %esp
          popl    %ebx
          popl    %ebp
          ret

I have attached a self-contained testcase that mimics the way X.org
loads this driver.  It can be run by unpacking the tarball and typing
"make".  This assumes GNU make, but it should also work with BSD make.

There is a main program that uses dlopen() to load driver.so, which in
turn uses dlopen() to load submod.so.  The submod.so library contains a
vgaHWSaveScreen() function, which is tail called from a
VMWARESaveScreen() function in driver.so.

When clang optimizes the tail call into an indirect jump through the
GOT, the driver.so library will fail to load.

Here is a transcript of a test run with gcc on a recent Ubuntu:

  $ ls -l
  total 24
  -rw-rw-r-- 1 dim dim 578 Jan 27 21:30 driver.c
  -rw-rw-r-- 1 dim dim 121 Jan 27 21:30 driver.h
  -rw-rw-r-- 1 dim dim 868 Jan 27 21:30 main.c
  -rw-rw-r-- 1 dim dim 386 Jan 27 22:22 Makefile
  -rw-rw-r-- 1 dim dim 143 Jan 27 21:30 submod.c
  -rw-rw-r-- 1 dim dim 126 Jan 27 21:30 submod.h
  $ make CC=gcc
  gcc -m32 -O2 -fPIC -DPIC -Wall -Wextra  main.c -o main -ldl
  gcc -m32 -O2 -fPIC -DPIC -Wall -Wextra  -shared driver.c -o driver.so
  gcc -m32 -O2 -fPIC -DPIC -Wall -Wextra  -shared submod.c -o submod.so
  LD_LIBRARY_PATH=. ./main
  main: Loading ./driver.so...
  main: Searching function driver_init...
  main: Calling function driver_init...
  driver_init: Loading ./submod.so...
  vgaHWSaveScreen((nil), 42)...
  main: Result 84, unloading ./driver.so...

This is what is expected.  Then a run with clang 3.2 release:

  $ make clean
  rm -f main driver.so submod.so
  $ make CC=~/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra  main.c -o main -ldl
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra  -shared driver.c -o driver.so
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra  -shared submod.c -o submod.so
  LD_LIBRARY_PATH=. ./main
  main: Loading ./driver.so...
  main: Unable to dlopen ./driver.so: ./driver.so: undefined symbol: vgaHWSaveScreen
  make: *** [all] Error 1

Now the dynamic loader complains about the missing symbol.  Next, we
turn off tail call optimization:

  $ make clean
  rm -f main driver.so submod.so
  $ make CC=~/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang CFLAGS_EXTRA=-fno-optimize-sibling-calls
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra -fno-optimize-sibling-calls main.c -o main -ldl
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra -fno-optimize-sibling-calls -shared driver.c -o driver.so
  /home/dim/Downloads/clang+llvm-3.2-x86_64-linux-ubuntu-12.04/bin/clang -m32 -O2 -fPIC -DPIC -Wall -Wextra -fno-optimize-sibling-calls -shared submod.c -o submod.so
  LD_LIBRARY_PATH=. ./main
  main: Loading ./driver.so...
  main: Searching function driver_init...
  main: Calling function driver_init...
  driver_init: Loading ./submod.so...
  vgaHWSaveScreen((nil), 42)...
  main: Result 84, unloading ./driver.so...

and then the driver loading actually works.
