[LLVMdev] Runtime linker issue with X11R6 on i386 with -O3 optimization

Marco Peereboom marco at peereboom.us
Tue Mar 20 10:50:41 PDT 2012


Hi everybody.  I have an odd issue that I'd like to get some advice on.
It is a bit of a long story so please bear with me.

X11R6 has a notion of modules so it basically compiles everything into
shared libraries and at start-of-day it loads libraries (modules) as
needed.  A side effect of that is that they require really lazy binding
because they do (can?) not enforce the load order.
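
For anyone who has not met lazy binding before, the idea can be modelled
in a few lines of C.  This is only a toy and not how ld.so or the X module
loader is actually implemented: jump_slot, lazy_stub, real_draw, call_draw
and resolutions are made-up names, and the "resolver" simply patches a
function pointer the first time it is used.

```c
#include <assert.h>

typedef int (*draw_fn)(int);

static int resolve_count;	/* how many times the toy "resolver" ran */

/* Hypothetical target function, standing in for a module symbol. */
static int real_draw(int npt)
{
	return npt * 2;
}

static int lazy_stub(int npt);

/* Simulated jump slot: starts out pointing at the stub, not the target. */
static draw_fn jump_slot = lazy_stub;

/* The first call lands here, "resolves" the symbol and patches the slot. */
static int lazy_stub(int npt)
{
	assert(jump_slot == lazy_stub);	/* only runs while unpatched */
	resolve_count++;
	jump_slot = real_draw;
	return jump_slot(npt);
}

/* Callers always go through the slot, never straight to the target. */
int call_draw(int npt)
{
	return jump_slot(npt);
}

int resolutions(void)
{
	return resolve_count;
}
```

Nothing is looked up until call_draw() runs for the first time, and later
calls go straight to real_draw().  A reference that has to be resolved when
the module is loaded has no such grace period.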

The problem I am seeing is with any optimization higher than -O0 on the
following code:

void
uxa_check_poly_lines(DrawablePtr pDrawable, GCPtr pGC,
		     int mode, int npt, DDXPointPtr ppt)
{
	ScreenPtr screen = pDrawable->pScreen;

	UXA_FALLBACK(("to %p (%c), width %d, mode %d, count %d\n",
		      pDrawable, uxa_drawable_location(pDrawable),
		      pGC->lineWidth, mode, npt));

	if (pGC->lineWidth == 0) {
		if (uxa_prepare_access(pDrawable, UXA_ACCESS_RW)) {
			if (uxa_prepare_access_gc(pGC)) {
				fbPolyLine(pDrawable, pGC, mode, npt, ppt);
				uxa_finish_access_gc(pGC);
			}
			uxa_finish_access(pDrawable);
		}
		return;
	}
	/* fb calls mi functions in the lineWidth != 0 case. */
	fbPolyLine(pDrawable, pGC, mode, npt, ppt);
}

This code optimizes into a TAILCALL and that makes X unhappy.  To make
things worse, this exact same code works fine on x86_64; I only see
this issue on i386.  Admittedly I have not looked at the x86_64 asm to
look for differences.  All the code was compiled using the clang 3.0
release on OpenBSD.
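
A reduced test case with the same shape might look like the sketch below.
It is not taken from xorg: prepare_stub, finish_stub, poly_line_stub,
check_poly_lines_reduced and the two counters are hypothetical stand-ins,
and the stubs live in the same file only so that it builds on its own.  To
see any difference in relocations, the stubs would presumably have to be
moved into a separate shared object.

```c
#include <assert.h>

static int points_drawn;	/* total points handed to the stub */
static int open_accesses;	/* prepare/finish nesting depth */

static __attribute__((noinline)) int prepare_stub(void)
{
	open_accesses++;
	return 1;
}

static __attribute__((noinline)) void finish_stub(void)
{
	assert(open_accesses > 0);
	open_accesses--;
}

/* Hypothetical stand-in for fbPolyLine. */
static __attribute__((noinline)) void poly_line_stub(int npt)
{
	points_drawn += npt;
}

void check_poly_lines_reduced(int line_width, int npt)
{
	if (line_width == 0) {
		if (prepare_stub()) {
			poly_line_stub(npt);	/* more work follows: not a tail call */
			finish_stub();		/* last call on this path */
		}
		return;
	}
	poly_line_stub(npt);	/* tail position: may become a jmp */
}

int total_points(void)
{
	return points_drawn;
}

int pending_accesses(void)
{
	return open_accesses;
}
```

Building this at -O0 and at -O3 and comparing the disassembly of
check_poly_lines_reduced should show whether the final call is emitted as
a call or as a jump.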

Prototyping the offending functions with __attribute__((weak)) works
around the problem but is pretty ugly and unmaintainable in a project as
old and as large as xorg.  Is there a magic flag I can use to enforce late
binding, or can we consider this a bug of sorts?  I get why clang does
what it does; unfortunately it breaks stuff.
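
For reference, the weak prototype workaround amounts to something like
the sketch below.  hypothetical_poly_line stands in for fbPolyLine and is
deliberately left undefined, and call_if_present is a made-up helper.  It
assumes an ELF toolchain (gcc or clang), where an undefined weak reference
does not fail at link or load time and its address compares equal to NULL
until something defines it.  Whether that is the whole reason the weak
prototype keeps X happy is a guess on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for fbPolyLine; intentionally never defined here. */
extern void hypothetical_poly_line(int npt) __attribute__((weak));

/*
 * Calls the weak symbol only if something has provided a definition.
 * Returns 1 if the call was made and 0 if the symbol is still unresolved.
 */
int call_if_present(int npt)
{
	assert(npt >= 0);
	if (hypothetical_poly_line != NULL) {
		hypothetical_poly_line(npt);
		return 1;
	}
	return 0;
}
```

Without the weak attribute the same file would fail to link as an
executable, since nothing defines hypothetical_poly_line.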

And I'll add the mandatory whine: yes, it works with gcc at all
optimization levels.

I can provide more information if needed.

=======================================================================
-O3
# objdump -R intel_drv.so | grep PolyLine                         
20014754 R_386_GLOB_DAT    fbPolyLine
20014048 R_386_JUMP_SLOT   fbPolyLine

So the problem is that clang generates an R_386_GLOB_DAT relocation for
fbPolyLine in order to load the jump target into %eax, which is then
jumped to at 5589f.  The offending load is at 558a1.  I get the
optimization and think it is pretty cute; however, stuff like X11 relies
on symbols being resolved really, really late :(

There is one extra confusing factor.  If I prototype the fbPolyLine
function with __attribute__((weak)) the same code *does* work.  I have
not dug into this at all but since I found it as an ugly workaround I
figured I'd mention it.

                        if (uxa_prepare_access_gc(pGC)) {
                                fbPolyLine(pDrawable, pGC, mode, npt, ppt);
   55844:       89 44 24 10             mov    %eax,0x10(%esp)
   55848:       8b 45 14                mov    0x14(%ebp),%eax
   5584b:       89 44 24 0c             mov    %eax,0xc(%esp)
   5584f:       8b 45 10                mov    0x10(%ebp),%eax
   55852:       89 44 24 08             mov    %eax,0x8(%esp)
   55856:       89 7c 24 04             mov    %edi,0x4(%esp)
   5585a:       8b 45 08                mov    0x8(%ebp),%eax
   5585d:       89 04 24                mov    %eax,(%esp)
   55860:       89 f3                   mov    %esi,%ebx
   55862:       e8 05 45 fb ff          call   9d6c <_init+0x77c>
   55867:       b8 c0 00 00 00          mov    $0xc0,%eax
   5586c:       23 47 10                and    0x10(%edi),%eax
   5586f:       83 f8 40                cmp    $0x40,%eax
   55872:       75 0d                   jne    55881 <uxa_check_poly_lines+0x171>
   55874:       8b 47 20                mov    0x20(%edi),%eax
   55877:       89 04 24                mov    %eax,(%esp)
   5587a:       89 f3                   mov    %esi,%ebx
   5587c:       e8 bb 5a fb ff          call   b33c <_init+0x1d4c>
   55881:       8b 47 24                mov    0x24(%edi),%eax
   55884:       85 c0                   test   %eax,%eax
   55886:       74 0a                   je     55892 <uxa_check_poly_lines+0x182>
   55888:       89 04 24                mov    %eax,(%esp)
   5588b:       89 f3                   mov    %esi,%ebx
   5588d:       e8 aa 5a fb ff          call   b33c <_init+0x1d4c>
				uxa_finish_access_gc(pGC);
                        }
                        uxa_finish_access(pDrawable);
   55892:       8b 86 c4 09 00 00       mov    0x9c4(%esi),%eax
   55898:       83 c4 24                add    $0x24,%esp
   5589b:       5e                      pop    %esi
   5589c:       5f                      pop    %edi
   5589d:       5b                      pop    %ebx
   5589e:       5d                      pop    %ebp
   5589f:       ff e0                   jmp    *%eax
                }
                return;
        }
        /* fb calls mi functions in the lineWidth != 0 case. */
        fbPolyLine(pDrawable, pGC, mode, npt, ppt);
   558a1:       8b 86 f0 08 00 00       mov    0x8f0(%esi),%eax
   558a7:       eb ef                   jmp    55898 <uxa_check_poly_lines+0x188>


=======================================================================
-O0
# objdump -R intel_drv.so | grep PolyLine 
200143cc R_386_JUMP_SLOT   fbPolyLine

The relevant asm.  The juicy bits are at 8330d and 83358 which are the
calls to fbPolyLine.  Since it is always a direct call all is groovy.

                        if (uxa_prepare_access_gc(pGC)) {
   832cf:       8b 45 ec                mov    0xffffffec(%ebp),%eax
   832d2:       89 04 24                mov    %eax,(%esp)
   832d5:       8b 5d d8                mov    0xffffffd8(%ebp),%ebx
   832d8:       e8 1f 75 f8 ff          call   a7fc <_init+0x119c>
   832dd:       3d 00 00 00 00          cmp    $0x0,%eax
   832e2:       0f 84 38 00 00 00       je     83320 <uxa_check_poly_lines+0x150>
                                fbPolyLine(pDrawable, pGC, mode, npt, ppt);
   832e8:       8b 45 f0                mov    0xfffffff0(%ebp),%eax
   832eb:       8b 4d ec                mov    0xffffffec(%ebp),%ecx
   832ee:       8b 55 e8                mov    0xffffffe8(%ebp),%edx
   832f1:       8b 75 e4                mov    0xffffffe4(%ebp),%esi
   832f4:       8b 7d e0                mov    0xffffffe0(%ebp),%edi
   832f7:       89 04 24                mov    %eax,(%esp)
   832fa:       89 4c 24 04             mov    %ecx,0x4(%esp)
   832fe:       89 54 24 08             mov    %edx,0x8(%esp)
   83302:       89 74 24 0c             mov    %esi,0xc(%esp)
   83306:       89 7c 24 10             mov    %edi,0x10(%esp)
   8330a:       8b 5d d8                mov    0xffffffd8(%ebp),%ebx
   8330d:       e8 fa 6a f8 ff          call   9e0c <_init+0x7ac>
				uxa_finish_access_gc(pGC);
   83312:       8b 45 ec                mov    0xffffffec(%ebp),%eax
   83315:       89 04 24                mov    %eax,(%esp)
   83318:       8b 5d d8                mov    0xffffffd8(%ebp),%ebx
   8331b:       e8 4c 73 f8 ff          call   a66c <_init+0x100c>
                        }
                        uxa_finish_access(pDrawable);
   83320:       8b 45 f0                mov    0xfffffff0(%ebp),%eax
   83323:       89 04 24                mov    %eax,(%esp)
   83326:       8b 5d d8                mov    0xffffffd8(%ebp),%ebx
   83329:       e8 9e 81 f8 ff          call   b4cc <_init+0x1e6c>
                }
                return;
   8332e:       e9 2a 00 00 00          jmp    8335d <uxa_check_poly_lines+0x18d>
        }
        /* fb calls mi functions in the lineWidth != 0 case. */
        fbPolyLine(pDrawable, pGC, mode, npt, ppt);
   83333:       8b 45 f0                mov    0xfffffff0(%ebp),%eax
   83336:       8b 4d ec                mov    0xffffffec(%ebp),%ecx
   83339:       8b 55 e8                mov    0xffffffe8(%ebp),%edx
   8333c:       8b 75 e4                mov    0xffffffe4(%ebp),%esi
   8333f:       8b 7d e0                mov    0xffffffe0(%ebp),%edi
   83342:       89 04 24                mov    %eax,(%esp)
   83345:       89 4c 24 04             mov    %ecx,0x4(%esp)
   83349:       89 54 24 08             mov    %edx,0x8(%esp)
   8334d:       89 74 24 0c             mov    %esi,0xc(%esp)
   83351:       89 7c 24 10             mov    %edi,0x10(%esp)
   83355:       8b 5d d8                mov    0xffffffd8(%ebp),%ebx
   83358:       e8 af 6a f8 ff          call   9e0c <_init+0x7ac>

========================================================================

xorg output:

$ startx 
xauth:  file /home/marco/.serverauth.12707 does not exist


X.Org X Server 1.11.4
Release Date: 2012-01-27
X Protocol Version 11, Revision 0
Build Operating System: OpenBSD 5.1 i386 
Current Operating System: OpenBSD i386.peereboom.us 5.1 GENERIC.MP#4 i386
Build Date: 20 March 2012  11:29:41AM
 
Current version of pixman: 0.24.4
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Mar 20 12:42:07 2012
(==) Using system config directory "/usr/X11R6/share/X11/xorg.conf.d"
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'fbPolyLine'
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'vgaHWSaveScreen'
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'fbPolySegment'
(EE) Failed to load /usr/X11R6/lib/modules/drivers/intel_drv.so: Cannot load specified object
(EE) Failed to load module "intel" (loader failed, 7)
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/vesa_drv.so: undefined symbol 'shadowUpdatePacked'
(EE) Failed to load /usr/X11R6/lib/modules/drivers/vesa_drv.so: Cannot load specified object
(EE) Failed to load module "vesa" (loader failed, 7)
(EE) No drivers available.

Fatal server error:
no screens found

Please consult the The X.Org Foundation support 
         at http://wiki.x.org
 for help. 
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error


