[LLVMdev] Runtime linker issue with X11R6 on i386 with -O3 optimization
Marco Peereboom
marco at peereboom.us
Tue Mar 20 10:50:41 PDT 2012
Hi everybody. I have an odd issue that I'd like to get some advice on.
It is a bit of a long story so please bear with me.
X11R6 has a notion of modules, so it basically compiles everything into
shared libraries and at start-of-day it loads libraries (modules) as
needed. A side effect is that the modules require really lazy binding
because they do (can?) not enforce the load order.
The problem I am seeing is with any optimization higher than -O0 on the
following code:
void
uxa_check_poly_lines(DrawablePtr pDrawable, GCPtr pGC,
                     int mode, int npt, DDXPointPtr ppt)
{
    ScreenPtr screen = pDrawable->pScreen;
    UXA_FALLBACK(("to %p (%c), width %d, mode %d, count %d\n",
                  pDrawable, uxa_drawable_location(pDrawable),
                  pGC->lineWidth, mode, npt));
    if (pGC->lineWidth == 0) {
        if (uxa_prepare_access(pDrawable, UXA_ACCESS_RW)) {
            if (uxa_prepare_access_gc(pGC)) {
                fbPolyLine(pDrawable, pGC, mode, npt, ppt);
                uxa_finish_access_gc(pGC);
            }
            uxa_finish_access(pDrawable);
        }
        return;
    }
    /* fb calls mi functions in the lineWidth != 0 case. */
    fbPolyLine(pDrawable, pGC, mode, npt, ppt);
}
This code optimizes into a TAILCALL and that makes X unhappy. To make
things worse, this exact same code works fine on x86_64; I only see
this issue on i386. Admittedly I have not looked at the x86_64 asm for
differences. All the code was compiled using the clang 3.0 release
on OpenBSD.
Prototyping the offending functions with __attribute__((weak)) works
around the problem, but that is pretty ugly and unmaintainable in a
project as old and as large as xorg. Is there a magic flag I can use to
enforce this behavior, or can we consider this a bug of sorts? I get why
clang does what it does; unfortunately it breaks stuff.
And I'll add the mandatory whine, yes it works with gcc at all
optimization levels.
I can provide more information if needed.
=======================================================================
-O3
# objdump -R intel_drv.so | grep PolyLine
20014754 R_386_GLOB_DAT fbPolyLine
20014048 R_386_JUMP_SLOT fbPolyLine
So the problem is that clang generates an R_386_GLOB_DAT relocation for
fbPolyLine in order to load the jump target into %eax, which is then
jumped to at 5589f. The offending code is at 558a1. I get the
optimization and think it is pretty cute; however, stuff like X11 relies
on symbols being loaded really, really late :(
There is one extra confusing factor. If I prototype the fbPolyLine
function with __attribute__((weak)) the same code *does* work. I have
not dug into this at all, but since I found it as an ugly workaround I
figured I'd mention it.
if (uxa_prepare_access_gc(pGC)) {
fbPolyLine(pDrawable, pGC, mode, npt,
ppt);
55844: 89 44 24 10 mov %eax,0x10(%esp)
55848: 8b 45 14 mov 0x14(%ebp),%eax
5584b: 89 44 24 0c mov %eax,0xc(%esp)
5584f: 8b 45 10 mov 0x10(%ebp),%eax
55852: 89 44 24 08 mov %eax,0x8(%esp)
55856: 89 7c 24 04 mov %edi,0x4(%esp)
5585a: 8b 45 08 mov 0x8(%ebp),%eax
5585d: 89 04 24 mov %eax,(%esp)
55860: 89 f3 mov %esi,%ebx
55862: e8 05 45 fb ff call 9d6c <_init+0x77c>
55867: b8 c0 00 00 00 mov $0xc0,%eax
5586c: 23 47 10 and 0x10(%edi),%eax
5586f: 83 f8 40 cmp $0x40,%eax
55872: 75 0d jne 55881 <uxa_check_poly_lines+0x171>
55874: 8b 47 20 mov 0x20(%edi),%eax
55877: 89 04 24 mov %eax,(%esp)
5587a: 89 f3 mov %esi,%ebx
5587c: e8 bb 5a fb ff call b33c <_init+0x1d4c>
55881: 8b 47 24 mov 0x24(%edi),%eax
55884: 85 c0 test %eax,%eax
55886: 74 0a je 55892 <uxa_check_poly_lines+0x182>
55888: 89 04 24 mov %eax,(%esp)
5588b: 89 f3 mov %esi,%ebx
5588d: e8 aa 5a fb ff call b33c <_init+0x1d4c>
uxa_finish_access_gc(pGC);
}
uxa_finish_access(pDrawable);
55892: 8b 86 c4 09 00 00 mov 0x9c4(%esi),%eax
55898: 83 c4 24 add $0x24,%esp
5589b: 5e pop %esi
5589c: 5f pop %edi
5589d: 5b pop %ebx
5589e: 5d pop %ebp
5589f: ff e0 jmp *%eax
}
return;
}
/* fb calls mi functions in the lineWidth != 0 case. */
fbPolyLine(pDrawable, pGC, mode, npt, ppt);
558a1: 8b 86 f0 08 00 00 mov 0x8f0(%esi),%eax
558a7: eb ef jmp 55898 <uxa_check_poly_lines+0x188>
=======================================================================
-O0
# objdump -R intel_drv.so | grep PolyLine
200143cc R_386_JUMP_SLOT fbPolyLine
The relevant asm. The juicy bits are at 8330d and 83358 which are the
calls to fbPolyLine. Since it is always a direct call all is groovy.
if (uxa_prepare_access_gc(pGC)) {
832cf: 8b 45 ec mov 0xffffffec(%ebp),%eax
832d2: 89 04 24 mov %eax,(%esp)
832d5: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
832d8: e8 1f 75 f8 ff call a7fc <_init+0x119c>
832dd: 3d 00 00 00 00 cmp $0x0,%eax
832e2: 0f 84 38 00 00 00 je 83320 <uxa_check_poly_lines+0x150>
fbPolyLine(pDrawable, pGC, mode, npt,
ppt);
832e8: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
832eb: 8b 4d ec mov 0xffffffec(%ebp),%ecx
832ee: 8b 55 e8 mov 0xffffffe8(%ebp),%edx
832f1: 8b 75 e4 mov 0xffffffe4(%ebp),%esi
832f4: 8b 7d e0 mov 0xffffffe0(%ebp),%edi
832f7: 89 04 24 mov %eax,(%esp)
832fa: 89 4c 24 04 mov %ecx,0x4(%esp)
832fe: 89 54 24 08 mov %edx,0x8(%esp)
83302: 89 74 24 0c mov %esi,0xc(%esp)
83306: 89 7c 24 10 mov %edi,0x10(%esp)
8330a: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
8330d: e8 fa 6a f8 ff call 9e0c <_init+0x7ac>
uxa_finish_access_gc(pGC);
83312: 8b 45 ec mov 0xffffffec(%ebp),%eax
83315: 89 04 24 mov %eax,(%esp)
83318: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
8331b: e8 4c 73 f8 ff call a66c <_init+0x100c>
}
uxa_finish_access(pDrawable);
83320: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
83323: 89 04 24 mov %eax,(%esp)
83326: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
83329: e8 9e 81 f8 ff call b4cc <_init+0x1e6c>
}
return;
8332e: e9 2a 00 00 00 jmp 8335d <uxa_check_poly_lines+0x18d>
}
/* fb calls mi functions in the lineWidth != 0 case. */
fbPolyLine(pDrawable, pGC, mode, npt, ppt);
83333: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
83336: 8b 4d ec mov 0xffffffec(%ebp),%ecx
83339: 8b 55 e8 mov 0xffffffe8(%ebp),%edx
8333c: 8b 75 e4 mov 0xffffffe4(%ebp),%esi
8333f: 8b 7d e0 mov 0xffffffe0(%ebp),%edi
83342: 89 04 24 mov %eax,(%esp)
83345: 89 4c 24 04 mov %ecx,0x4(%esp)
83349: 89 54 24 08 mov %edx,0x8(%esp)
8334d: 89 74 24 0c mov %esi,0xc(%esp)
83351: 89 7c 24 10 mov %edi,0x10(%esp)
83355: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
83358: e8 af 6a f8 ff call 9e0c <_init+0x7ac>
========================================================================
xorg output:
$ startx
xauth: file /home/marco/.serverauth.12707 does not exist
X.Org X Server 1.11.4
Release Date: 2012-01-27
X Protocol Version 11, Revision 0
Build Operating System: OpenBSD 5.1 i386
Current Operating System: OpenBSD i386.peereboom.us 5.1 GENERIC.MP#4 i386
Build Date: 20 March 2012 11:29:41AM
Current version of pixman: 0.24.4
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Mar 20 12:42:07 2012
(==) Using system config directory "/usr/X11R6/share/X11/xorg.conf.d"
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'fbPolyLine'
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'vgaHWSaveScreen'
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/intel_drv.so: undefined symbol 'fbPolySegment'
(EE) Failed to load /usr/X11R6/lib/modules/drivers/intel_drv.so: Cannot load specified object
(EE) Failed to load module "intel" (loader failed, 7)
/usr/X11R6/bin/X:/usr/X11R6/lib/modules/drivers/vesa_drv.so: undefined symbol 'shadowUpdatePacked'
(EE) Failed to load /usr/X11R6/lib/modules/drivers/vesa_drv.so: Cannot load specified object
(EE) Failed to load module "vesa" (loader failed, 7)
(EE) No drivers available.
Fatal server error:
no screens found
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.
Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error