[PATCH] Fix for PR15086

Nick Lewycky nicholas at mxc.ca
Sun Oct 20 14:48:43 PDT 2013


Dimitry Andric wrote:
> Hi,
>
> There finally was a bit of movement again on http://llvm.org/PR15086 , which deals with tail call optimization in GOT PIC mode (used on e.g. Linux and FreeBSD on 32-bit x86 arches).
>
> The basic problem is that mainstream programs such as X.org cannot deal with the way clang optimizes tail calls, as in this example:
>
> int foo(void);
> int bar(void) {
>    return foo();
> }
>
> where the call is transformed to:
>
> 	calll	.L0$pb
> .L0$pb:
> 	popl	%eax
> .Ltmp0:
> 	addl	$_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
> 	movl	foo@GOT(%eax), %eax
> 	popl	%ebp
> 	jmpl	*%eax                   # TAILCALL
>
> However, the GOT references must all be resolved at dlopen() time, and so this approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which usually populates the PLT with stubs that perform the actual resolving.
>
> Therefore, I propose the attached fix, which changes X86TargetLowering::LowerCall() to skip tail call optimization, if the called function is a global or external symbol.  I have also updated the appropriate test cases.  If most people would also like to have a separate test case just for this specific PR, let me know and I will add one.

I agree with Tijl's assessment in the bug. I'd appreciate it if you 
could add a test case where the caller does have internal linkage and 
ensure that we do emit a tail call.
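
As a rough C-level sketch of the kind of input such a test could start 
from (the function names are hypothetical, and the actual regression 
test would presumably be written as LLVM IR rather than C). It assumes 
the interesting case is a callee with internal linkage, since the patch 
keys on whether the called function is a global or external symbol:

```c
/* Hypothetical names, for illustration only. */

/* Internal linkage: the symbol is resolved within this object file,
   so the call needs no GOT or PLT entry. noinline (a GCC/Clang
   extension) keeps the call from simply being inlined away. */
static int __attribute__((noinline)) local_callee(void) {
    return 42;
}

/* Externally visible caller; the call below is in tail position. */
int caller_of_local(void) {
    return local_callee();
}
```

Compiled as 32-bit PIC, the expectation would be that the call in 
caller_of_local() is still emitted as a tail call.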

Also, what does this do to:

   extern int i;
   int foo(void) { return i; }
   int bar(void) { return foo(); }

foo is externally visible, but the call in bar can still be a tail call, 
because we don't permit symbol interposition in cases where inlining 
would be legal.

Please commit the change to 
test/ExecutionEngine/MCJIT/test-global-init-nonzero-sm-pic.ll now and 
pull it out of this patch.

Nick


