[PATCH] Fix for PR15086
Nick Lewycky
nicholas at mxc.ca
Sun Oct 20 14:48:43 PDT 2013
Dimitry Andric wrote:
> Hi,
>
> There finally was a bit of movement again on http://llvm.org/PR15086 , which deals with tail call optimization in GOT PIC mode (used on e.g. Linux and FreeBSD on 32-bit x86 arches).
>
> The basic problem is that mainstream programs such as X.org cannot deal with the way clang optimizes tail calls, as in this example:
>
> int foo(void);
> int bar(void) {
> return foo();
> }
>
> where the call is transformed to:
>
> calll .L0$pb
> .L0$pb:
> popl %eax
> .Ltmp0:
> addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
> movl foo@GOT(%eax), %eax
> popl %ebp
> jmpl *%eax # TAILCALL
>
> However, the GOT references must all be resolved at dlopen() time, and so this approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which usually populates the PLT with stubs that perform the actual resolving.
>
> Therefore, I propose the attached fix, which changes X86TargetLowering::LowerCall() to skip tail call optimization, if the called function is a global or external symbol. I have also updated the appropriate test cases. If most people would also like to have a separate test case just for this specific PR, let me know and I will add one.
I agree with Tijl's assessment in the bug. I'd appreciate it if you could
add a testcase where the caller does have internal linkage and ensure
that we still emit a tail call.
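
A rough C sketch of the kind of input such a testcase could start from. It assumes the case of interest is a tail call whose target has internal linkage, so no PLT stub or lazy binding is involved. The names foo_internal and bar are hypothetical and not taken from the patch, and a real regression test would be written as LLVM IR rather than C.

```c
/* Hypothetical sketch, not part of the patch; all names are illustrative. */

/* Internal linkage: the symbol is not visible outside this translation
   unit, so it cannot be interposed and needs no PLT stub.
   noinline (a GCC/Clang extension) keeps the call from being inlined,
   so that a call instruction remains to be inspected. */
__attribute__((noinline)) static int foo_internal(void) {
    return 42;
}

/* Externally visible caller; the call below is in tail position. */
int bar(void) {
    return foo_internal();
}
```

Compiling this for 32-bit x86 in PIC mode (for example with something like clang -m32 -fPIC -O2 -S, given here as an assumption rather than as the test's actual RUN line) and checking whether bar ends in a jmp instead of a call would show whether the tail call survives the change.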
Also, what does this do to:
extern int i;
int foo(void) { return i; }
int bar(void) { return foo(); }
Here foo is externally visible, but the call can still be a tail call, because
we don't permit symbol interposition in cases where inlining would be legal.
Please commit the change to
test/ExecutionEngine/MCJIT/test-global-init-nonzero-sm-pic.ll now and
pull it out of this patch.
Nick