[PATCH] Fix for PR15086

Dimitry Andric dimitry at andric.com
Sun Oct 20 15:17:08 PDT 2013


On Oct 20, 2013, at 23:48, Nick Lewycky <nicholas at mxc.ca> wrote:
> Dimitry Andric wrote:
...
>> Therefore, I propose the attached fix, which changes X86TargetLowering::LowerCall() to skip tail call optimization, if the called function is a global or external symbol.  I have also updated the appropriate test cases.  If most people would also like to have a separate test case just for this specific PR, let me know and I will add one.
> 
> I agree with Tijl's assessment in the bug. I'd appreciate if you could add a testcase where the caller does have internal linkage and ensure that we do emit a tail call.

There is already such a testcase, test/CodeGen/X86/tailcallpic1.ll, which has:

define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
        ret i32 %a3
}

define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
        %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 )             ; <i32> [#uses=1]
        ret i32 %tmp11
; CHECK: jmp tailcallee
}

The output of that testcase (it generates an asm line with "jmp tailcallee # TAILCALL") is not changed.  The testcase after that, test/CodeGen/X86/tailcallpic2.ll, does change.  This originally has:

; RUN: llc < %s  -tailcallopt -mtriple=i686-pc-linux-gnu -relocation-model=pic | FileCheck %s

define fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
	ret i32 %a3
}

define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
	%tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 )		; <i32> [#uses=1]
	ret i32 %tmp11
; CHECK: movl tailcallee at GOT
; CHECK: jmpl
}

but since tailcallee is now an external function, it gets called via the PLT instead of jumped to.  I am not entirely sure if it would always be safe to use a jump here...


> Also, what does this do to:
> 
>  extern int i;
>  int foo(void) { return i; }
>  int bar(void) { return foo(); }
> 
> foo is externally visible but this can also be a tail call because we don't permit symbol interposition in cases where inlining would be legal.

There does not seem to be any change in the produced code.  At -O0, the call to foo() in bar() gets done via the PLT, with or without my patch.  With any optimization, clang inlines foo() into bar(), and directly uses i at GOT to access the variable.  Since variables cannot be lazily linked, this is no problem.


> Please commit the change to test/ExecutionEngine/MCJIT/test-global-init-nonzero-sm-pic.ll now and pull it out of this patch.

Ah sorry, this was part of my local changes to make the test suite pass successfully.  Since I do not have commit rights, can you please commit it for me?

-Dimitry


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 203 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131021/9f33568e/attachment.sig>


More information about the llvm-commits mailing list