[PATCH] Fix for PR15086

Dimitry Andric dimitry at andric.com
Mon Oct 21 12:38:22 PDT 2013


On Oct 21, 2013, at 02:49, Nick Lewycky <nicholas at mxc.ca> wrote:
> Dimitry Andric wrote:
>> On Oct 20, 2013, at 23:48, Nick Lewycky<nicholas at mxc.ca>  wrote:
...
>>> Also, what does this do to:
>>> 
>>>  extern int i;
>>>  int foo(void) { return i; }
>>>  int bar(void) { return foo(); }
>>> 
>>> foo is externally visible but this can also be a tail call because we don't permit symbol interposition in cases where inlining would be legal.
>> 
>> There does not seem to be any change in the produced code.  At -O0, the call to foo() in bar() gets done via the PLT, with or without my patch.  With any optimization, clang inlines foo() into bar(), and directly uses i@GOT to access the variable.  Since variables cannot be lazily linked, this is no problem.
> 
> Please try harder. :) Does __attribute__((noinline)) help? Alternatively, skip C and write it as straight .ll (be sure to include the 'tail' marker on the call).

Ah yes, with noinline, foo() is considered a GlobalAddressSDNode, but it is not hidden or protected, so it is treated in the same way as an external call.  So then the jump gets changed into a call.
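
A minimal sketch of the noinline variant of the test case discussed above. The exact source used is not shown in this thread, so this is an assumption; in particular, i is defined here (rather than declared extern) only so that the snippet links on its own:

```c
/* Sketch of the test case with inlining suppressed (assumed form).
 * The original declares "extern int i;"; it is defined here with an
 * arbitrary value so the example is self-contained. */
int i = 42;

/* noinline keeps foo() out of line, so the call in bar() stays a
 * candidate for tail-call optimization instead of being inlined. */
__attribute__((noinline)) int foo(void) { return i; }

/* The call in tail position that the patch affects. */
int bar(void) { return foo(); }
```

Compiling this with something like "clang -m32 -fPIC -O2 -S" should make it possible to compare how the call in bar() is emitted with and without the patch.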

I am not sure if there is a way to distinguish between a GlobalAddressSDNode that is "somewhere else" and a GlobalAddressSDNode that is local to the current compilation unit.  If so, we might turn the former into a call-via-PLT, and the latter into a jump-via-GOT.  Any idea?
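
The distinction asked about here can be sketched as a small decision function. This is a toy model in plain C, not LLVM code: every type, field, and function name below is hypothetical, and whether the information in defined_in_this_unit is actually available at this point in the backend is exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical model of the callee properties discussed above. */
enum visibility { VIS_DEFAULT, VIS_HIDDEN, VIS_PROTECTED };

/* Hypothetical names for the possible outcomes. */
enum lowering { KEEP_TAIL_JUMP, JUMP_VIA_GOT, CALL_VIA_PLT };

struct callee {
    enum visibility vis;
    bool defined_in_this_unit; /* assumed to be known; see lead-in */
};

/* Hidden or protected callees keep their tail jump. For default
 * visibility, the proposal is: local definition -> jump via GOT,
 * definition elsewhere -> ordinary call via PLT. */
enum lowering choose_lowering(struct callee c)
{
    if (c.vis == VIS_HIDDEN || c.vis == VIS_PROTECTED)
        return KEEP_TAIL_JUMP;
    return c.defined_in_this_unit ? JUMP_VIA_GOT : CALL_VIA_PLT;
}
```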

However, if you look at the PIC code generated for a tail jump on i386, it becomes:

bar:                                    # @bar
# BB#0:
        pushl   %ebp
        movl    %esp, %ebp
        calll   .L1$pb
.L1$pb:
        popl    %eax
.Ltmp2:
        addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp2-.L1$pb), %eax
        movl    foo@GOT(%eax), %eax
        popl    %ebp
        jmpl    *%eax                   # TAILCALL

while the call-via-PLT version is:

bar:                                    # @bar
# BB#0:
        pushl   %ebp
        movl    %esp, %ebp
        calll   foo@PLT
        popl    %ebp
        ret

The latter just looks more efficient to me, or am I deluded? :-)

-Dimitry
