[PATCH] Fix for PR15086
Dimitry Andric
dimitry at andric.com
Mon Nov 4 13:27:04 PST 2013
On 30 Oct 2013, at 22:39, Dimitry Andric <dimitry at andric.com> wrote:
> On 21 Oct 2013, at 21:38, Dimitry Andric <dimitry at andric.com> wrote:
>> On Oct 21, 2013, at 02:49, Nick Lewycky <nicholas at mxc.ca> wrote:
>>> Dimitry Andric wrote:
>>>> On Oct 20, 2013, at 23:48, Nick Lewycky<nicholas at mxc.ca> wrote:
>> ...
>>>>> Also, what does this do to:
>>>>>
>>>>> extern int i;
>>>>> int foo(void) { return i; }
>>>>> int bar(void) { return foo(); }
>>>>>
>>>>> foo is externally visible but this can also be a tail call because we don't permit symbol interposition in cases where inlining would be legal.
>>>>
>>>> There does not seem to be any change in the produced code. At -O0, the call to foo() in bar() gets done via the PLT, with or without my patch. With any optimization, clang inlines foo() into bar(), and directly uses i at GOT to access the variable. Since variables cannot be lazily linked, this is no problem.
>>>
>>> Please try harder. :) Does __attribute__((noinline)) help? Alternatively, skip C and write it as straight .ll (be sure to include the 'tail' marker on the call).
>>
>> Ah yes, with noinline, foo() is considered a GlobalAddressSDNode, but it is not hidden or protected, so it is treated in the same way as an external call. So then the jump gets changed into a call.
>>
>> I am not sure if there is a way to distinguish between a GlobalAddressSDNode that is "somewhere else" and a GlobalAddressSDNode that is local to the current compilation unit. If so, we might turn the former into a call-via-PLT, and the latter into a jump-via-GOT. Any idea?
>>
>> However, if you look at the PIC code generated for a tail jump on i386, it becomes:
>>
>> bar: # @bar
>> # BB#0:
>> pushl %ebp
>> movl %esp, %ebp
>> calll .L1$pb
>> .L1$pb:
>> popl %eax
>> .Ltmp2:
>> addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp2-.L1$pb), %eax
>> movl foo at GOT(%eax), %eax
>> popl %ebp
>> jmpl *%eax # TAILCALL
>>
>> while the call-via-PLT version is:
>>
>> bar: # @bar
>> # BB#0:
>> pushl %ebp
>> movl %esp, %ebp
>> calll foo at PLT
>> popl %ebp
>> ret
>>
>> The latter just looks more efficient to me, or am I deluded? :-)
>
> So, any thoughts or remarks on this? Would the original patch (minus the small unrelated testcase fix that was already committed) be OK for applying?
>
> I would really like to get this fix in before 3.4 release. :-)
Ping. An updated patch, against trunk r194011 (with a fix for clang wanting more parentheses) is attached. This passes all test cases, as far as I could check.
-Dimitry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-pr15086-2.diff
Type: application/octet-stream
Size: 3114 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131104/c7388434/attachment.obj>
-------------- next part --------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 203 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131104/c7388434/attachment.sig>
More information about the llvm-commits
mailing list