<div dir="ltr"><div>Sorry, I forgot about this thread entirely. I generally don't approve non-trivial changes under lib/Target and lib/CodeGen, I'd rather somebody else with more assembly/object file/backend knowledge took a look.</div>
<div><br></div>On 13 November 2013 07:31, Rafael Espíndola <span dir="ltr"><<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I *think* this is a fine improvement, but please add<br>
<br>
* A comment saying why jmp foo@PLT is not valid or when it would be valid.<br>
* A note in tailcallpic2.ll saying that it could be optimized to a<br>
direct call (no got, no plt). We don't implement ELF's interposition<br>
logic, but then pretend we do during codegen. This patch doesn't need<br>
to fix that, but should note that we could do better.<br>
<br>
Nick, do you agree?<br></blockquote><div><br></div><div>I'm a little confused. Either we should implement ELF interposition logic (and that means blocking inlining of interposable functions) or we shouldn't -- and I think we've decided previously that we shouldn't. If the issue is just that it's a missed optimization then that's fine.</div>
<div><br></div><div>Nick</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
On 13 November 2013 03:15, Alp Toker <<a href="mailto:alp@nuanti.com">alp@nuanti.com</a>> wrote:<br>
> This makes a lot of sense and matters to many users, I'd like to have<br>
> this in 3.4.<br>
><br>
> Guessing Nick held back from signing off on this at the last moment in<br>
> case it disables tail calls causing a performance regression somewhere,<br>
> but from a correctness point of view this seems necessary.<br>
><br>
> So +1 on the ping and thanks for updating the patch Dimitry.<br>
><br>
> PS. It _might_ be worth having a specific test for this with a comment<br>
> explaining what's being tested, just so it doesn't regress again given<br>
> that it was hard to detect in the wild.<br>
><br>
> Alp.<br>
><br>
><br>
> On 13/11/2013 07:50, Dimitry Andric wrote:<br>
>> On 04 Nov 2013, at 22:27, Dimitry Andric <<a href="mailto:dimitry@andric.com">dimitry@andric.com</a>> wrote:<br>
>>> On 30 Oct 2013, at 22:39, Dimitry Andric <<a href="mailto:dimitry@andric.com">dimitry@andric.com</a>> wrote:<br>
>>>> On 21 Oct 2013, at 21:38, Dimitry Andric <<a href="mailto:dimitry@andric.com">dimitry@andric.com</a>> wrote:<br>
>>>>> On Oct 21, 2013, at 02:49, Nick Lewycky <<a href="mailto:nicholas@mxc.ca">nicholas@mxc.ca</a>> wrote:<br>
>>>>>> Dimitry Andric wrote:<br>
>>>>>>> On Oct 20, 2013, at 23:48, Nick Lewycky<<a href="mailto:nicholas@mxc.ca">nicholas@mxc.ca</a>> wrote:<br>
>>>>> ...<br>
>>>>>>>> Also, what does this do to:<br>
>>>>>>>><br>
>>>>>>>> extern int i;<br>
>>>>>>>> int foo(void) { return i; }<br>
>>>>>>>> int bar(void) { return foo(); }<br>
>>>>>>>><br>
>>>>>>>> foo is externally visible but this can also be a tail call because we don't permit symbol interposition in cases where inlining would be legal.<br>
>>>>>>> There does not seem to be any change in the produced code. At -O0, the call to foo() in bar() gets done via the PLT, with or without my patch. With any optimization, clang inlines foo() into bar(), and directly uses i@GOT to access the variable. Since variables cannot be lazily linked, this is no problem.<br>
>>>>>> Please try harder. :) Does __attribute__((noinline)) help? Alternatively, skip C and write it as straight .ll (be sure to include the 'tail' marker on the call).<br>
>>>>> Ah yes, with noinline, foo() is considered a GlobalAddressSDNode, but it is not hidden or protected, so it is treated in the same way as an external call. So then the jump gets changed into a call.<br>
>>>>><br>
>>>>> I am not sure if there is a way to distinguish between a GlobalAddressSDNode that is "somewhere else" and a GlobalAddressSDNode that is local to the current compilation unit. If so, we might turn the former into a call-via-PLT, and the latter into a jump-via-GOT. Any idea?<br>
>>>>><br>
>>>>> However, if you look at the PIC code generated for a tail jump on i386, it becomes:<br>
>>>>><br>
>>>>> bar: # @bar<br>
>>>>> # BB#0:<br>
>>>>> pushl %ebp<br>
>>>>> movl %esp, %ebp<br>
>>>>> calll .L1$pb<br>
>>>>> .L1$pb:<br>
>>>>> popl %eax<br>
>>>>> .Ltmp2:<br>
>>>>> addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp2-.L1$pb), %eax<br>
>>>>> movl foo@GOT(%eax), %eax<br>
>>>>> popl %ebp<br>
>>>>> jmpl *%eax # TAILCALL<br>
>>>>><br>
>>>>> while the call-via-PLT version is:<br>
>>>>><br>
>>>>> bar: # @bar<br>
>>>>> # BB#0:<br>
>>>>> pushl %ebp<br>
>>>>> movl %esp, %ebp<br>
>>>>> calll foo@PLT<br>
>>>>> popl %ebp<br>
>>>>> ret<br>
>>>>><br>
>>>>> The latter just looks more efficient to me, or am I deluded? :-)<br>
>>>> So, any thoughts or remarks on this? Would the original patch (minus the small unrelated testcase fix that was already committed) be OK for applying?<br>
>>>><br>
>>>> I would really like to get this fix in before 3.4 release. :-)<br>
>>> Ping. An updated patch, against trunk r194011 (with a fix for clang wanting more parentheses) is attached. This passes all test cases, as far as I could check.<br>
>> Ping^2 ...<br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> llvm-commits mailing list<br>
>> <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
><br>
> --<br>
> <a href="http://www.nuanti.com" target="_blank">http://www.nuanti.com</a><br>
> the browser experts<br>
><br>
> _______________________________________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</div></div></blockquote></div></div></div>