[PATCH] Fix for PR15086

Rafael EspĂ­ndola rafael.espindola at gmail.com
Wed Nov 13 07:31:08 PST 2013


I *think* this is a fine improvement, but please add

* A comment saying why jmp foo at PLT is not valid or when it would be valid.
* A note in tailcallpic2.ll saying that it could be optimized to a
direct call (no got, no plt). We don't implement ELF's interposition
logic, but then pretend we do during codegen. This patch doesn't need
to fix that, but should note that we could do better.

Nick, do you agree?


On 13 November 2013 03:15, Alp Toker <alp at nuanti.com> wrote:
> This makes a lot of sense and matters to many users, I'd like to have
> this in 3.4.
>
> Guessing Nick held back from signing off on this at the last moment in
> case it disables tail calls causing a performance regression somewhere,
> but from a correctness point of view this seems necessary.
>
> So +1 on the ping and thanks for updating the patch Dimitry.
>
> PS. It _might_ be worth having a specific test for this with a comment
> explaining what's being tested, just so it doesn't regress again given
> that it was hard to detect in the wild.
>
> Alp.
>
>
> On 13/11/2013 07:50, Dimitry Andric wrote:
>> On 04 Nov 2013, at 22:27, Dimitry Andric <dimitry at andric.com> wrote:
>>> On 30 Oct 2013, at 22:39, Dimitry Andric <dimitry at andric.com> wrote:
>>>> On 21 Oct 2013, at 21:38, Dimitry Andric <dimitry at andric.com> wrote:
>>>>> On Oct 21, 2013, at 02:49, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>>> Dimitry Andric wrote:
>>>>>>> On Oct 20, 2013, at 23:48, Nick Lewycky<nicholas at mxc.ca>  wrote:
>>>>> ...
>>>>>>>> Also, what does this do to:
>>>>>>>>
>>>>>>>> extern int i;
>>>>>>>> int foo(void) { return i; }
>>>>>>>> int bar(void) { return foo(); }
>>>>>>>>
>>>>>>>> foo is externally visible but this can also be a tail call because we don't permit symbol interposition in cases where inlining would be legal.
>>>>>>> There does not seem to be any change in the produced code.  At -O0, the call to foo() in bar() gets done via the PLT, with or without my patch.  With any optimization, clang inlines foo() into bar(), and directly uses i at GOT to access the variable.  Since variables cannot be lazily linked, this is no problem.
>>>>>> Please try harder. :) Does __attribute__((noinline)) help? Alternatively, skip C and write it as straight .ll (be sure to include the 'tail' marker on the call).
>>>>> Ah yes, with noinline, foo() is considered a GlobalAddressSDNode, but it is not hidden or protected, so it is treated in the same way as an external call.  So then the jump gets changed into a call.
>>>>>
>>>>> I am not sure if there is a way to distinguish between a GlobalAddressSDNode that is "somewhere else" and a GlobalAddressSDNode that is local to the current compilation unit.  If so, we might turn the former into a call-via-PLT, and the latter into a jump-via-GOT.  Any idea?
>>>>>
>>>>> However, if you look at the PIC code generated for a tail jump on i386, it becomes:
>>>>>
>>>>> bar:                                    # @bar
>>>>> # BB#0:
>>>>>      pushl   %ebp
>>>>>      movl    %esp, %ebp
>>>>>      calll   .L1$pb
>>>>> .L1$pb:
>>>>>      popl    %eax
>>>>> .Ltmp2:
>>>>>      addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp2-.L1$pb), %eax
>>>>>      movl    foo at GOT(%eax), %eax
>>>>>      popl    %ebp
>>>>>      jmpl    *%eax                   # TAILCALL
>>>>>
>>>>> while the call-via-PLT version is:
>>>>>
>>>>> bar:                                    # @bar
>>>>> # BB#0:
>>>>>      pushl   %ebp
>>>>>      movl    %esp, %ebp
>>>>>      calll   foo at PLT
>>>>>      popl    %ebp
>>>>>      ret
>>>>>
>>>>> The latter just looks more efficient to me, or am I deluded? :-)
>>>> So, any thoughts or remarks on this?  Would the original patch (minus the small unrelated testcase fix that was already committed) be OK for applying?
>>>>
>>>> I would really like to get this fix in before 3.4 release. :-)
>>> Ping.  An updated patch, against trunk r194011 (with a fix for clang wanting more parentheses) is attached.  This passes all test cases, as far as I could check.
>> Ping^2 ...
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> --
> http://www.nuanti.com
> the browser experts
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list