[llvm-dev] Tail call optimization is getting affected due to local function related optimization with IPRA

vivek pandya via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 28 09:53:25 PDT 2016


On Tue, Jun 28, 2016 at 8:11 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Jun 27, 2016, at 12:25 PM, vivek pandya <vivekvpandya at gmail.com> wrote:
>
> Hello ,
>
> To solve this bug locally I have given preference to tail call
> optimization over local function related optimization in IPRA. I have added
> following method to achieve this:
>
> bool isEligibleForTailCallOptimization(Function *F) {
>   CallingConv::ID CC = F->getCallingConv();
>   if (CC == CallingConv::Fast || CC == CallingConv::GHC || CC ==
> CallingConv::HiPE)
>     return true;
>   return false;
> }
>
> Any other suggestions are always welcomed.
>
>
> Why aren’t checking for the presence of a tail call?
>
Are you asking about if tail call optimization is enable or not?  If not
then above method is inspired from X86ISelLowering::canGuaranteeTCO().

-Vivek

>
>> Mehdi
>
>
> and I am checking this condition along with hasLocalLinkage() and
> hasAddressTaken().
>
> Due to this test-suite now has only 2 runtime failure where as before this
> there were around 43 due to local function related optimization. But of
> course by giving preference to tail call many opportunities to optimize
> IPRA is missed.
>
> The 2 existing failure are interesting and hard to debug, namely they are
> MultiSource/Applications/SPASS/SPASS and
> MultiSource/Applications/sqlite3/sqlite3
>
> However for test-suite sqlite3 is only complied as CLI program and thus it
> only contains 2 (big) source files. In sqlite3 source code there are few
> static methods with variable number of arguments and due to that these
> static function should not get tail call optimized and thus IPRA
> optimization can be done but I am digging more into this to know reason of
> failures.
>
> Sincerely,
> Vivek
>
>
> On Sun, Jun 26, 2016 at 10:05 PM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
>
>> According to this
>> http://llvm.org/docs/CodeGenerator.html#tail-call-section, it seems that
>> adding a new CC for the purpose of local function optimization seems a good
>> idea because tail call optimization only takes place when both caller and
>> callee have fastcc or GHC or HiPE calling convention.
>>
>> -Vivek
>>
>>
>> On Sun, Jun 26, 2016 at 1:26 AM, vivek pandya <vivekvpandya at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sat, Jun 25, 2016 at 11:03 PM, vivek pandya <vivekvpandya at gmail.com>
>>> wrote:
>>>
>>>> Hello LLVM Community,
>>>>
>>>> To improve Interprocedural Register Allocation (IPRA) we are trying to
>>>> force caller
>>>> saved registers for local functions (which has likage type local). To
>>>> achive it
>>>> I have modified TargetFrameLowering::determineCalleeSaves() to return
>>>> early for
>>>> function which satisfies if (F->hasLocalLinkage() &&
>>>> !F->hasAddressTaken()) and
>>>> also reflecting the fact that for local function there are no caller
>>>> saved registers
>>>> I am also changing RegUsageInfoCollector.cpp to not to mark regiseters
>>>> as callee
>>>> saved in RegMask due to CC with follwoing change in code:
>>>>
>>>> if (!F->hasLocalLinkage() || F->hasAddressTaken()) {
>>>>       const uint32_t *CallPreservedMask =
>>>>         TRI->getCallPreservedMask(MF,
>>>> MF.getFunction()->getCallingConv());
>>>>       // Set callee saved register as preserved.
>>>>       for (unsigned i = 0; i < RegMaskSize; ++i)
>>>>         RegMask[i] = RegMask[i] | CallPreservedMask[i];
>>>>     }
>>>>
>>>> For more details please follow following link.
>>>> https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/bYFMzppXEwAJ
>>>>
>>>> Now consider following bug due to forcing caller saved registers for
>>>> local function
>>>> when IPRA enable:
>>>>
>>>> void makewt(int nw, int *ip, double *w) {
>>>>   ...
>>>>   bitrv2(nw, ip, w);
>>>> }
>>>>
>>>> here bitrv2 is local fuction and for that when IPRA enable callee saved
>>>> registers
>>>> are set to none. So for that function following is set of collbered
>>>> register as
>>>> per regmaks collected by RegUsageInfoCollector pass.
>>>>
>>>> Function Name : bitrv2
>>>> Clobbered Registers:
>>>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI
>>>> ESP RAX
>>>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B
>>>> R9B R10B
>>>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W
>>>> R10W R11W
>>>> R12W R13W R14W R15W
>>>>
>>>> How ever caller of bitrv2, makewt has callee saved registers as per CC,
>>>> but this
>>>> code results in segmentation fault when compliled with O1 because
>>>> makewt has value
>>>> of *ip in R14 register and that is stored and restore by makewt at
>>>> begining of call
>>>> but due to tail call optimization following code is generated and here
>>>> bitrv2 does
>>>> not preserve R14 so whwn execution returns to main (which is caller of
>>>> makewt)
>>>> value of *ip is gone from R14 (which sould not) and when main calls
>>>> makewt again
>>>> then value of *ip (R14) is wrong and result into segmentation fault.
>>>>
>>>> Assembly code of makewt:
>>>>   _makewt:
>>>>   ...
>>>>   popq  %rbx
>>>>   popq  %r12
>>>>   popq  %r13
>>>>   popq  %r14
>>>>   popq  %r15
>>>>   popq  %rbp
>>>>   jmp _bitrv2                 ## TAILCALL
>>>>
>>>
>>> A very naive solution to this problem come to me is to convert above
>>> code to following:
>>>
>>>  _makewt:
>>>   ...
>>>   jmp _bitrv2                 ## TAILCALL
>>>   popq  %rbx
>>>   popq  %r12
>>>   popq  %r13
>>>   popq  %r14
>>>   popq  %r15
>>>   popq  %rbp
>>>
>>> So that when _bitrv2 returns caller will over write callee saved
>>> register ( as per CC of that function ) to correct values.
>>> I wanted to try it out but I am not able to find correct code where I
>>> can do that.
>>> -Vivek
>>>
>>>>
>>>> There is one more case of faluire due to local function related
>>>> optimization.
>>>> I am analysing that (sorry for taking more time but I am not much good
>>>> at assembly).
>>>>
>>>> I need some hints for how to solve this. If you feel some problem with
>>>> my analyses
>>>> please let me know if you want me to send generated .s file and source
>>>> .c file.
>>>>
>>>> Sincerely,
>>>> Vivek
>>>>
>>>
>>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160628/9db8f754/attachment-0001.html>


More information about the llvm-dev mailing list