[LLVMdev] RFC: Tail call optimization X86

Tue Sep 25 11:14:11 PDT 2007

On Sep 25, 2007, at 12:01 AM, Arnold Schwaighofer wrote:

>>> FastCC use to be caller pops arguments so there was no stack
>>> adjustment after the
>>> call to qux. Now FastCC has callee pops arguments on return  
>>> semantics
>>> so the
>>> x86 backend inserts a stack adjustment after the call.
>>>
>>> _array:
>>>       subl    $12, %esp
>>>         movss   LCPI1_0, %xmm0
>>>         mulss   16(%esp), %xmm0
>>>         movss   %xmm0, (%esp)
>>>         call    L_qux$stub
>>>         subl    $4, %esp   << stack adjustment because qux pops
>>> arguments on return
>>>         mulss   LCPI1_0, %xmm0
>>>         addl    $12, %esp
>>>         ret     $4
>>
>> Ok. Please make sure for now this only takes effect if tail call
>> optimization is  turned on though. Also, we need to fold the subl  
>> into
>> addl if possible.
> I think you misunderstood me. I have not added code to insert this
> subl it is done already by the backend (before my changes) with
> certain calling conventions (callee pops arguments). There used to be
> already calling conventions e.g. x86_fastcall that would have cause
> the stack adjustment only fastcc was not one of them. Now that fastcc
> can cause tail call optimization i had to change the convention from
> caller pops arguments to callee pops arguments in order to allow tail
> call optimization in a general way.

Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately  
that's not controlled by options. Ok then.

>>
>>>
>>>
>>>> Index: include/llvm/Target/TargetLowering.h
>>>> ===================================================================
>>>> --- include/llvm/Target/TargetLowering.h     (revision 42247)
>>>> +++ include/llvm/Target/TargetLowering.h     (working copy)
>>>> @@ -851,8 +851,18 @@
>>>>    virtual std::pair<SDOperand, SDOperand>
>>>>    LowerCallTo(SDOperand Chain, const Type *RetTy, bool
>>>> RetTyIsSigned,
>>>>                bool isVarArg, unsigned CallingConv, bool  
>>>> isTailCall,
>>>> -              SDOperand Callee, ArgListTy &Args, SelectionDAG  
>>>> &DAG);
>>>> +              bool isNextInstRet, SDOperand Callee, ArgListTy  
>>>> &Args,
>>>> +              SelectionDAG &DAG);
>>>> +  // IsEligibleForTailCallElimination - Check whether the call is
>>>> eligible for
>>>> +  // tailcall elimination
>>>> +  virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG,
>>>> +                                                bool  
>>>> IsNextInstRet,
>>>> +                                                SDOperand Callee,
>>>> +                                                unsigned CalleeCC)
>>>> const {
>>>> +    return false;
>>>> +  }
>>>> +
>>>>
>>>> I am not too keen on "isNextInstRet". We should never use  
>>>> instruction
>>>> ordering before scheduling to determine whether it's possible to
>>>> perform an optimization. IsEligibleForTailCallElimination() should
>>>> determine the feasibility on its own, no?
>>>
>>> My assumption:
>>> Tail call optimization is performed if the instruction following the
>>> call is
>>> a return and the return uses the result of the call or is a void
>>> return.
>>
>> Ok. But using whether the next instruction is a return at this  
>> time is
>> too restrictive. It's possible there may be instructions that are
>> between the call and the return that do not interfere with tail call
>> optimization.
>>
>>>
>>>
>>> The problem is that at the point (in TargetLowering:LowerCallTo)  
>>> where
>>> IsEligibleForTailCallElimination is called the SDOperand node for  
>>> the
>>> following instructions (we are looking for a return) has not been
>>> build.
>>> So IsEligibleForTailCallElimination can't determine - using
>>> 'Callee'  -
>>> whether the instruction following the call is a return. That is the
>>> reason
>>> why i pass  the IsNextInstRet flag to TargetLowering::LowerCallTo.
>>
>> Right. But I think the right way to deal with this is to assume all
>> the tail calls are eligible for tail call optimizations at this  
>> point.
>> Then checks for eligibility and fix the call nodes after all the
>> instructions are lowered.
> I am not sure that i understand where this point would be. As I view
> it the only place would be in LegalizeDAG.cpp in

No, in SelectionDAGISel.cpp:BuildSelectionDAG(). After line 4610, the  
complete DAG is available. You can add a call here to look at the  
last call and the terminator to decide whether the call is really a  
candidate for tail call optimization. You can fix it up at this point.

Evan

>
> SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
> ...
>  case ISD::CALL:
>     // The only option for this is to custom lower it.
>     Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);
>
> }
>
> would change to (pseudo code)
> SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
>  SDOperand Result = Op;
> ...
>  case ISD::CALL:
>     // The only option for this is to custom lower it.
>
>     Result =  
> modifyCallSoThatTailAttributeReallyIndicatesWhetherThisCallIsEligibleF 
> orTailCallOptimization(Result);
>
>     Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);
>
> }
>
> because any point after that (e.g. scheduling) would be to late
> because the target specific loweringcode (e.g. LowerCALL ->
> LowerCCCall)
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev