[LLVMdev] RFC: Tail call optimization X86

Evan Cheng evan.cheng at apple.com
Tue Sep 11 10:26:39 PDT 2007


Hi Arnold,

Thanks for the patch. Some questions and commons:

1. Have you test it against the llvm test suite? Does it work if fp  
elimination optimization is turned off?
2. Please follow llvm coding convention and make sure every line fits  
in 80 columns.
3.
enum NameDecorationStyle {
    None,
    StdCall,
-  FastCall
+  FastCall,
+  FastCC // the normal fastcc calling convention
};

Why is FastCC necessary? Can't you just use FastCall?

4.
def X86tailcall: SDNode<"X86ISD::TAILCALL",     SDT_X86Call,
                          [SDNPHasChain, SDNPOutFlag, SDNPOptInFlag]>;
+def X86truetailcall: SDNode<"X86ISD::TRUETAILCALL",     SDT_X86Call,
+                        [SDNPHasChain, SDNPOutFlag, SDNPOptInFlag]>;
+

Please use X86tailcall. It's not currently used so feel free to  
change its patterns, etc.

5.
+// the following two instructions are used to adjust the stack pointer
+// in the case where the callee has more arguments than the caller
+// an area is created where the return addr can be safely moved to
+let isConvertibleToThreeAddress = 1 in {   // Can transform into LEA.
+def TCADD32ri  : Ii32<0x81, MRM0r, (outs GR32:$dst), (ins GR32: 
$src1, i32imm:$src2),
+                    "add{l}\t{$src2, $dst|$dst, $src2}",
+                    []>;
+}
+
+def TCSUB32ri  : Ii32<0x81, MRM5r, (outs GR32:$dst), (ins GR32: 
$src1, i32imm:$src2),
+                    "sub{l}\t{$src2, $dst|$dst, $src2}",
+                    []>;
+
+//let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in
+//def TCRETURNmi : I<0, Pseudo, (outs), (ins i32mem:$dst, i32imm: 
$offset),
+//                 "#TC_RETURN $dst $offset",
+//                 []>;
+

Why are these needed? They don't look any different from normal add  
and subtraction instructions. Why not just use ADJCALLSTACKDOWN and  
ADJCALLSTACKUP?

6.
+  } else if (RetOpcode ==  X86::TCRETURNdi) {
+    // a tailcall adjust the stack
...
+  } else if (RetOpcode == X86::TCRETURNri) {
+    MBBI = prior(MBB.end());

Seems like there are quite a bit of duplicate code between the 2  
cases. Please refactor.

7.
X86TargetLowering::X86TargetLowering(TargetMachine &TM)
    : TargetLowering(TM) {
+  IsLastCallTailCall = false;

This is probably not a good idea. You are assuming nodes are lowered  
in certain order, that is dangerous. It's probably better to check  
whether the last call is a tail call on the fly as you are processing  
the return node.

8.
-
-  SDOperand Chain = Op.getOperand(0);
-  SDOperand Flag;
-
-  // Copy the result values into the output registers.
-  if (RVLocs.size() != 1 || !RVLocs[0].isRegLoc() ||
-      RVLocs[0].getLocReg() != X86::ST0) {
-    for (unsigned i = 0; i != RVLocs.size(); ++i) {
-      CCValAssign &VA = RVLocs[i];
-      assert(VA.isRegLoc() && "Can only return in registers!");
-      Chain = DAG.getCopyToReg(Chain, VA.getLocReg(), Op.getOperand 
(i*2+1),
-                               Flag);
-      Flag = Chain.getValue(1);
-    }
+  if (IsLastCallTailCall) {
+    IsLastCallTailCall = false;
+    SDOperand TailCall = GetTailCall(Op);
+    SDOperand TargetAddress = TailCall.getOperand(1);
+    SDOperand StackAdjustment = TailCall.getOperand(2);
+    assert ( ((TargetAddress.getOpcode() == ISD::Register &&
+               cast<RegisterSDNode>(TargetAddress)->getReg() ==  
X86::ECX) ||
+              TargetAddress.getOpcode() == ISD::TargetExternalSymbol ||
+              TargetAddress.getOpcode() == ISD::TargetGlobalAddress) &&
+             "Expecting an global address, external symbol, or  
register");
+    assert( StackAdjustment.getOpcode() == ISD::Constant &&  
"Expecting a const value");
+    // TODO: should we add flag from tail call as last operand?
+    return DAG.getNode(X86ISD::TC_RETURN, MVT::Other, Op.getOperand 
(0), TargetAddress, StackAdjustment);
    } else {

Since if (IsLastCallTailCall) { ... } always ends with an early  
exist, you can just place the block before if (RVLocs.size() ...) and  
there is no need to change indentation for the rest of the function.

9.
+// Check to see whether the next instruction following the call is a  
return
+// A function is eligable if caller/callee calling conventions match  
and the
+// function CALL is immediatly followed by a RET
+bool X86TargetLowering::IsEligibleForTailCallElimination(SDOperand  
Call, SelectionDAG& DAG, unsigned CalleeCC, SDOperand Callee) {
+  bool IsEligible = false;
+  SDNode * CallNode = Call.Val;
...

+SDOperand X86TargetLowering::LowerX86_32FastCCCallTo(SDOperand Op,
+                                                     SelectionDAG &DAG,
+                                                     unsigned CC) {
+  DOUT << "LowerX86_32FastCCCallTo\n";
+  SDOperand Chain     = Op.getOperand(0);
+  bool isVarArg       = cast<ConstantSDNode>(Op.getOperand(2))- 
 >getValue() != 0;
+  bool isTailCall     = cast<ConstantSDNode>(Op.getOperand(3))- 
 >getValue() != 0;
+  SDOperand Callee    = Op.getOperand(4);
+  //unsigned NumOps     = (Op.getNumOperands() - 5) / 2;
+
+  // Analyze operands of the call, assigning locations to each operand.
+  SmallVector<CCValAssign, 16> ArgLocs;
+  CCState CCInfo(CC, isVarArg, getTargetMachine(), ArgLocs);
+  CCInfo.AnalyzeCallOperands(Op.Val, CC_X86_32_TailCall);
+  if (isTailCall &&
+      IsEligibleForTailCallElimination(Op, DAG,CC, Callee) &&
+      PerformTailCallOpt) {


IsEligibleForTailCallElimination() should be a target hook. This way  
TargetLowering::LowerCallTo() can determine whether a call is  
eligible for tail call elimination and insert the current ISD::CALL  
node. X86TargetLowering::LowerX86_32FastCCCallTo() will not have to  
handle non-tail calls.

Thanks,

Evan

On Sep 6, 2007, at 7:20 AM, Arnold Schwaighofer wrote:

> Hi Evan,
>
> first off thanks to you and Chris for taking time.
> On 6 Sep 2007, at 00:57, Evan Cheng wrote:
>
>> We'd like to see tail call optimization to be similar to the target
>> independent lowering of ISD::CALL nodes. These are auto-generated
>> from ???CallingConv.td files. Some target specific details such as
>> function address register (ECX in your example) should be coded in
>> these files. Ditto for stack offsets, etc. These should be part of  
>> the
>> fastcc calling convention. The tail call lowering will use these
>> information and target hooks to perform tail call optimization on
>> fastcc calls that are really tail calls.
>>
>> I think it's reasonable to start by doing a (clean) target specific
>> implementation for X86. Just understand we'd like to move to a target
>> independent mechanism so the earlier implementation may not survive.
>>
> Good that roughly sounds like what i did.
> (except probably for the clean bit :)
>
>
>> As for what you described here. I am having a hard time following
>> it. :-) Please send patch.
>>
>
> Okay here is a patch containing the tail call optimization code
> for X86. It mostly does what i have described in preceeding
> emails:
>
> There is new code for lowering fastcc calls:
> LowerX86_32FastCCCallTo
> LowerX86_32FastCCArguments
>
> There is some code checking whether a TAIL CALL really is
> eligible for tail call optimization.
>
> I modified:
> LowerRET
> to create a TC_RETURN node. (used for adjusting stackpointer and
> jumping to function in epilogue, similar to EH_RETURN)
>
> There is a  new calling convention:
> CC_X86_32_TailCall
> which is ~ CC_X86_32_C minus ECX.
>
> There are some modifications in
> X86InstrInfo.td
>
> There are new X86 DAG nodes:
> TRUETAILCALL
> TC_RETURN
>
>
> There is some voodoo in X86RegisterInfo.cpp to deal with the
> RETADDR problem described in X86ISelLowering.cpp comments.
> (callee has more args than caller -> there needs to be space
> between RETADDR stack slot and ebp/spilled registers for safely
> moving the RETADDR to)
>
> any comments, critique, suggestions welcome.
> regards arnold
>
> <tailcall10.patch>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list