<div dir="ltr">Some frontends for LLVM require that LLVM perform tail call optimization (TCO) for correctness. Internally, LLVM refers to TCO of a non-recursive tail call as sibling call optimization, but I'm going to refer to that generically as TCO. Often, functional languages like Scheme have a language-level requirement that TCO occurs for any call in the tail position, and this is usually why users of LLVM have asked for guaranteed TCO functionality.<div>
<br></div><div>In my case, to implement vtable thunks and virtual member pointers in the IA32 Microsoft C++ ABI, I cannot simply forward the arguments to the callee without calling the copy constructor. If I can use a guaranteed tail call, I don't have to emit copy constructor calls, and things are much easier.<div>
<br></div><div>Currently, in order to get guaranteed TCO, frontends have to enable the GuaranteedTailCallOpt codegen option and obey a narrow set of rules, which includes always using fastcc. This is fairly awkward and doesn't solve my use case, since the ABI requires a particular calling convention.<div>
<br></div><div>Instead, I propose that we add a new tail call marker, 'musttail', that guarantees that TCO will occur. I'm open to other naming suggestions. Some strawmen are 'tailonly' or 'guaranteedtail'. Along with it, I propose a conservative set of verifier-enforced rules to ensure that most reasonable backends will be able to perform TCO. The rules also ensure that optimizations, like tail merging, don't accidentally move a call out of tail position.</div>
<div><br></div><div>First, the prototype of the caller and the callee must be "congruent". Two prototypes are congruent if all return and parameter types match except for the pointer types. The pointer types can have different pointee types, but they must have the same address space. In addition, all the ABI-impacting attributes on the parameters must be the same, meaning byval, inalloca, inreg, etc., must all match up.</div>
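<div>To make the congruence rule concrete, here is a minimal C++ sketch over a toy model of prototypes. Everything in it (ToyType, ToyParamAttrs, ToyPrototype, congruent) is a hypothetical name made up for illustration; it is not LLVM's API and not the verifier code from the patch. It also only models the three attributes named above.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of a type; this is NOT LLVM's Type class.
struct ToyType {
    enum Kind { Void, Int, Float, Pointer } kind;
    unsigned bits;       // width for Int/Float; 0 for Void and Pointer
    unsigned addrSpace;  // address space for Pointer; 0 otherwise
    int pointee;         // stand-in id for the pointee type; ignored by the check
};

// The ABI-impacting parameter attributes named in the proposal.
struct ToyParamAttrs {
    bool byval;
    bool inalloca;
    bool inreg;
};

struct ToyParam { ToyType type; ToyParamAttrs attrs; };
struct ToyPrototype { ToyType ret; std::vector<ToyParam> params; };

// Types match exactly, except that pointers only need the same address space.
bool typesMatch(const ToyType &a, const ToyType &b) {
    if (a.kind != b.kind) return false;
    if (a.kind == ToyType::Pointer) return a.addrSpace == b.addrSpace;
    return a.bits == b.bits;
}

bool attrsMatch(const ToyParamAttrs &a, const ToyParamAttrs &b) {
    return a.byval == b.byval && a.inalloca == b.inalloca && a.inreg == b.inreg;
}

// Caller and callee prototypes are congruent if the return types match,
// the parameter counts match, and each parameter matches in type and attributes.
bool congruent(const ToyPrototype &caller, const ToyPrototype &callee) {
    if (!typesMatch(caller.ret, callee.ret)) return false;
    if (caller.params.size() != callee.params.size()) return false;
    for (std::size_t i = 0; i < caller.params.size(); ++i) {
        if (!typesMatch(caller.params[i].type, callee.params[i].type)) return false;
        if (!attrsMatch(caller.params[i].attrs, callee.params[i].attrs)) return false;
    }
    return true;
}

// Usage: a thunk taking (A*, i32) forwarding to a target taking (B*, i32).
void congruenceDemo() {
    ToyType i32 = {ToyType::Int, 32, 0, 0};
    ToyType ptrA = {ToyType::Pointer, 0, 0, 1};
    ToyType ptrB = {ToyType::Pointer, 0, 0, 2};
    ToyParamAttrs none = {false, false, false};
    ToyPrototype thunk = {i32, {{ptrA, none}, {i32, none}}};
    ToyPrototype target = {i32, {{ptrB, none}, {i32, none}}};
    assert(congruent(thunk, target));  // pointee types differ, which is allowed
}
```

<div>In this model a thunk whose 'this' pointer has a different pointee type than the target's is still congruent, while a mismatch in address space or in any of the modelled attributes is rejected.</div>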
</div></div><div><br></div><div>Second, the call must be in tail position. The call must be immediately followed by a bitcast or a ret, both of which must use the result of the call. If there is a bitcast, it must be followed by a ret which uses the bitcast.</div>
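<div>The tail position rule can be sketched the same way. This is a toy model with hypothetical names (ToyInst, inTailPosition), not the verifier code from the patch. It assumes a basic block is a flat list of instructions, that each instruction uses at most one earlier value, and that the call returns a value.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction; NOT LLVM's Instruction class.
struct ToyInst {
    enum Op { Call, BitCast, Ret, Other } op;
    int operand;  // index of the instruction whose value is used, or -1 for none
};

// Returns true if the call at index i is immediately followed by a ret that
// uses its result, or by a bitcast of its result and then a ret of the bitcast.
bool inTailPosition(const std::vector<ToyInst> &block, std::size_t i) {
    if (i >= block.size() || block[i].op != ToyInst::Call) return false;
    std::size_t next = i + 1;
    if (next >= block.size()) return false;
    int value = static_cast<int>(i);  // the value the ret must end up using
    if (block[next].op == ToyInst::BitCast) {
        if (block[next].operand != value) return false;
        value = static_cast<int>(next);
        ++next;
        if (next >= block.size()) return false;
    }
    return block[next].op == ToyInst::Ret && block[next].operand == value;
}

// Usage: call, bitcast of the call, ret of the bitcast.
void tailPositionDemo() {
    std::vector<ToyInst> block = {
        {ToyInst::Call, -1}, {ToyInst::BitCast, 0}, {ToyInst::Ret, 1}};
    assert(inTailPosition(block, 0));
}
```

<div>Any other instruction between the call and the ret, or a ret that returns something other than the call result (or its bitcast), fails the check in this model.</div>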
<div><br></div><div>Importantly, LLVM must perform TCO regardless of the computation necessary to compute the arguments for the callee, which is not the case today.</div><div><br></div><div>I sent a patch to llvm-commits, but I'd like to hear high-level feedback on llvmdev:</div>
<div><a href="http://llvm-reviews.chandlerc.com/D3240">http://llvm-reviews.chandlerc.com/D3240</a><br></div><div><br></div><div>Thanks!</div></div>