[LLVMdev] Broke my tail (call)

Mon Feb 23 08:14:57 PST 2009

On Mon, Feb 23, 2009 at 2:13 PM, Jon Harrop <jon at ffconsultancy.com> wrote:
> Moreover, I now have evidence that LLVM is not behaving as you expect:
>
> 3. Adjusting the return value from this function into sret form results in
> tail call elimination being performed correctly. Note that this is still
> passing a first-class struct by value as an argument to a function through a
> tail call.
Yes, I was wrong: the problem is of a different kind and has nothing to
do with the alloca. As you say, the struct is passed by value, but that
is not the problem either. For example, the following code (which
contains no allocas) will also not be tail call optimized.

define fastcc { { i8*, i8* }*, i8*} @init({ { i8*, i8* }*, i8*}, i32) {
entry:
       %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8*} %0, i32 %1)
       ret { { i8*, i8* }*, i8*} %2
}

The problem has to do with how struct returns are represented
internally by LLVM (in the SelectionDAG) and how the tail call
optimization implementation checks whether it may perform the tail call.
The implementation checks that the <call> node is immediately followed
by a <ret> node. A struct return causes the insertion of a
<merge_values> node between the tail call instruction and the return
instruction node. Because of the intermediate node the backend
believes it must not optimize the tail call.

[result1, result2] = <call ... >
[merged] = <merge_values [result1, result2]>
<ret [merged]>
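
To make the check concrete, here is a toy model in Python. It is only a
sketch of the idea described above, not LLVM's actual code: the Node class
and the may_tail_call function are made-up names for illustration, and the
real backend check looks at more than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a SelectionDAG node."""
    op: str                                   # "call", "merge_values" or "ret"
    operands: List["Node"] = field(default_factory=list)


def may_tail_call(ret: Node) -> bool:
    """Toy version of the check: the <ret> node must consume the <call>
    node's results directly, with no other node in between."""
    if ret.op != "ret" or not ret.operands:
        return False
    return all(operand.op == "call" for operand in ret.operands)


call = Node("call")

# Scalar return: <ret> is fed directly by <call>.
scalar_ret = Node("ret", [call])

# Struct return: a <merge_values> node sits between <call> and <ret>.
struct_ret = Node("ret", [Node("merge_values", [call])])

print(may_tail_call(scalar_ret))   # True
print(may_tail_call(struct_ret))   # False
```

In this model a check that looked through the <merge_values> node to the
<call> behind it would accept the struct case, but whether that is safe
in the real backend is a separate question.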

So the current situation is that tail calls on functions that return a
struct are not optimized (as you correctly observed ;). Sorry.

regards arnold