[llvm-commits] [PATCH] Remove tail marker when changing an argument to an alloca.

Wed Jan 2 06:21:35 PST 2013

Hi Rafael,

On 02/01/13 01:49, Rafael Espíndola wrote:
> On 1 January 2013 12:26, Duncan Sands <baldrick at free.fr> wrote:
>> Hi Rafael, if a call uses a 'byval' parameter, should it be marked as a tail
>> call in the first place?  After all, byval implicitly places a copy of the
>> passed argument on the stack, so the call is implicitly using the stack.
>> There is the question of who puts the copy on the stack: the caller or the
>> callee?  If the former, maybe there isn't a problem after all...
>
> That is an interesting point. On one hand, we cannot produce a tail
> call (as in, an actual jump) for
>
> %X = type { i32 }
> define void @g(%X* byval %b, %X* byval %a) {
> entry:
>    tail call void @f(%X* %a, %X* %b)
>    ret void
> }
> declare void @f(%X* byval, %X* byval)
>
>
> but it would be nice to retain the ability to produce one for
>
> %X = type { i32 }
> define void @g(%X* byval %a) {
> entry:
>    tail call void @f(%X* %a)
>    ret void
> }
> declare void @f(%X* byval)
>
> Since the language spec says the code generator "may optimize", I
> think we can interpret the "does not access any allocas or varargs in
> the caller" at the IL level.

On the other hand, the optimizers will turn recursive tail calls into a
loop.  I didn't check whether this optimization tests for byval, but in
general it would be wrong to generate a loop when there are byval
parameters.
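
A rough C-level sketch of the concern, not taken from the LLVM sources.
The struct X, the global "shared" and all three function names are
hypothetical, and passing a struct by value in C is only an analogy for
byval.  It assumes a naive transform that rebinds the argument pointer on
each iteration instead of making the copy that byval implies:

```c
/* Hypothetical stand-in for %X from the IR examples in this thread. */
struct X { int v; };

/* Memory that the recursive call passes as its by-value argument. */
struct X shared = { 0 };

/* Reference behaviour: every call works on its own private copy. */
int count_recursive(struct X a, int n) {
    a.v += 1;                      /* touches only this call's copy */
    if (n == 0)
        return a.v;
    return count_recursive(shared, n - 1);   /* fresh copy of shared */
}

/* Naive loop (assumed, for illustration): the argument is treated as a
   plain pointer and simply rebound, so no copy is ever made. */
int count_naive_loop(struct X *a, int n) {
    for (;;) {
        a->v += 1;                 /* writes through to the pointee */
        if (n == 0)
            return a->v;
        a = &shared;               /* rebinding, not copying */
        n -= 1;
    }
}

/* Loop that keeps the by-value semantics by copying on each iteration. */
int count_copying_loop(struct X a, int n) {
    for (;;) {
        a.v += 1;
        if (n == 0)
            return a.v;
        a = shared;                /* explicit copy, as byval implies */
        n -= 1;
    }
}
```

With n = 3, the recursive and copying versions return 1 and leave
"shared" untouched, while the naive loop returns 3 and leaves shared.v
at 3.  So a loop is only safe here if it reproduces the copy.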

There may be other cases like this.  I think it would be wise to audit the
code base to see what deductions transforms make based on the tailcall flag
being set.  That should give a better idea of whether it is better to simply
disallow byval or not (or allow some kind of compromise).  Whatever the
conclusion is, it would be good to document it explicitly.

Ciao, Duncan.

> If, for a given function, the code
> generator has to produce stack space for a call, then it is free to
> not do the optimization. It looks like that is what the x86 backend
> does for the above examples.