[PATCH] D66278: [RISCV] Enable tail call opt for variadic function

Tsung Chun Lin via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 29 01:20:39 PDT 2019


In your example:

__attribute__((noinline))
int print_scaled(unsigned long n, int scale){
  return printf("%lu.%lu", n/scale, n%scale);
}

RISC-V gcc can optimize this call to the variadic printf into a tail call,
because print_scaled passes no arguments to printf on the stack.
No stack frame has to be created for passing arguments, so print_scaled can
jump straight to printf, and control does not need to return to print_scaled
to free the stack.

If a call to a function with varargs passes more than eight arguments, tail
call optimization is not allowed, because some of the arguments are passed on
the stack.

The check can focus on whether a stack frame (created for saving callee-saved
registers, passing parameters, or anything else) is still allocated and not
yet freed at the point of the function call.
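
As a rough illustration of the "no arguments on the stack" condition, here is
a minimal sketch in C. It is only a simplified model, not the code in the
patch: the function name call_fits_in_arg_regs and the idea of describing each
argument by its size in XLEN units (1 or 2) are assumptions made for this
example. It ignores floating-point argument registers and arguments larger
than 2xXLEN, and it treats every variadic 2xXLEN argument as needing an
aligned register pair.

```c
#include <stdbool.h>
#include <stddef.h>

/* Integer argument registers a0..a7. */
enum { NUM_ARG_REGS = 8 };

/*
 * Hypothetical helper (not an LLVM or gcc API).
 * xlen_units[i] is the size of argument i in XLEN units (1 or 2).
 * Arguments at index >= first_vararg are the variadic ones.
 * Returns true if every argument lands in a0..a7, i.e. nothing is
 * passed on the stack and the call is a tail call candidate.
 */
bool call_fits_in_arg_regs(const int *xlen_units, size_t count,
                           size_t first_vararg)
{
    int next_reg = 0;

    for (size_t i = 0; i < count; i++) {
        int units = xlen_units[i];

        /* A variadic 2xXLEN argument starts at an even register. */
        if (units == 2 && i >= first_vararg && (next_reg % 2) != 0)
            next_reg++;

        /* Any part of the argument past a7 would go on the stack. */
        if (next_reg + units > NUM_ARG_REGS)
            return false;

        next_reg += units;
    }
    return true;
}
```

For the print_scaled example above, the call to printf is described by
{1, 1, 1} with first_vararg = 1, so everything fits in registers. On RV32, a
format string followed by three 64-bit varargs fills a0..a7 exactly, because
a1 is skipped for alignment; one more argument of any size would spill to the
stack.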

Bruce Hoult <brucehoult at sifive.com> wrote on Thu, Aug 29, 2019 at 2:23 PM:

> To be concrete, you're talking about whether the function being called
> is varargs, not the function doing the calling? For example:
>
> __attribute__((noinline))
> int print_scaled(unsigned long n, int scale){
>   return printf("%lu.%lu", n/scale, n%scale);
> }
>
> x86_64 gcc tail calls printf at -O2 or -Os. So does RISC-V gcc. Here -Os:
>
> 00000000000101b0 <print_scaled>:
>    101b0:       02b57633                remu    a2,a0,a1
>    101b4:       02b555b3                divu    a1,a0,a1
>    101b8:       0001a537                lui     a0,0x1a
>    101bc:       93050513                addi    a0,a0,-1744 # 19930 <__clzdi2+0x36>
>    101c0:       19c0006f                j       1035c <printf>
>
> Changing the function to...
>
> int print_scaled(unsigned long n, int scale){
>   return printf("%lu.%lu %lu.%lu %lu.%lu %lu.%lu ",
>                 n/scale, n%scale, n/scale, n%scale, n/scale, n%scale,
>                 n/scale, n%scale);
> }
>
> ... prevents tail calling on RISC-V because with nine arguments the
> last n%scale goes on the stack. On x86_64 the last four arguments are
> pushed.
>
> Eliminating one pair enables tail calling on RISC-V, but x86_64 still
> spills to the stack. x86_64 does tail-call with two pairs (five
> arguments total); a third pair would need seven arguments, more than
> its six integer argument registers.
>
> It gets trickier if the calling function creates a stack frame, for
> example because it calls some other function(s) as well, or simply has
> too many live local variables. In this case the arguments for the
> printf need to be set up, then ra and any s registers reloaded and the
> stack popped, before the tail call.
>
> __attribute__((noinline))
> int power10(int n){
>   return n == 0 ? 1 : 10 * power10(n - 1);
> }
>
> __attribute__((noinline))
> int print_scaled(unsigned long n, int digits){
>   int scale = power10(digits);
>   return printf("%lu.%lu", n/scale, n%scale);
> }
>
> 00000000000101c0 <print_scaled>:
>    101c0:       1141                    addi    sp,sp,-16
>    101c2:       e022                    sd      s0,0(sp)
>    101c4:       842a                    mv      s0,a0
>    101c6:       852e                    mv      a0,a1
>    101c8:       e406                    sd      ra,8(sp)
>    101ca:       fe5ff0ef                jal     ra,101ae <power10>
>    101ce:       02a47633                remu    a2,s0,a0
>    101d2:       60a2                    ld      ra,8(sp)
>    101d4:       02a455b3                divu    a1,s0,a0
>    101d8:       6402                    ld      s0,0(sp)
>    101da:       0001a537                lui     a0,0x1a
>    101de:       95050513                addi    a0,a0,-1712 # 19950 <__clzdi2+0x32>
>    101e2:       0141                    addi    sp,sp,16
>    101e4:       19c0006f                j       10380 <printf>
>
> On Wed, Aug 28, 2019 at 8:31 PM Jim Lin via Phabricator
> <reviews at reviews.llvm.org> wrote:
> >
> > Jim added a comment.
> >
> > @lenary
> > If any arguments are passed on the stack, tail call optimization is not
> > allowed, because the caller allocates stack space for the arguments and
> > must free it after the call finishes (the call has to return to the
> > caller so the stack can be freed).
> >
> > The only difference when passing varargs is that a 2xXLen argument must
> > be assigned to an 'even' (aligned) register pair (8-byte alignment for
> > RV32 or 16-byte alignment for RV64).
> > So a call to a function with varargs can be tail-call-optimised as long
> > as no arguments are passed via the stack.
> >
> >
> > Repository:
> >   rL LLVM
> >
> > CHANGES SINCE LAST ACTION
> >   https://reviews.llvm.org/D66278/new/
> >
> > https://reviews.llvm.org/D66278
> >
> >
> >
>