[llvm-dev] Function calls keep increasing the stack usage

palpar via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 14 12:20:32 PDT 2018


Thanks for checking. I suppose it may have been fixed then; I don't have
the latest version to try it now.
I'm curious what could have fixed it, because X86FastISel::fastLowerCall()
still has the calls to getRegForValue() (or maybe that's not the problem).

On Fri, Sep 14, 2018 at 10:02 PM David Blaikie <dblaikie at gmail.com> wrote:

> Still not seeing it on ToT, so maybe it's been fixed?
>
> $ clang-tot -cc1 -S -triple i386-pc-win32 stack.c
> ...
> _bar:
>         subl    $16, %esp
>         movl    $1, (%esp)
>         movl    $2, 4(%esp)
>         calll   _foo
>         movl    $3, (%esp)
>         movl    $4, 4(%esp)
>         movl    %eax, 12(%esp)
>         calll   _foo
>         movl    %eax, 8(%esp)
>         addl    $16, %esp
>         retl
>
> $ clang-tot --version
> clang version 8.0.0 (trunk 342200) (llvm/trunk 342202)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
>
>
>
>
> On Fri, Sep 14, 2018 at 11:57 AM palpar <palparni at gmail.com> wrote:
>
>> Sorry I missed that important detail. The relevant part of the command
>> line is:
>> -cc1 -S -triple i386-pc-win32
>> I don't expect it matters if it's for Windows or Linux in this case.
>>
>> On Fri, Sep 14, 2018 at 9:16 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>> Can't say I've observed that behavior (though I'm just building from
>>> top-of-tree rather than 6.0, compiling for x86-64 on linux), perhaps you
>>> could provide more detail (what target are you compiling for - possibly
>>> provide the -cc1 command line, etc).
>>>
>>> bar:                                    # @bar
>>>         .cfi_startproc
>>> # %bb.0:                                # %entry
>>>         pushq   %rbp
>>>         .cfi_def_cfa_offset 16
>>>         .cfi_offset %rbp, -16
>>>         movq    %rsp, %rbp
>>>         .cfi_def_cfa_register %rbp
>>>         subq    $16, %rsp
>>>         movl    $1, %edi
>>>         movl    $2, %esi
>>>         callq   foo
>>>         movl    $3, %edi
>>>         movl    $4, %esi
>>>         movl    %eax, -4(%rbp)          # 4-byte Spill
>>>         callq   foo
>>>         movl    %eax, -8(%rbp)          # 4-byte Spill
>>>         addq    $16, %rsp
>>>         popq    %rbp
>>>         .cfi_def_cfa %rsp, 8
>>>         retq
>>>
>>>
>>> Or on 32-bit X86:
>>>
>>> bar:                                    # @bar
>>>         .cfi_startproc
>>> # %bb.0:                                # %entry
>>>         pushq   %rbp
>>>         .cfi_def_cfa_offset 16
>>>         .cfi_offset %rbp, -16
>>>         movq    %rsp, %rbp
>>>         .cfi_def_cfa_register %rbp
>>>         subq    $16, %rsp
>>>         movl    $1, %edi
>>>         movl    $2, %esi
>>>         callq   foo
>>>         movl    $3, %edi
>>>         movl    $4, %esi
>>>         movl    %eax, -4(%rbp)          # 4-byte Spill
>>>         callq   foo
>>>         movl    %eax, -8(%rbp)          # 4-byte Spill
>>>         addq    $16, %rsp
>>>         popq    %rbp
>>>         .cfi_def_cfa %rsp, 8
>>>         retq
>>>
>>>
>>> On Fri, Sep 14, 2018 at 8:16 AM palpar via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I found that LLVM generates redundant code when calling functions with
>>>> constant parameters, with optimizations disabled.
>>>>
>>>> Consider the following C code snippet:
>>>>
>>>> int foo(int x, int y);
>>>>
>>>> void bar()
>>>> {
>>>>     foo(1, 2);
>>>>     foo(3, 4);
>>>> }
>>>>
>>>> Clang/LLVM 6.0 generates the following assembly code:
>>>> _bar:
>>>> subl $32, %esp
>>>> movl $1, %eax
>>>> movl $2, %ecx
>>>> movl $1, (%esp)
>>>> movl $2, 4(%esp)
>>>> movl %eax, 28(%esp)
>>>> movl %ecx, 24(%esp)
>>>> calll _foo
>>>> movl $3, %ecx
>>>> movl $4, %edx
>>>> movl $3, (%esp)
>>>> movl $4, 4(%esp)
>>>> movl %eax, 20(%esp)
>>>> movl %ecx, 16(%esp)
>>>> movl %edx, 12(%esp)
>>>> calll _foo
>>>> movl %eax, 8(%esp)
>>>> addl $32, %esp
>>>> retl
>>>>
>>>> Note how the constants are loaded into registers, yet the immediate
>>>> values are used when storing the parameters on the stack for the call. The
>>>> registers are then spilled to the stack, probably because preserving them
>>>> across the call is the caller's responsibility (which seems expected).
>>>> I think the problem is that LLVM unconditionally allocates a register
>>>> for each parameter value, regardless of whether it is used later.
>>>> If the program's stack is sufficiently large this is probably not a
>>>> problem, but otherwise a large number of such calls, even non-recursive
>>>> ones, can lead to stack overflow. Do you think I should create a bug
>>>> report for this?
>>>>
>>>> (Similarly, the return value of the function need not be saved, but the
>>>> LLVM IR that Clang generates assigns the result of each call, so at this
>>>> point LLVM couldn't possibly know.
>>>> define void @bar() #0 {
>>>>   %call = call i32 @foo(i32 1, i32 2)
>>>>   %call1 = call i32 @foo(i32 3, i32 4)
>>>>   ret void
>>>> }
>>>> )
>>>>
>>>> Thanks,
>>>> Alpar