[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

Eli Friedman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 8 12:08:24 PDT 2020


efriedma added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function &F) {
   // Because of PR962, we don't TRE dynamic allocas.
----------------
avl wrote:
> efriedma wrote:
> > If we're not going to try to do TRE at all on calls not marked "tail", we can probably drop this check.
> It looks to me that original idea(PR962) was to avoid inefficient code which is generated for dynamic alloca. 
> 
> Currently there would still be generated inefficient code:
> 
> Doing TRE for dynamic alloca requires correct stack adjustment to avoid extra stack usage. 
> i.e. dynamic stack reservation done for alloca should be restored 
> in the end of the current iteration. Current TRE implementation does not do this.
> 
> Please, consider the test case:
> 
> 
> ```
> #include <alloca.h>
> 
> int count;
> __attribute__((noinline)) void globalIncrement(const int* param) { 
> 	count += *param; 
> }
> 
> void test(int recurseCount)
> {
>     if (recurseCount == 0) return;
>     {
>     int *temp = (int*)alloca(100);
>     globalIncrement(temp);
>     }
>     test(recurseCount - 1);
> }
> 
> 
> ```
> Following is the x86 asm generated for the above test case in assumption that dynamic allocas are possible:
> 
> ```
> 
> .LBB1_2:
>         movq    %rsp, %rdi
>         addq    $-112, %rdi   <<<<<<<<<<<<<< dynamic stack reservation, need to be restored before "jne .LBB1_2"
>         movq    %rdi, %rsp
>         callq   _Z15globalIncrementPKi
>         addl    $-1, %ebx
>         jne     .LBB1_2
> ```
> 
> So, it looks like we still have inefficient code here and it was a reason for avoiding TRE.
I guess we can leave this for a later patch.

This isn't really any worse than the stack usage before TRE, assuming we can't emit a sibling call in the backend.  And we could avoid this by making TRE insert stacksave/stackrestore intrinsics.  But better to do one thing at a time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085





More information about the cfe-commits mailing list