[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

Wed Jul 8 08:59:03 PDT 2020

avl marked 3 inline comments as done.
avl added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function &F) {
   // Because of PR962, we don't TRE dynamic allocas.
----------------
efriedma wrote:
> If we're not going to try to do TRE at all on calls not marked "tail", we can probably drop this check.
It looks to me that original idea(PR962) was to avoid inefficient code which is generated for dynamic alloca. 

Currently there would still be generated inefficient code:

Doing TRE for dynamic alloca requires correct stack adjustment to avoid extra stack usage. 
i.e. dynamic stack reservation done for alloca should be restored 
in the end of the current iteration. Current TRE implementation does not do this.

Please, consider the test case:

```
#include <alloca.h>

int count;
__attribute__((noinline)) void globalIncrement(const int* param) { 
	count += *param; 
}

void test(int recurseCount)
{
    if (recurseCount == 0) return;
    {
    int *temp = (int*)alloca(100);
    globalIncrement(temp);
    }
    test(recurseCount - 1);
}

```
Following is the x86 asm generated for the above test case in assumption that dynamic allocas are possible:

```

.LBB1_2:
        movq    %rsp, %rdi
        addq    $-112, %rdi   <<<<<<<<<<<<<< dynamic stack reservation, need to be restored before "jne .LBB1_2"
        movq    %rdi, %rsp
        callq   _Z15globalIncrementPKi
        addl    $-1, %ebx
        jne     .LBB1_2
```

So, it looks like we still have inefficient code here and it was a reason for avoiding TRE.

================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:808
   // Until this is resolved, disable this transformation if that would ever
   // happen.  This bug is PR962.
   for (Function::iterator BBI = F.begin(), E = F.end(); BBI != E; /*in loop*/) {
----------------
efriedma wrote:
> Can you move this FIXME into a more appropriate spot?
OK.

================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+        II->getIntrinsicID() == Intrinsic::assume)
+      return true;
+
----------------
efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > What is the new handling for lifetime.end/assume doing?
> > They are just skipped. In following test case:
> > 
> > 
> > ```
> >   call void @_Z5test5i(i32 %sub)
> >   call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
> >   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
> >   br label %return
> > 
> > ```
> > 
> > they are generated in between call and ret. It is safe to ignore them while checking whether transformation is possible.
> It makes sense we can ignore lifetime.end on an alloca: we know the call doesn't refer to the alloca.  (Maybe we should check that the pointer argument is pointing at an alloca?  That should usually be true anyway, but better to be on the safe side, I guess.)
> 
> I don't think it's safe to hoist assume without additional checks; I think we'd need to check that the call is marked "willreturn"?
> 
> Since this is sort of tricky, I'd prefer to split this off into a followup.
>It makes sense we can ignore lifetime.end on an alloca: we know the call doesn't refer to the alloca. (Maybe we should check that the pointer argument is pointing at an alloca? That should usually be true anyway, but better to be on the safe side, I guess.)

OK, I would add checking that the pointer argument of lifetime.end is pointing to an alloca.

>I don't think it's safe to hoist assume without additional checks; I think we'd need to check that the call is marked "willreturn"?

>Since this is sort of tricky, I'd prefer to split this off into a followup.

Ok, I would split Intrinsic::assume into another review.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085