[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

Eli Friedman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 23 13:26:49 PDT 2020


efriedma added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {
----------------
avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not enough that the call you're trying to TRE is marked "tail"?
> > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > 
> > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
> > > > > > > But marking, done in markTails(), looks like separate tasks. i.e. it is better 
> > > > > > > to make TRE not depending on markTails(). There is a review for this - https://reviews.llvm.org/D60031
> > > > > > > Thus such separation looks useful(To not reuse result of markTails but have it computed inplace).
> > > > > > > 
> > > > > > > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > > > > It also should be checked that other calls does not capture pointer to local stack: 
> > > > > > > 
> > > > > > > ```
> > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > >    return false;
> > > > > > > 
> > > > > > > ```
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked "Tail". It also should be checked that other calls does not capture pointer to local stack:
> > > > > > 
> > > > > > If there's an escaped pointer to the local stack, we wouldn't infer "tail" in the first place, would we?
> > > > > If function receives pointer to alloca then it would not be marked with "Tail". Then we do not have a possibility to understand whether this function receives pointer to alloca but does not capture it:
> > > > > 
> > > > > ```
> > > > > void test(int recurseCount)
> > > > > {
> > > > >     if (recurseCount == 0) return;
> > > > >     int temp = 10;
> > > > >     globalIncrement(&temp);
> > > > >     test(recurseCount - 1);
> > > > > }
> > > > > ```
> > > > > 
> > > > > test - marked with Tail.
> > > > > globalIncrement - not marked with Tail. But TRE could be done since it does not capture pointer. But if it will capture the pointer then we could not do TRE. So we need to check !Tracker.EscapePoints.empty().
> > > > > 
> > > > > 
> > > > > 
> > > > > test - marked with Tail.
> > > > 
> > > > For the given code, TRE won't mark the recursive call "tail".  That transform isn't legal: the recursive call could access the caller's version of "temp".
> > > >For the given code, TRE won't mark the recursive call "tail". That transform isn't legal: the recursive call could access the caller's version of "temp".
> > > 
> > > it looks like recursive call could NOT access the caller's version of "temp":
> > > 
> > > ```
> > > test(recurseCount - 1);
> > > ```
> > > 
> > > Caller`s version of temp is accessed by non-recursive call:
> > > 
> > > ```
> > > globalIncrement(&temp);
> > > ```
> > > 
> > > If globalIncrement does not capture the "&temp" then TRE looks to be legal for that case. 
> > > 
> > > globalIncrement() would not be marked with "Tail". test() would be marked with Tail.
> > > 
> > > Thus the pre-requisite for TRE would be: tail-recursive call must not receive pointer to local stack(Tail) and non-recursive calls must not capture the pointer to local stack.
> > Can you give a complete IR example where we infer "tail", but TRE is illegal?
> > 
> > Can you give a complete IR example, we we don't infer "tail", but we still do the TRE transform here?
> >Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> there is no such example. Currently all cases where we  infer "tail" would be valid for TRE.
> 
> >Can you give a complete IR example, we we don't infer "tail", but we still do the TRE transform here?
> 
> For the following example current code base would not infer "tail" for _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> This patch changes this behavior: so that if _Z15globalIncrementPKi is not marked with "tail" and does not capture its pointer argument - TRE would be allowed for _Z4testi.
> 
> 
> ```
> @count = dso_local local_unnamed_addr global i32 0, align 4
> 
> ; Function Attrs: nofree noinline norecurse nounwind uwtable
> define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
> entry:
>   %0 = load i32, i32* %param, align 4
>   %1 = load i32, i32* @count, align 4
>   %add = add nsw i32 %1, %0
>   store i32 %add, i32* @count, align 4
>   ret void
> }
> 
> ; Function Attrs: nounwind uwtable
> define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
> entry:
>   %temp = alloca i32, align 4
>   %cmp = icmp eq i32 %recurseCount, 0
>   br i1 %cmp, label %return, label %if.end
> 
> if.end:                                           ; preds = %entry
>   %0 = bitcast i32* %temp to i8*
>   call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
>   store i32 10, i32* %temp, align 4
>   call void @_Z15globalIncrementPKi(i32* nonnull %temp)
>   %sub = add nsw i32 %recurseCount, -1
>   call void @_Z4testi(i32 %sub)
>   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
>   br label %return
> 
> return:                                           ; preds = %entry, %if.end
>   ret void
> }
> 
> ; Function Attrs: argmemonly nounwind willreturn
> declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
> 
> ; Function Attrs: argmemonly nounwind willreturn
> declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
> 
> attributes #0 = { nofree noinline norecurse nounwind uwtable }
> attributes #1 = { nounwind uwtable }
> attributes #2 = { argmemonly nounwind willreturn }
> 
> ```
In your example, we don't infer "tail" for globalIncrement... but we do infer it for the recursive call to test().  I'm suggesting you could just check for "tail" on test(), instead of using AllocaDerivedValueTracker.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085





More information about the cfe-commits mailing list