[PATCH] D80951: [GlobalOpt] Remove preallocated calls when possible

Tue Jun 16 18:06:01 PDT 2020

efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM

================
Comment at: llvm/lib/Transforms/IPO/GlobalOpt.cpp:2334
+
+    // FIXME: This doesn't handle invoke
+    Builder.SetInsertPoint(NewCB->getNextNonDebugInstruction());
----------------
aeubanks wrote:
> efriedma wrote:
> > "This doesn't handle invoke" is rough...
> > 
> > In the normal destination, you can stick a stackrestore as the first instruction (breaking the critical edge if necessary).  In the unwind destination, I'm not sure.  For Itanium-style unwinding, you could just stick it immediately after the landingpad, but I'm not sure what you can do on Windows; you can't stackrestore in a funclet.
> > 
> > I guess ultimately, we really want to use a static alloca (in the entry block) in most cases, but it's not safe in all cases, as we've discussed before.
> > 
> > At the very least, this probably should bail out somehow, not crash.
> Bailed out for any invokes.
> 
> I've been trying to understand how inalloca handles this and still don't quite understand.
> 
> 
> ```
> void bar2() {
>  for (int i = 0; i < 1000000; ++i) {
>   try {
>    foo(A(), 5);
>   } catch (int) {}
>  }
> }
> ```
> 
> becomes
> 
> ```
> define dso_local void @"?bar2@@YAXXZ"() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
> entry:
>   %agg.tmp.ensured = alloca %struct.A, align 8
>   br label %for.body
> 
> for.cond.cleanup:                                 ; preds = %for.inc
>   ret void
> 
> for.body:                                         ; preds = %for.inc, %entry
>   %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
>   %inalloca.save = call i8* @llvm.stacksave()
>   %argmem = alloca inalloca <{ %struct.A*, %struct.A, i32 }>, align 4
>   %0 = getelementptr inbounds <{ %struct.A*, %struct.A, i32 }>, <{ %struct.A*, %struct.A, i32 }>* %argmem, i32 0, i32 1
>   %call = call x86_thiscallcc %struct.A* @"??0A@@QAE at XZ"(%struct.A* nonnull %0) #2
>   %1 = getelementptr inbounds <{ %struct.A*, %struct.A, i32 }>, <{ %struct.A*, %struct.A, i32 }>* %argmem, i32 0, i32 0
>   store %struct.A* %agg.tmp.ensured, %struct.A** %1, align 4
>   %2 = getelementptr inbounds <{ %struct.A*, %struct.A, i32 }>, <{ %struct.A*, %struct.A, i32 }>* %argmem, i32 0, i32 2
>   store i32 5, i32* %2, align 4, !tbaa !3
>   %call1 = invoke %struct.A* @"?foo@@YA?AUA@@U1 at H@Z"(<{ %struct.A*, %struct.A, i32 }>* inalloca nonnull %argmem)
>           to label %invoke.cont unwind label %catch.dispatch
> 
> catch.dispatch:                                   ; preds = %for.body
>   %3 = catchswitch within none [label %catch] unwind to caller
> 
> catch:                                            ; preds = %catch.dispatch
>   %4 = catchpad within %3 [%rtti.TypeDescriptor2* @"??_R0H at 8", i32 0, i8* null]
>   catchret from %4 to label %for.inc
> 
> for.inc:                                          ; preds = %invoke.cont, %catch
>   %inc = add nuw nsw i32 %i.05, 1
>   %exitcond = icmp eq i32 %inc, 1000000
>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
> 
> invoke.cont:                                      ; preds = %for.body
>   call void @llvm.stackrestore(i8* %inalloca.save)
>   call x86_thiscallcc void @"??1A@@QAE at XZ"(%struct.A* nonnull %agg.tmp.ensured) #2
>   br label %for.inc
> }
> ```
> 
> The `@llvm.stackrestore()` only gets called in the non-exceptional case, but I verified that this doesn't leak stack memory when `foo()` always throws. I'm not sure how that happens.
You're not testing what you want to test; the problem isn't when the function with the inalloca argument itself throws. The behavior is obvious in that case: the alloca gets consumed by the call.  The issue is what happens if an exception gets thrown before that.  Expanding out your example:

```
struct A { A() { throw 1; } ~A(); };
void foo(A a, int z); // Never gets called
void bar() {
 for (int i = 0; i < 1000000; ++i) {
  try {
   foo(A(), 5);
  } catch (int) {}
 }
}
```

I'm pretty sure with inalloca, the memory just leaks.  I guess preallocated has the same problem,

Not sure how we solve it with preallocated.  I guess in the backend, we need to apply some sort of adjustment after a catchret.  Not sure how we compute the size of that adjustment, though.  I guess we need to note on each catchret/cleanupret which allocations die at that point.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80951/new/

https://reviews.llvm.org/D80951