[PATCH] D136285: Bad optimization with alloca and intrinsic function stackrestore

Jamie Schmeiser via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 27 13:23:00 PDT 2022


jamieschmeiser added a comment.

I have verified that adding inalloca and removing the tail marker from the stackrestore call in my sample IR fixes the problem.

I am still unclear about inalloca: where it actually gets specified and how it is used.  I am continuing to study this.  I am also unclear why some of the comments on the link (which does seem to be related) seem to indicate that there may not be a bug here (other than the tail call on stackrestore).  There is an observable change in behaviour caused by memcpyopt, and by simplifycfg when it extends object lifetime by collapsing blocks (see my reply to the comment in the testcase).



================
Comment at: llvm/test/Transforms/MemCpyOpt/stackrestore.ll:97
+  %SS = tail call ptr @llvm.stacksave()
+  %A2 = alloca [56 x i8], align 4
+  store i8 1, ptr %A2, align 4
----------------
rnk wrote:
> My expectation is that all the allocas here are static: They are part of the entry block, they will be part of the initial stack allocation, they will not be affected by stacksave/restore. This does mean that simplifycfg can extend the lifetime of a stack allocation, but to my knowledge, that's valid, the program should have the same observable behavior.
When simplifycfg extends the lifetime of a stack allocation, the observable behaviour does change.  As pointed out in previous comments, it seems that stackrestore should not be marked tail, so consider the test IR with tail removed from the stackrestore call and with the code moved out of the entry block.

```
define dso_local signext i32 @test() {
associate006_entry:
  br label %b1

b1:
  %A1 = alloca [56 x i8], align 8
  %SS = tail call ptr @llvm.stacksave()
  %A2 = alloca [56 x i8], align 4
  store i8 1, ptr %A2, align 4
  %GEP1 = getelementptr inbounds i8, ptr %A2, i32 8
  store i8 1, ptr %GEP1, align 4
  %GEP2 = getelementptr inbounds i8, ptr %A2, i32 12
  store i8 1, ptr %GEP2, align 4
  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 8 dereferenceable(56) %A1, ptr noundef nonnull align 4 dereferenceable(56) %A2, i32 56, i1 false)
  call void @llvm.stackrestore(ptr %SS)
  %A3 = alloca [56 x i8], align 4
  %uglygep123 = getelementptr i8, ptr %A3, i32 0
  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(56) %uglygep123, ptr noundef nonnull align 8 dereferenceable(56) %A1, i32 56, i1 false)
  ret i32 0
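}

; Declarations for the intrinsics used above, so the snippet parses on its own.
declare ptr @llvm.stacksave()
declare void @llvm.stackrestore(ptr)
declare void @llvm.memcpy.p0.p0.i32(ptr, ptr, i32, i1)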
```
These allocas will not pass the isStaticAlloca query and, because tail has been removed from the stackrestore call, memcpyopt will not perform any optimization.  If simplifycfg runs first, the two blocks are collapsed into one, which becomes the entry block, and memcpyopt will then do the optimization, changing the observable behaviour.
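
For reference, here is a hand-written sketch (not actual pass output) of roughly what the function looks like once simplifycfg has folded b1 into the entry block.  The surviving block name and the exact instruction order are assumptions.  The point is only that all three allocas now sit in the entry block with a constant size, so each should satisfy isStaticAlloca, and memcpyopt sees the same instructions as before but no longer treats the allocas as dynamic.

```
; Hand-written sketch of the collapsed form; not produced by running simplifycfg.
define dso_local signext i32 @test() {
associate006_entry:
  ; b1 has been merged in, so every alloca below is now in the entry block.
  %A1 = alloca [56 x i8], align 8
  %SS = tail call ptr @llvm.stacksave()
  %A2 = alloca [56 x i8], align 4
  store i8 1, ptr %A2, align 4
  %GEP1 = getelementptr inbounds i8, ptr %A2, i32 8
  store i8 1, ptr %GEP1, align 4
  %GEP2 = getelementptr inbounds i8, ptr %A2, i32 12
  store i8 1, ptr %GEP2, align 4
  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 8 dereferenceable(56) %A1, ptr noundef nonnull align 4 dereferenceable(56) %A2, i32 56, i1 false)
  ; stackrestore is unchanged, but %A2 and %A3 are now classified as static allocas.
  call void @llvm.stackrestore(ptr %SS)
  %A3 = alloca [56 x i8], align 4
  %uglygep123 = getelementptr i8, ptr %A3, i32 0
  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(56) %uglygep123, ptr noundef nonnull align 8 dereferenceable(56) %A1, i32 56, i1 false)
  ret i32 0
}

; Intrinsic declarations so the sketch parses on its own.
declare ptr @llvm.stacksave()
declare void @llvm.stackrestore(ptr)
declare void @llvm.memcpy.p0.p0.i32(ptr, ptr, i32, i1)
```

The only difference from the two-block version is where the allocas live, which is why the order in which simplifycfg and memcpyopt run matters here.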


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136285/new/

https://reviews.llvm.org/D136285


