[PATCH] D99227: [Coroutine][Clang] Force emit lifetime intrinsics for Coroutines
Xun Li via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 23 16:53:46 PDT 2021
lxfind created this revision.
Herald added subscribers: ChuanqiXu, hoy, modimo, wenlei.
lxfind requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available.
Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame).
The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame.
In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap.
Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime.
To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point.
The following is a common code pattern called "Symmetric Transfer" in coroutine:
auto tmp = await_suspend();
__builtin_coro_resume(tmp.address());
return;
In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine.
During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards.
However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape.
To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived.
I made some attempt in D98638 <https://reviews.llvm.org/D98638>, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines.
Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D99227
Files:
clang/lib/CodeGen/CGDecl.cpp
clang/lib/CodeGen/CGExpr.cpp
clang/lib/CodeGen/CodeGenFunction.cpp
clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
Index: clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
===================================================================
--- clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
+++ clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fcoroutines-ts -std=c++14 -O1 -emit-llvm %s -o - -disable-llvm-passes | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fcoroutines-ts -std=c++14 -O0 -emit-llvm %s -o - -disable-llvm-passes | FileCheck %s
#include "Inputs/coroutine.h"
@@ -50,8 +50,13 @@
// check that the lifetime of the coroutine handle used to obtain the address is contained within single basic block, and hence does not live across suspension points.
// CHECK-LABEL: final.suspend:
-// CHECK: %[[PTR1:.+]] = bitcast %"struct.std::experimental::coroutines_v1::coroutine_handle.0"* %[[ADDR_TMP:.+]] to i8*
-// CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 8, i8* %[[PTR1]])
-// CHECK: call i8* @{{.*address.*}}(%"struct.std::experimental::coroutines_v1::coroutine_handle.0"* {{[^,]*}} %[[ADDR_TMP]])
-// CHECK-NEXT: %[[PTR2:.+]] = bitcast %"struct.std::experimental::coroutines_v1::coroutine_handle.0"* %[[ADDR_TMP]] to i8*
-// CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 8, i8* %[[PTR2]])
+// CHECK: %{{.+}} = call token @llvm.coro.save(i8* null)
+// CHECK: %[[HDL_CAST1:.+]] = bitcast %"struct.std::experimental::coroutines_v1::coroutine_handle.0"* %[[HDL:.+]] to i8*
+// CHECK: call void @llvm.lifetime.start.p0i8(i64 8, i8* %[[HDL_CAST1]])
+// CHECK: %[[CALL:.+]] = call i8* @_ZN13detached_task12promise_type13final_awaiter13await_suspendENSt12experimental13coroutines_v116coroutine_handleIS0_EE(
+// CHECK: %[[HDL_CAST2:.+]] = getelementptr inbounds %"struct.std::experimental::coroutines_v1::coroutine_handle.0", %"struct.std::experimental::coroutines_v1::coroutine_handle.0"* %[[HDL]], i32 0, i32 0
+// CHECK: store i8* %[[CALL]], i8** %[[HDL_CAST2]], align 8
+// CHECK: %[[HDL_TRANSFER:.+]] = call i8* @_ZNKSt12experimental13coroutines_v116coroutine_handleIvE7addressEv(%"struct.std::experimental::coroutines_v1::coroutine_handle.0"* nonnull dereferenceable(8) %[[HDL]])
+// CHECK: %[[HDL_CAST3:.+]] = bitcast %"struct.std::experimental::coroutines_v1::coroutine_handle.0"* %[[HDL]] to i8*
+// CHECK: call void @llvm.lifetime.end.p0i8(i64 8, i8* %[[HDL_CAST3]])
+// CHECK: call void @llvm.coro.resume(i8* %[[HDL_TRANSFER]])
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1313,7 +1313,8 @@
// Initialize helper which will detect jumps which can cause invalid lifetime
// markers.
- if (Body && ShouldEmitLifetimeMarkers)
+ // Coroutines always emit lifetime markers.
+ if (Body && (ShouldEmitLifetimeMarkers || isa<CoroutineBodyStmt>(Body)))
Bypasses.Init(Body);
// Emit the standard function prologue.
Index: clang/lib/CodeGen/CGExpr.cpp
===================================================================
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -535,7 +535,8 @@
break;
case SD_FullExpression: {
- if (!ShouldEmitLifetimeMarkers)
+ // Coroutine relies on lifetime markers to properly place data.
+ if (!ShouldEmitLifetimeMarkers && !isCoroutine())
break;
// Avoid creating a conditional cleanup just to hold an llvm.lifetime.end
Index: clang/lib/CodeGen/CGDecl.cpp
===================================================================
--- clang/lib/CodeGen/CGDecl.cpp
+++ clang/lib/CodeGen/CGDecl.cpp
@@ -1317,7 +1317,8 @@
/// otherwise
llvm::Value *CodeGenFunction::EmitLifetimeStart(uint64_t Size,
llvm::Value *Addr) {
- if (!ShouldEmitLifetimeMarkers)
+ // Coroutine relies on lifetime markers to properly place data.
+ if (!ShouldEmitLifetimeMarkers && !isCoroutine())
return nullptr;
assert(Addr->getType()->getPointerAddressSpace() ==
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D99227.332826.patch
Type: text/x-patch
Size: 4209 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20210323/2e7b3f4a/attachment.bin>
More information about the cfe-commits
mailing list