[PATCH] D22659: [coroutines] Part 2 of N: Adding Coroutine Intrinsics

Gor Nishanov via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 24 08:23:07 PDT 2016


GorNishanov added inline comments.

================
Comment at: include/llvm/IR/Intrinsics.td:606-607
@@ +605,4 @@
+def int_coro_alloc : Intrinsic<[llvm_ptr_ty], [], [IntrNoMem]>;
+def int_coro_begin : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i32_ty,
+                                llvm_ptr_ty, llvm_ptr_ty], []>;
+
----------------
majnemer wrote:
> What happens if there are two calls to llvm.coro.begin in a function?
**TLDR: **
  coro.begin(*,*,*, null) - must have exactly one, broken IR otherwise
  coro.begin(*,*,*, not-null) - can have as many or as little as inliner wants

**Longer explanation:**

The last argument of coro.begin is null when coming out of the frontend. Later, CoroSplit pass sets it to a pointer to a constant array containing pointers to outlined parts of the coroutine. Let's call a coro.begin with null "pre-slit coro.begin", and the one with a pointer to const array, "post-split coro.begin".

Coming out of the frontend, a coroutine has exactly one coro.begin. (Verifier can check for that). Later inliner can inline calls to other coroutines and it may bring more coro.begins into a function, but, those, will be always post-split, since CoroSplit pass runs together with the Inliner in the same CGPassManager.

CoroSplit pass only cares about pre-split coro.begin and there is exactly one. Other coro.begins are there for the benefit of CoroElide pass that devirtualizes and elides heap allocations where possible.

All intrinsics that stay in the coroutine after CoroSplit pass (coro.alloc and coro.free) are linked to coro.begin, so CoroElide pass knows which coroutine they belong to.

================
Comment at: include/llvm/IR/Intrinsics.td:609
@@ +608,3 @@
+
+def int_coro_free : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty], [IntrNoMem]>;
+def int_coro_end : Intrinsic<[], [llvm_ptr_ty, llvm_i1_ty], []>;
----------------
majnemer wrote:
> I don't know if `IntrNoMem` is right here but `NoCapture<0>` is probably fine.
Ack on NoCapture<0>. 
With respect to IntrNoMem my thinking was:

coro.free(%hdl) can be lowered in one of three ways:

1. null - if CoroElide pass elides dynamic allocations
2. %hdl - if coroutine handle points to the beginning of the memory block
3. gep on %hdl - that adjusts the pointer to point to the beginning of the memory block.

Wait, you are right, gep may not be a constant gep, if some of the allocas in the coroutine have alignment requirements that exceed the alignment guarantees provided by the allocation function (second parameter to coro.begin), then, indeed, coro.free would have to read some adjustment value from the memory pointed by handle to do the correct adjustment.

Okay, then, 
```
[IntrReadMem, NoCapture<0>]
```
it is.

================
Comment at: include/llvm/IR/Intrinsics.td:618-620
@@ +617,5 @@
+
+def int_coro_param : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_ptr_ty],
+                               [IntrReadMem, ReadNone<0>, NoCapture<0>,
+                                ReadNone<1>,NoCapture<1>]>;
+
----------------
majnemer wrote:
> How is coro.param sensitive to non-parameter stores?
To double check:

You are asking why [IntrReadMem] and not [IntrNoMem]?

I think it is safe to mark it [IntrNoMem].
It is always lowered to a Boolean constant, and the only purpose for the first and second argument is to tie those two allocas together, in case, I want to elide parameter copy and need to RAUW one with another.

================
Comment at: include/llvm/IR/Intrinsics.td:627-629
@@ +626,5 @@
+def int_coro_done : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty], [IntrReadMem]>;
+def int_coro_promise : Intrinsic<[llvm_ptr_ty],
+                                 [llvm_ptr_ty, llvm_i32_ty, llvm_i1_ty],
+                                 [IntrNoMem]>;
+
----------------
majnemer wrote:
> NoMem seems wrong, what is your rationale?
> I think you can nocapture the pointer parameter.
Ack on NoCapture<0>.

It is NoReadMem, since it is always lowered to a constant gep.

================
Comment at: include/llvm/IR/Intrinsics.td:633-634
@@ +632,4 @@
+
+def int_coro_subfn_addr : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i8_ty],
+                                    [IntrReadMem]>;
+
----------------
majnemer wrote:
> Is it possible for optimizations to come across this intrinsic.  If so, what semantics does it have?
The life of coro.subfn.addr looks like this:

CoroEarly replaces:

```
   call i8* llvm.coro.resume(i8* %Arg)
```
with
```
   %0 = call i8* llvm.coro.subfn.addr(i8* %Arg, i8 0)
   %1 = bitcast i8* %0 to void(i8*)*
   call fastcc void %1 (i8* %Arg)
```

CoroElide may replace coro.subfn.addr with a f.resume function pointer, so it will be constant folded to:

```
   call fastcc void @f.resume (i8* %Arg)
```

If CoroElide wasn't able to do devirtualize it, it will stay intact until CoroCleanup pass run at EP_OptimizerLast. CoroCleanup will replace it with: gep + load to get the address of the appropriate resume function at runtime.

0 - f.resume
1 - f.destroy

P.S.

Frontend does not emit coro.subfn.addr directly, as we would like to hide the details of the calling convention for the coroutine sub-functions.



Repository:
  rL LLVM

https://reviews.llvm.org/D22659





More information about the llvm-commits mailing list