<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Jun 11, 2016 at 5:09 PM, Gor Nishanov <span dir="ltr"><<a href="mailto:gornishanov@gmail.com" target="_blank">gornishanov@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Eli:<br>
<span><br>
>> coro.barrier() doesn't work: if the address of the alloca doesn't escape,<br>
>> alias analysis will assume the barrier can't read or write the value of<br>
>> the alloca, so the barrier doesn't actually block code movement.<br>
<br>
</span>Got it. I am new to this and learning a lot over the course<br>
of this thread. Thank you for being patient with me.<br>
<br>
Two questions and one clarification:<br>
<br>
Q1: Do we have to have a load here?<br>
===================================<br>
<br>
>> block1:<br>
>> %first_time = load... <--- What are we loading here?<br></blockquote><div><br></div><div>Just a local alloca, initialized to false and set to true in the return block.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span>>> br i1 %first_time, label return, label suspend1<br>
>><br>
</span>>> suspend1:<br>
>> %0 = coro.suspend()<br>
>> switch %0 (resume1, destroy1)<br>
<br>
Can we use three way coro.suspend instead?<br>
<br>
Block1:<br>
%0 = call i8 coro.suspend()<br>
switch i8 %0, label suspend1 [i8 0 %return] ; or icmp + br i1<br>
Suspend1:<br>
switch i8 %0, label %resume1 [i8 1 %destroy1] ; or icmp + br i1<br></blockquote><div><br></div><div>This doesn't look right: intuitively the suspend happens after the return block runs.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
One problem I can see is that someone can write a pass that might merge<br>
two branches / switches into one switch and we are back where we were.<br>
I guess what you meant by load, is to call some coro.is.first.time() intrinsic.<br>
So it looks like:<br>
<br>
>> block1:<br>
>> %first_time = call i1 coro.is.first.time()<br>
<span>>> br i1 %first_time, label return, label suspend1<br>
>><br>
</span>>> suspend1:<br>
>> %0 = coro.suspend()<br>
>> switch %0 (resume1, destroy1)<br>
<br>
This looks fine, there may be more uses for this intrinsic in the frontend.<br>
Killing two birds with one stone. Good.<br></blockquote><div><br></div><div>It doesn't really matter whether the bit gets tracked in an alloca or through intrinsics.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Question 2: Why the switch in the return block?<br>
===============================================<br>
<br>
I would think that **pre-split** return block would be simply:<br>
<br>
return:<br>
<run dtors for parameters, if required><br>
<conversion ops for ret value, if required><br>
<ret void> or <ret whatever><br>
<br>
Where and why should I put the switch that you mentioned in this return block?<br>
<br>
BTW, I am speaking of the return block as if it is one block,<br>
but, it could be a dominating block over all the blocks that together<br>
run the destructors, do return value conversion, etc.<br></blockquote><div><br></div><div>The best way to be sure the compiler will understand the control flow is if the coroutine acts like a normal function. Another way to put it is that it should be possible to lower a coroutine to a thread rather than performing the state machine transformation. <br><br></div><div>The switch answers the question of where the control flow actually goes after the return block runs. Under normal function semantics, the "return" block doesn't actually return: it just performs the one-time operations, then jumps back to the suspend call. Therefore, you can't use "ret" there; you have to connect the control flow back to the correct suspend call. The switch makes that connection. So the return block looks like this:<br><br> <run dtors for parameters, if required><br>
<conversion ops for ret value, if required><br></div><div> call coro.first_time_ret_value(value) ; CoroSplit replaces this with a ret<br></div><div> switch ... ; jump to suspend; this is always dead in the lowered version<br><br></div><div>The dead switch is there so the optimizer will understand the control flow.<br><br></div><div>And yes, this would be much more straightforward with a two-function approach.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Clarification:<br>
==============<br>
<span><br>
>> Also, if some non-C++ language wants to generate coroutines,<br>
>> it might not have to generate the return block at all.<br>
<br>
</span>C++ coroutines are flexible. The semantics of a coroutine are defined via<br>
traits, so you may define a coroutine that returns void. It does not have<br>
to return a coroutine handle or some struct that wraps the coroutine handle.<br></blockquote><div><br></div><div>Oh, okay. I haven't actually looked at the spec; I'm mostly just going off your description of what it does.<br></div><div><br></div><div>-Eli<br></div></div></div></div>
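<div>A minimal sketch of the "lower a coroutine to a thread" view described above, written in Python rather than LLVM IR. Everything here is an assumption made for illustration: <code>ThreadCoroutine</code>, <code>suspend</code>, <code>resume</code>, <code>destroy</code> and <code>run_demo</code> are hypothetical names, not LLVM intrinsics or APIs. It only shows the control flow: on the first suspend the one-time "return block" runs, and control then continues into the suspend itself instead of leaving the function.</div>

```python
import threading


class ThreadCoroutine:
    """Hypothetical helper (not an LLVM API): runs a coroutine body on a thread."""

    def __init__(self, body):
        self._resume = threading.Semaphore(0)   # caller signals the coroutine
        self._yielded = threading.Semaphore(0)  # coroutine signals the caller
        self._first_time = True   # the one-bit flag discussed in the thread
        self._destroy = False
        self.log = []
        self._thread = threading.Thread(target=self._run, args=(body,))
        self._thread.start()
        # Block until the first suspend; this models the initial call returning.
        self._yielded.acquire()

    def _run(self, body):
        body(self)
        self._yielded.release()  # body finished; wake the caller one last time

    def suspend(self):
        """Return True if resumed, False if destroyed."""
        if self._first_time:
            # The "return block": one-time work, then fall through to the suspend.
            self._first_time = False
            self.log.append("return block")
        self._yielded.release()   # hand control back to the caller
        self._resume.acquire()    # wait here until resume() or destroy()
        return not self._destroy

    def resume(self):
        self._resume.release()
        self._yielded.acquire()

    def destroy(self):
        self._destroy = True
        self._resume.release()
        self._yielded.acquire()

    def join(self):
        self._thread.join()


def body(co):
    # Stand-in for block1 / suspend1 / resume1 / destroy1 from the discussion.
    co.log.append("start")
    if co.suspend():
        co.log.append("resume1")
    else:
        co.log.append("destroy1")


def run_demo(destroy=False):
    co = ThreadCoroutine(body)
    if destroy:
        co.destroy()
    else:
        co.resume()
    co.join()
    return co.log
```

<div>Calling <code>run_demo()</code> takes the resume path and <code>run_demo(destroy=True)</code> takes the destroy path. In both cases the one-time block is logged before the first handoff to the caller, which is the ordering the dead switch is meant to make visible to the optimizer.</div>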