<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Nov 25, 2014 at 3:09 PM, Kaylor, Andrew <span dir="ltr"><<a href="mailto:andrew.kaylor@intel.com" target="_blank">andrew.kaylor@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


<div lang="EN-US" link="blue" vlink="purple">

<div><span class="">

<p class="MsoNormal">> We should also think about how to call std::terminate when cleanup dtors throw. The current representation for Itanium is inefficient. As a strawman, I propose making @__clang_call_terminate an intrinsic:<u></u><u></u></p>

</span><p class="MsoNormal">…<u></u><u></u></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">That sounds like a good starting point.<u></u><u></u></span></p><span class="">

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>


<p class="MsoNormal">> Chandler expressed strong concerns about this design, however, as @llvm.eh.get_capture_block adds an ordering constraint on CodeGen. Once you add this intrinsic, we *have* to do frame layout of @_Z13do_some_thingRi *before* we can emit

 code for all the callers of @llvm.eh.get_capture_block. Today, this is easy, because module order defines emission order, but in the great glorious future, codegen will hopefully be parallelized, and then we've inflicted this horrible constraint on the innocent.<u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">> His suggestion to break the ordering dependence was to lock down the frame offset of the capture block to always be some fixed offset known by the target (i.e. ebp - 4 on x86, if we like that).<span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>

</span><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Chandler probably has a better feel for this sort of thing than I do.  I can’t think of a reason offhand why that wouldn’t work, but it makes me a little nervous.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> </span> </p></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">What would that look like in the IR?  Would we use the same intrinsics and just lower them to use the known location?</span></p></div></div></blockquote><div><br></div><div>Chandler seems to be OK with get/set capture block, as long as the codegen ordering dependence can be removed. I think we can remove it by delaying the resolution of the frame offset to assembly time using an MCSymbolRef. It would look a lot like this kind of assembly:</div><div><br></div><div>my_handler:</div><div>  push %rbp</div><div>  mov %rsp, %rbp</div><div>  lea Lframe_offset0(%rdx), %rax ; This is now the parent capture block</div><div>  ...</div><div>  retq</div><div><br></div><div>parent_fn:</div><div>  push %rbp</div><div>  mov %rsp, %rbp<br></div><div>  push %rbx</div><div>  push %rdi<br></div><div>  subq $NN, %rsp</div><div>Lframe_offset0 = X + 2 * 8 ; Two CSRs plus some offset into the main stack allocation</div><div><br></div><div>I guess I'll try to make that work.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple">

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">I’ll think about this, but for now I’m happy to just proceed with the belief that it’s a solvable problem either way.<u></u><u></u></span></p><span class="">

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">>> For C++ exception handling, we need cleanup code that executes before the catch handlers and cleanup

 code that executes in the case of uncaught exceptions.  I think both of these need to be outlined for the MSVC environment. Do you think we need a stub handler to be inserted in cases where no actual cleanup is performed?</span><span style="color:black"><u></u><u></u></span></p>

<p class="MsoNormal">> I think it's actually harder than that, once you consider nested trys:<u></u><u></u></p>

<p class="MsoNormal">> void f() {<u></u><u></u></p>

<p class="MsoNormal">>  try {<u></u><u></u></p>

<p class="MsoNormal">>    Outer outer;<u></u><u></u></p>

<p class="MsoNormal">>    try {<u></u><u></u></p>

<p class="MsoNormal">>      Inner inner;<u></u><u></u></p>

<p class="MsoNormal">>      g();<u></u><u></u></p>

<p class="MsoNormal">>    } catch (int) {<u></u><u></u></p>

<p class="MsoNormal">>      // ~Inner gets run first<br>

>    }<u></u><u></u></p>

<p class="MsoNormal">>  } catch (float) {<u></u><u></u></p>

<p class="MsoNormal">>    // ~Inner gets run first<u></u><u></u></p>

<p class="MsoNormal">>    // ~Outer gets run next<br>

>  }<u></u><u></u></p>

<p class="MsoNormal">>  // uncaught exception? Run ~Inner then ~Outer.<br>

> }<u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

</span><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">I took a look at the IR that’s generated for this example.  I see what you mean.  So there is potentially cleanup code before and after every catch handler,

 right?<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Do you happen to know offhand what that looks like in the .xdata for the _CxxFrameHandler3 function?</span></p></div></blockquote><div><br></div><div>I can't tell how the state tables arrange for the destructors to run in the right order, but they can accomplish this without duplicating the cleanup code into the outlined catch handler functions, which is nice.</div><div><br></div><div>I think we may be able to address this by emitting calls to start/stop intrinsics around EH cleanups, but that may inhibit optimizations.</div></div></div></div>