<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 13, 2015 at 11:25 AM, Kaylor, Andrew <span dir="ltr"><<a href="mailto:andrew.kaylor@intel.com" target="_blank">andrew.kaylor@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Comments below…<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Reid Kleckner [mailto:<a href="mailto:rnk@google.com" target="_blank">rnk@google.com</a>]
<br>
<b>Sent:</b> Friday, April 10, 2015 5:01 PM<span class=""><br>
<b>To:</b> Kaylor, Andrew<br>
<b>Cc:</b> David Majnemer <<a href="mailto:david.majnemer@gmail.com" target="_blank">david.majnemer@gmail.com</a>> (<a href="mailto:david.majnemer@gmail.com" target="_blank">david.majnemer@gmail.com</a>); LLVM Developers Mailing List<br>
<b>Subject:</b> Re: [WinEH] Cloning blocks that reference non-cloned PHI nodes<u></u><u></u></span></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Yeah, wrapping up the whole selector comparison into one intrinsic would help. Thanks for the idea. :)<u></u><u></u></p><span class="">
<div>
<p class="MsoNormal"><br>
The current strategy also seems pretty fragile around complex cleanups. If the cleanup isn't some simple destructor call with easily traceable unconditional branch control flow, we get lost and can't find our way to the next selector comparison. Having an explicit
'resume' or some other intrinsic at the end of the cleanup would be pretty helpful here.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">---<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">We also talked a lot about making WinEHPrepare try to less work all at once, and split it into phases. This is separable from the landingswitch proposal, which is more about making it easier to identify handler start points. Here's a sketch
of the phases we came up with:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Step 1: Identify selector dispatch conditional branches. Identify all cleanup actions, and split basic blocks to make them start and end on a basic block boundary. Build a list of all basic blocks that start a handler.<u></u><u></u></p>
</div>
</span><div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">[Andy Kaylor] This sounds an awful lot like what we’re doing now except for the block splitting part. I guess that you are talking about doing this for all
landing pads before we go any further, right? I agree that the current code to identify cleanup blocks is too fragile. We can probably fix it though. Intrinsics to mark the start and/or end of cleanup code would help immensely.</span></p></div></div></div></div></blockquote><div><br></div><div>Yep, this is basically the selector pattern matching that we do today. Maybe the block splitting isn't necessary, but I feel like it would be nice to say that every handler starts at a BB and no two handlers start at the same BB. Today it is possible for a cleanup and catch handler to start at the same BB in this example:</div><div><br></div><div>some_eh_bb:</div><div> call void @dtor()</div><div> ... typeid...</div><div> %matches = icmp eq ...</div><div> br i1 %matches, label %more_dispatch, label %catch_int</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal"></p>
</div><span class="">
<div>
<p class="MsoNormal">Step 2: Do a forwards-backwards dataflow analysis to mark basic blocks as reachable and unreachable from each handler start block. The forwards part is an easy DFS, the backwards part is basically about pruning unreachable blocks like 'ret'
from a handler. We can probably skip the backwards part initially.<u></u><u></u></p>
</div>
</span><div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">[Andy Kaylor] I have some concerns about this part. When blocks have been merged so that handlers are sharing blocks with non-exception code things are going
to get fairly messy. The existing cloning code takes care of this (in theory at least) by scrubbing things after resolving PHIs based on which blocks didn’t get cloned. I suppose this is a potential fault line if the code we didn’t want loops back into the
shared block. I’m guessing that this is the sort of problem you’re thinking of that has you wanting to change things. Do you have a test case that exposes the problem? More importantly, do you have a solution in mind?</span></p></div></div></blockquote><div><br></div><div>Specifically, I'm dealing with __finally blocks today, which have this control flow structure of branch to shared cleanup, branch on phi out to EH or normal flow. This made WinEHPrepare pretty unhappy. I need some way to fix this. I've been toying with doing the __finally outlining in Clang, but if we're serious about making WinEHPrepare handle this, then maybe we don't need that.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">With catch handlers we have clear markers for the beginning and end of the catch (possibly including multiple end calls). Having similar markers for cleanup
would help a lot. I still think there’s potential for getting lost.</span></p></div></blockquote><div><br></div><div>We can add markers, but I'm worried that it might be an optimization barrier and we could still get lost. What differentiates the end of one cleanup from the end of another? Should the end cleanup call consume an SSA value produced by the begin cleanup call maybe? I guess nesting catches isn't a problem because the only way to enter another catch is to do an invoke, which goes via a landingpad, which we don't outline. The same should be true for cleanup end markers.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><div><span class="">
<div>
<p class="MsoNormal">Step 3: Clone all blocks that are reachable from more than one handler until all blocks belong to exactly one handler, and sink code from landingpads back into handlers if necessary. We don't have many code commoning optimizations, so this
step is more of a defense against theoretical future optimization passes that do more than tail merging.<u></u><u></u></p>
</div>
</span><div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">[Andy Kaylor] I’m not sure I see how this is better than the current cloning approach.</span></p></div></div></div></div></blockquote><div><br></div><div>These last 3 steps are basically an optimization. Cloning and then deleting all IR reachable from landingpads, which we currently do at -O0, is fairly wasteful. When no inlining occurs at -O0 and -O1, the IR should be easily analyzable, and we just need to splice the handler BBs into new functions and rewrite uses of allocas.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><div>
</div><span class="">
<div>
<p class="MsoNormal">Step 4: Demote all SSA values live from one handler into or out of another. The only SSA values referenceable across handlers are static allocas.<u></u><u></u></p>
</div>
</span><div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">[Andy Kaylor] Again, if I underststand correctly we’re doing this now, just with a fairly distributed mechanism.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
</div><span class="">
<div>
<p class="MsoNormal">Step 5: Extract the blocks belonging to each handler into their own functions. Rewrite static alloca references in each new handler to use llvm.framerecover.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Step 6: Profit?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Does this seem like a reasonable plan? IMO simplifying the overarching design is a higher priority right now than fixing individual test cases.<u></u><u></u></p>
</div>
</span></div>
<div>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">[Andy Kaylor] I think there’s significant value in getting some core suite of tests working so that we can get out of our current pattern of rewriting all of
our tests every time we make a significant change and into a situation where we’ll have some stable tests that will verify that any redesign isn’t breaking something. I also think that in chasing down problems with the implementation we have now we’ll learn
more about the problems that need to be solved.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I don’t have a strong objection to your working on a redesign while I try to work through some bugs in the current implementation, but I’d like to ask you to
hold off on committing significant restructuring for a little while unless I run into something that can’t be solved within the current design. I feel like I can have the small test suite that I’m working on passing within a week or so.</span></p></div></div></blockquote><div><br></div><div>Sure. By breaking things down into steps, I'm not trying to fundamentally change anything like I was with landingswitch. This is just trying to separate IR analysis from IR transformation so that we can reason about things a bit more easily.</div></div></div></div>