[PATCH] Add eh.begincatch and eh.endcatch intrinsics

Wed Feb 4 12:38:46 PST 2015

I'm happy to move the verification to the lint pass.  Its main usefulness is definitely for verifying the output from the front end.


REPOSITORY
  rL LLVM

================
Comment at: docs/ExceptionHandling.rst:328
@@ +327,3 @@
+
+The Windows runtime function for C++ exception handling uses a mutli-phase
+approach.  When an exception occurs it searches the current callstack for a
----------------
rnk wrote:
> "multi"
> 
> Would you say this is a two-phase approach? That's how I've always thought of it.
It's certainly reasonable to think of it that way, but the runtime is doing enough behind the curtain that I decided to use this as a sort of hand-waving term.  There are essentially two phases that are visible for our purposes, and maybe that's reason enough to describe it that way.  I was just thinking of the other things that are going on inside the runtime.

================
Comment at: docs/ExceptionHandling.rst:331
@@ +330,3 @@
+frame that has a handler for the exception.  If a handler is found, it then
+calls the unwind handler for each frame above the handler which has an
+unwind handler before calling the catch handler.  These calls are all made
----------------
rnk wrote:
> I don't think "unwind handler" is a term used elsewhere in this document. "cleanup" would be consistent with the rest. Can you define the term if you want to use it in this section?
I was thinking that the Microsoft documentation for exception handling (which I should probably link to somewhere) called it an unwind handler, but I just looked and even though the flag is UNW_FLAG_UHANDLER they seem to call it a termination handler.  I'll give some thought to how I can best reconcile these things for clarity.

================
Comment at: docs/ExceptionHandling.rst:334-335
@@ +333,4 @@
+from a stack context different from the original frame in which the handler
+is defined.  Therefore, it is necessary for these handlers to be outlined
+from their original context before code generation.
+
----------------
rnk wrote:
> maybe "it is necessary to outline these handlers from their original context before code generation"?
OK

================
Comment at: docs/ExceptionHandling.rst:338
@@ +337,3 @@
+Catch handlers are called with a pointer to the handler itself as the first
+argument and a pointer to the parent frame's stack pointer as the second
+argument.  The catch handler uses the `llvm.recoverframe
----------------
rnk wrote:
> I guess the stack pointer is really whatever pointer is established by the Win64 unwind opcodes. If the unwind codes end in .setframe, it's that register plus the specified offset, and otherwise it's the stack pointer. I'm not sure if it's worth writing this down here.
I'll take a look and make sure I end up with accurate wording.

================
Comment at: docs/ExceptionHandling.rst:355
@@ +354,3 @@
+Unwind handlers perform any cleanup necessary as the frame goes out of scope,
+but the runtime handles the actual unwinding of the stack.  If an exception
+occurs in an unwind handler the runtime manages termination of the process.
----------------
rnk wrote:
> Maybe "... scope, while the runtime" or "and the runtime"? I don't think the second part of the sentence opposes the first.
This is probably a good reason not to use the term "unwind handler" because it seemed to me like people would be likely to expect the unwind handler to do some unwinding.

================
Comment at: docs/ExceptionHandling.rst:369-372
@@ +368,6 @@
+
+During the WinEHPrepare pass, the handler functions are outlined and the
+``landingpad`` instruction is followed by a call to the ``llvm.eh.actions``
+intrinsic which describes the order in which handlers will be processed from
+the logical location of the landing pad.  The ``llvm.eh.actions`` intrinsic is
+defined as returning the address at which execution will continue.  This is a
----------------
rnk wrote:
> I don't think it's quite clear what's going to be left in the landingpad afterwards. Maybe:
> 
> During the WinEHPrepare pass, landingpad code is outlined into handler functions. Only stub landingpad basic blocks are left in the IR. Each landingpad contains a landingpad instruction, a call to @llvm.eh.actions, and an indirectbr to indicate possible catch handler destinations.
> 
> Then the bit about what eh.actions holds.
An example would probably make it clear.

================
Comment at: lib/IR/Verifier.cpp:1052-1053
@@ -1048,1 +1051,4 @@
 
+/// \brief Verify that the begin catch intrinsic follows a landingpad and leads
+///        to an eh.endcatch.
+void Verifier::VerifyEHBeginCatch(CallInst &CI) {
----------------
rnk wrote:
> At a high level, I'm not sure we should be verifying these things. I always try to keep in mind a transform that tries to merge similar basic blocks, like this:
>   void f() {
>     if (b) {
>       try {
>       } catch (int) {
>         g(); g(); g();
>       }
>     } else {
>       g(); g(); g();
>     }
>   }
> 
> If we deduplicated the calls to g(), we would end up with a CFG where the begincatch doesn't dominate the endcatch:
>   g_bb:
>     %dst = phi i1 [0, %else], [1, %catch]
>     call void @g()
>     call void @g()
>     call void @g()
>     br i1 %dst, label %catch.cont, label %else.cont
> 
> Note that the outliner has to leave this bb behind in the parent function, and it will also have to leave behind a call to endcatch which is statically reachable from the entry block. We know after preparation, however, that endcatch is dynamically unreachable, so we could transform it to unreachable and run SimplifyCFG to clean it up.
> 
> I agree this is a good check to run on frontend generated, IR, though. Maybe we should move this to the Lint pass? Maybe clang should run the lint pass in asserts mode?
I see.  With a bit of effort I could figure out that the branch was using the phi and make the check work, but I suppose trying to generalize that would take way more code than I can justify for this.

================
Comment at: lib/IR/Verifier.cpp:1061
@@ +1060,3 @@
+  // to it must have come from a landing pad.
+  std::function<bool(BasicBlock *)> proceedsFromLPad = [&](BasicBlock *BB) {
+    VisitedBlocks.insert(BB);
----------------
rnk wrote:
> First, I think it would be clearer to simply hoist this into a static local function. The only state you need is VisitedBlocks.
> 
> Second, std::function is more expensive than using auto. std::function will make an unnecessary opaque wrapper class for the lambda, which we don't need.
OK.  It seemed like conceptually this was a good place for a lambda, then I had to look up the syntax for making recursion work.  I guess I let it get away from me a bit.

================
Comment at: lib/IR/Verifier.cpp:1068
@@ +1067,3 @@
+      return false;
+    for (pred_iterator I = pred_begin(BB), E = pred_end(BB); I != E; ++I) {
+      if (VisitedBlocks.find(*I) != VisitedBlocks.end())
----------------
rnk wrote:
> I added a predecessors() range-based-for adapter for this.
Nice.  I'll look that up.

================
Comment at: lib/IR/Verifier.cpp:1071
@@ +1070,3 @@
+        continue;
+      if (!proceedsFromLPad(*I))
+        return false;
----------------
rnk wrote:
> You should use a BB worklist. This causes O(n) recursion in the depth of the CFG, which could be large.
My thinking was that if the IR is correct we'll find the landing pad within a few blocks and if not then compilation is going to fail anyway, so I just wanted to keep this simple.

http://reviews.llvm.org/D7398

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/