[PATCH] D15139: [IR] Reformulate LLVM's EH funclet IR

David Majnemer via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 11 00:47:56 PST 2015


majnemer added inline comments.

================
Comment at: docs/ExceptionHandling.rst:617-618
@@ -620,4 +616,4 @@
 they must be the first non-phi instruction of a basic block that may be the
-unwind destination of an invoke: ``catchpad``, ``cleanuppad``, and
-``terminatepad``. As with landingpads, when entering a try scope, if the
+unwind destination of an invoke: ``catchswitch``, ``catchpad``, ``cleanuppad``,
+and ``terminatepad``. As with landingpads, when entering a try scope, if the
 frontend encounters a call site that may throw an exception, it should emit an
----------------
Done.

================
Comment at: docs/ExceptionHandling.rst:632-633
@@ -635,22 +631,4 @@
 the function control is returned to.  A cleanup handler which reaches its end
 by normal execution executes a ``cleanupret`` instruction, which is a terminator
-indicating where the active exception will unwind to next.  A catch or cleanup
-handler which is exited by another exception being raised during its execution will
-unwind through a ``catchendpad`` or ``cleanuupendpad`` (respectively).  The
-``catchendpad`` and ``cleanupendpad`` instructions are considered "exception
-handling pads" in the same sense that ``catchpad``, ``cleanuppad``, and
-``terminatepad`` are.
-
-Each of these new EH pad instructions has a way to identify which
-action should be considered after this action. The ``catchpad`` and
-``terminatepad`` instructions are terminators, and have a label operand considered
-to be an unwind destination analogous to the unwind destination of an invoke. The
-``cleanuppad`` instruction is different from the other two in that it is not a
-terminator. The code inside a cleanuppad runs before transferring control to the
-next action, so the ``cleanupret`` and ``cleanupendpad`` instructions are the
-instructions that hold a label operand and unwind to the next EH pad. All of
-these "unwind edges" may refer to a basic block that contains an EH pad instruction,
-or they may simply unwind to the caller. Unwinding to the caller has roughly the
-same semantics as the ``resume`` instruction in the ``landingpad`` model. When
-inlining through an invoke, instructions that unwind to the caller are hooked
-up to unwind to the unwind destination of the call site.
+indicating where the active exception will unwind to next.
+
----------------
`cleanupret` and `catchret` are similar in that they both exit their constituent funclet.  I'll keep it as `cleanupret` unless there is an overwhelming desire to move in that direction.

================
Comment at: docs/ExceptionHandling.rst:761
@@ +760,3 @@
+  catch.dispatch2:                                  ; preds = %catch
+    %2 = catchswitch within %1 [label %catch3] unwind to caller
+
----------------
Done.

================
Comment at: docs/LangRef.rst:5342
@@ -5353,1 +5341,3 @@
+``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
+this operand may be the token ``none``.
 
----------------
JosephTremoulet wrote:
> I can see the merit to keeping that 1-1 correspondence, but honestly then I'd rather achieve that by omitting the `None` token from both representations for top-level pads, so that we don't have to think about it in the common case that we're looking at or building a top-level pad, and so that someone trying to grok the IR concepts doesn't need to puzzle over it before understanding the nested cases, at the cost of a bit of complexity in the constructors that we'll mostly never look at again and also the cost of having top-level <-> nested rewrites be a replacement rather than a mutation.  I don't feel strongly enough to insist, though.
I considered this approach while designing the new instructions (i.e. making `getOuterScope` return `nullptr` if it has no lexical parent).  I rejected it for a few reasons:
- It makes it all too easy to get the nesting wrong if no token is necessary.  Things will seem to work until they don't.  Making the token always explicit is a way of making IR creators consider the nesting case.
- It provides us a sentinel value which we can use when wanting to talk about scopes.  `nullptr` is often used to indicate failure and I wanted to make sure this confusion would never occur.

Regardless, we can do this change down the road at any time.

================
Comment at: docs/LangRef.rst:8618-8631
@@ -8758,12 +8617,16 @@
 -  A basic block that is not a cleanup block may not include a
    '``cleanuppad``' instruction.
--  All '``cleanupret``'s and '``cleanupendpad``'s which consume a ``cleanuppad``
-   must have the same exceptional successor.
--  It is undefined behavior for control to transfer from a ``cleanuppad`` to a
-   ``ret`` without first executing a ``cleanupret`` or ``cleanupendpad`` that
-   consumes the ``cleanuppad``.
--  It is undefined behavior for control to transfer from a ``cleanuppad`` to
-   itself without first executing a ``cleanupret`` or ``cleanupendpad`` that
-   consumes the ``cleanuppad``.
+
+Executing a ``cleanuppad`` instruction constitutes "entering" that pad.
+The pad may then be "exited" in one of three ways:
+1)  explicitly via a ``cleanupret`` that consumes it.  Executing such a ``cleanupret``
+    is undefined behavior if any descendant pads have been entered but not yet
+    exited.
+2)  implicitly via a call (which unwinds all the way to the current function's caller),
+    or via a ``catchswitch``, ``cleanupret``, or ``terminatepad`` that unwinds to caller
+3)  implicitly via an unwind edge whose destination EH pad isn't a descendant of
+    the ``cleanuppad``.  When the ``cleanuppad`` is exited in this manner, it is
+    undefined behavior if the destination EH pad has a parent which is not an
+    ancestor of the ``cleanuppad`` being exited.
 
 Example:
----------------
JosephTremoulet wrote:
> 1 - I don't think the " Catchpad should say something similar, and cleanupret and catchret should either refer to this or describe themselves that it's UB to exit the pad that's not the current top of the handler stack" part is done.
> 
> 2 - We also need to explain that all of these exits must have the same target.  Are you still intending to tag invokes in funclets with bundles?  If so, this agreement can be a verifier rule.  If not, it's a verifier rule that the cleanuprets have to match, will have to be UB if an invoke doesn't match the cleanupret (and I don't know what we do if there is no cleanupret but there are invokes that go to different places), and could be either UB or a verifier rule that nested catchswitches/cleanuprets/terminatepads have to match the cleanuprets.
#1 is done.

With regard to #2, we will tag call-sites in funclets with bundles but this does not allow us to enforce this as a verifier rule.  Tail-merging can still make invokes with conflicting/illegal unwind destinations show up in other funclets.  The tagging will allow us to get rid of them in WinEHPrepare.

================
Comment at: docs/LangRef.rst:5368
@@ +5367,3 @@
+    dispatch1:
+      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
+    dispatch2:
----------------
Done.

================
Comment at: docs/LangRef.rst:5430
@@ +5429,3 @@
+    dispatch:
+      %cs = catchswitch within none [label %handler0] unwind to caller
+      ;; A catch block which can catch an integer.
----------------
Done.

================
Comment at: docs/LangRef.rst:8579
@@ -8718,3 +8578,3 @@
 
       <resultval> = cleanuppad [<args>*]
 
----------------
Done.

================
Comment at: include/llvm/Analysis/CFG.h:92
@@ -91,2 +91,3 @@
                                     const LoopInfo *LI = nullptr);
+
 } // End llvm namespace
----------------
Done.

================
Comment at: include/llvm/Bitcode/LLVMBitCodes.h:425
@@ -427,1 +424,3 @@
+    FUNC_CODE_INST_CATCHSWITCH = 53, // CATCHSWITCH: [num,args...] or [num,args...,bb]
+    // 54 is unused.
     FUNC_CODE_OPERAND_BUNDLE = 55, // OPERAND_BUNDLE: [tag#, value...]
----------------
Bitcode is supposed to be stable within reason;  we are breaking compatibility because it is not sensible to add heroic logic to the auto-upgrader to make these instructions: nobody should have MSVC++ EH bitcodes around.

It is fair game for `54` to be reused because, for all intents and purposes, it has never been used except by a tiny number of people.

Operand bundles are currently under development by other folks, it would be in poor taste to renumber them under their feet.

================
Comment at: include/llvm/IR/Instructions.h:4164
@@ +4163,3 @@
+  /// \brief Methods for support type inquiry through isa, cast, and dyn_cast:
+  static inline bool classof(const Instruction *I) {
+    return I->getOpcode() == Instruction::CatchPad;
----------------
Done.

================
Comment at: include/llvm/IR/Instructions.h:4220
@@ +4219,3 @@
+  Value *getOuterScope() const {
+    return getCatchPad()->getCatchSwitch()->getOuterScope();
+  }
----------------
`getTargetScope` and `getDestinationScope` make you feel like they imply the existence of CFG edges which aren't really there.  I'll leave this alone unless a compelling name shows up.

================
Comment at: lib/Analysis/CFG.cpp:21
@@ -19,3 +20,3 @@
 
 using namespace llvm;
 
----------------
Done.

================
Comment at: lib/Analysis/EHPersonalities.cpp:12
@@ -11,1 +11,3 @@
 #include "llvm/ADT/StringSwitch.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/Constants.h"
----------------
Yes, for llvm::successors.

================
Comment at: lib/Analysis/EHPersonalities.cpp:52-57
@@ +51,8 @@
+
+  // Build up the color map, which maps each block to its set of 'colors'.
+  // For any block B, the "colors" of B are the set of funclets F (possibly
+  // including a root "funclet" representing the main function), such that
+  // F will need to directly contain B or a copy of B (where the term "directly
+  // contain" is used to distinguish from being "transitively contained" in
+  // a nested funclet).
+
----------------
Done.

================
Comment at: lib/Analysis/EHPersonalities.cpp:54
@@ +53,3 @@
+  // For any block B, the "colors" of B are the set of funclets F (possibly
+  // including a root "funclet" representing the main function), such that
+  // F will need to directly contain B or a copy of B (where the term "directly
----------------
andrew.w.kaylor wrote:
> JosephTremoulet wrote:
> > rnk wrote:
> > > I think the comment means that the list of colors may include the entry block, which isn't strictly speaking a funclet.
> > Right, I'm pretty sure I wrote the comment, and that's what I meant -- that the colors *for any given block* may or may not include the root.
> Oh, I see.  For some reason I read it as saying that the complete set of funclets for the function could "possibly" include the entry block, and it confused me because that will always be a color.  I see now that it is actually saying that the set of colors for any given block may (or may not) include the entry block.  That makes a lot more sense.
> 
> As a minor nit, the commas in this sentence shouldn't be there.  I'm going to blame my reading comprehension failure on that.
> 
> At the risk of sounding pedantic, it is a little ambiguous as to whether 'F' refers to the set of funclets that are colors for B or a single funclet in that set.  From context it has to be a single funclet, but grammatically I think it's the set.  I suggest this:
> 
> "For any block B the 'colors' of B are the set of funclets (possibly including a root 'funclet' representing the main function) such that each funclet F in that set will need to directly contain B or a copy of B...."
Done.

================
Comment at: lib/Analysis/ScalarEvolutionExpander.cpp:102-103
@@ -103,8 +101,4 @@
     } else if (auto *TPI = dyn_cast<TerminatePadInst>(IP)) {
       IP = TPI->getUnwindDest()->getFirstNonPHI()->getIterator();
-    } else if (auto *CEPI = dyn_cast<CatchEndPadInst>(IP)) {
-      IP = CEPI->getUnwindDest()->getFirstNonPHI()->getIterator();
-    } else if (auto *CEPI = dyn_cast<CleanupEndPadInst>(IP)) {
-      IP = CEPI->getUnwindDest()->getFirstNonPHI()->getIterator();
-    } else if (isa<CatchPadInst>(IP)) {
+    } else if (isa<CatchSwitchInst>(IP)) {
       IP = MustDominate->getFirstInsertionPt();
----------------
This code cannot fire if the terminatepad unwinds to caller.  It is only possible to get here if we have a PHI on the terminatepad (or one of it's predecessors) and the PHI's use is dominated by the terminatepad.  Such a use shouldn't exist if the terminatepad unwinds to caller.

================
Comment at: lib/AsmParser/LLParser.cpp:2371-2372
@@ -2402,21 +2370,4 @@
   if (Val) {
     // Check operator constraint.
-    switch (OC) {
-    case OC_None:
-      // no constraint
-      break;
-    case OC_CatchPad:
-      if (!isa<CatchPadInst>(Val)) {
-        P.Error(Loc, "'%" + Twine(ID) + "' is not a catchpad");
-        return nullptr;
-      }
-      break;
-    case OC_CleanupPad:
-      if (!isa<CleanupPadInst>(Val)) {
-        P.Error(Loc, "'%" + Twine(ID) + "' is not a cleanuppad");
-        return nullptr;
-      }
-      break;
-    }
     if (Val->getType() == Ty) return Val;
     if (Ty->isLabelTy())
----------------
Done.

================
Comment at: lib/CodeGen/AsmPrinter/WinException.cpp:799-804
@@ +798,8 @@
+             FuncInfo, FuncletStart, FuncletEnd, BaseState)) {
+      // Compute the label to report as the start of this entry; use the EH
+      // start
+      // label for the invoke if we have one, otherwise (this is a call which
+      // may
+      // unwind to our caller and does not have an EH start label, so) use the
+      // previous end label.
+      const MCSymbol *ChangeLabel = StateChange.NewStartLabel;
----------------
JosephTremoulet wrote:
> Looks like clang-format mangled this comment's line wrapping
Done.

================
Comment at: lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp:228-234
@@ -227,9 +227,9 @@
       }
-      if (isa<CatchEndPadInst>(I) || isa<CleanupEndPadInst>(I)) {
+      if (isa<CatchSwitchInst>(I)) {
         assert(&*BB->begin() == I &&
                "WinEHPrepare failed to remove PHIs from imaginary BBs");
         continue;
       }
-      if (isa<CatchPadInst>(I) || isa<CleanupPadInst>(I))
+      if (isa<FuncletPadInst>(I))
         assert(&*BB->begin() == I && "WinEHPrepare failed to demote PHIs");
     }
----------------
The PHIs in the IR are not the same as the ones we need in the machine CFG.  This is because invokes have a single successor in the IR but multiple successors in the machine CFG.  To disable the demotion, we would need to fix the PHIs as we build the machine CFG.

================
Comment at: lib/CodeGen/WinEHPrepare.cpp:169
@@ +168,3 @@
+        dyn_cast<FuncletPadInst>(FuncletEntryBB->getFirstNonPHI());
+    if (!FuncletPad)
+      FuncletUnwindDest = nullptr;
----------------
andrew.w.kaylor wrote:
> Am I right in thinking that you could assert(FuncletPad || FuncletEntryBB == &Fn->getEntryBlock()) here?
Done.

================
Comment at: lib/CodeGen/WinEHPrepare.cpp:219
@@ +218,3 @@
+  const BasicBlock *BB = FirstNonPHI->getParent();
+  assert(BB->isEHPad() && "no a funclet!");
+
----------------
andrew.w.kaylor wrote:
> s/no/not
Done.

================
Comment at: lib/CodeGen/WinEHPrepare.cpp:261
@@ -290,1 +260,3 @@
+    // cleanupret instructions.
+    if (FuncInfo.EHPadStateMap.count(CleanupPad))
       return;
----------------
andrew.w.kaylor wrote:
> With the new implementation, I don't think it is possible to get here twice for the same cleanup pad.  So this could be made an assertion, right?
It still is because two cleanuprets from the same cleanuppad can unwind to the same funclet-pad.  We will then do getEHPadFromPredecessor and visit the same cleanup twice.

================
Comment at: lib/IR/Instructions.cpp:936
@@ -1011,1 +935,3 @@
+bool CatchSwitchInst::inOutermostScope() const {
+  return isa<ConstantTokenNone>(getOuterScope()) && unwindsToCaller();
 }
----------------
Done.

================
Comment at: lib/IR/Instructions.cpp:1002
@@ -1017,3 +1001,3 @@
 }
 
 //===----------------------------------------------------------------------===//
----------------
Done.

================
Comment at: lib/IR/Verifier.cpp:401-402
@@ -400,4 +400,4 @@
   void visitCleanupPadInst(CleanupPadInst &CPI);
-  void visitCleanupEndPadInst(CleanupEndPadInst &CEPI);
+  void visitCatchSwitchInst(CatchSwitchInst &CatchSwitch);
   void visitCleanupReturnInst(CleanupReturnInst &CRI);
   void visitTerminatePadInst(TerminatePadInst &TPI);
----------------
This is UB, langref is updated.

================
Comment at: lib/IR/Verifier.cpp:2850
@@ +2849,3 @@
+             "Block containg CatchPadInst must be jumped to "
+             "only by it's catchswitch.",
+             CPI);
----------------
Done.

================
Comment at: lib/IR/Verifier.cpp:3011
@@ -3037,1 +3010,3 @@
+         OuterScope);
 
+  visitTerminatorInst(CatchSwitch);
----------------
andrew.w.kaylor wrote:
> Shouldn't this be verifying the unwind destination?
Done.

================
Comment at: lib/IR/Verifier.cpp:3022
@@ -3043,3 +3021,3 @@
     Instruction *I = UnwindDest->getFirstNonPHI();
     Assert(I->isEHPad() && !isa<LandingPadInst>(I),
            "CleanupReturnInst must unwind to an EH block which is not a "
----------------
andrew.w.kaylor wrote:
> This should also exclude CatchPadInst.
visitEHPadPredecessors should do this for us when we visit CatchPadInst.

================
Comment at: lib/IR/Verifier.cpp:3048
@@ -3069,3 +3047,3 @@
     Instruction *I = UnwindDest->getFirstNonPHI();
     Assert(I->isEHPad() && !isa<LandingPadInst>(I),
            "TerminatePadInst must unwind to an EH block which is not a "
----------------
andrew.w.kaylor wrote:
> This should also exclude CatchPadInst.
visitEHPadPredecessors should do this for us when we visit CatchPadInst.

================
Comment at: lib/Transforms/Utils/InlineFunction.cpp:371
@@ +370,3 @@
+      Replacement->takeName(I);
+      I->eraseFromParent();
+      UpdatePHINodes(&*BB);
----------------
andrew.w.kaylor wrote:
> This needs an RAUW for the catchswitch case.
Done.

================
Comment at: lib/Transforms/Utils/InlineFunction.cpp:1433
@@ +1432,3 @@
+  // Update the lexical scopes of the new funclets.  Anything that had 'none' as
+  // it's parent is now nested inside the callsite's EHPad.
+  if (CallSiteEHPad) {
----------------
Done.

================
Comment at: lib/Transforms/Utils/InlineFunction.cpp:1443-1451
@@ +1442,11 @@
+      if (auto *TPI = dyn_cast<TerminatePadInst>(I)) {
+        if (CallSiteEHPad && isa<ConstantTokenNone>(TPI->getOuterScope()))
+          TPI->setOuterScope(CallSiteEHPad);
+      } else if (auto *CatchSwitch = dyn_cast<CatchSwitchInst>(I)) {
+        if (CallSiteEHPad &&
+            isa<ConstantTokenNone>(CatchSwitch->getOuterScope()))
+          CatchSwitch->setOuterScope(CallSiteEHPad);
+      } else {
+        auto *FPI = cast<FuncletPadInst>(I);
+        if (CallSiteEHPad && isa<ConstantTokenNone>(FPI->getOuterScope()))
+          FPI->setOuterScope(CallSiteEHPad);
----------------
JosephTremoulet wrote:
> I think you can remove `CallSiteEHPad && ` from each of these three tests, as it's redundant with the guarding check up on line 1434.
Done.

================
Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:3276
@@ +3275,3 @@
+      DestParentFunclet = CSI->getOuterScope();
+    else if (auto *FPI = dyn_cast<FuncletPadInst>(DestEHPad))
+      DestParentFunclet = FPI->getOuterScope();
----------------
andrew.w.kaylor wrote:
> Can a cleanupret ever unwind to a catchpad?  If not, this should be CleanupPadInst.
Done.

================
Comment at: test/CodeGen/WinEH/wineh-cloning.ll:276
@@ +275,3 @@
+; CHECK:     inner:
+; CHECK:       [[I_R:\%.+]] = cleanuppad within none []
+; CHECK:       call void @h(i32 %x1.wineh.reload)
----------------
andrew.w.kaylor wrote:
> Here, on the other hand, I don't think we need to generalize, since this block won't be cloned.
Done.

================
Comment at: test/CodeGen/WinEH/wineh-cloning.ll:281
@@ -306,3 +280,3 @@
 
-define void @test7() personality i32 (...)* @__CxxFrameHandler3 {
+define void @test7() personality i32 (...)* @__C_specific_handler {
 entry:
----------------
andrew.w.kaylor wrote:
> Is there any reason to keep this test case?  There doesn't seem to be anything interesting happening here.
Done.

================
Comment at: test/CodeGen/WinEH/wineh-cloning.ll:326
@@ -363,3 +325,3 @@
 
-define void @test8() personality i32 (...)* @__CxxFrameHandler3 {
+define void @test8() personality i32 (...)* @__C_specific_handler {
 entry:
----------------
andrew.w.kaylor wrote:
> This test case also seems to be redundant now.
Done.

================
Comment at: test/CodeGen/WinEH/wineh-cloning.ll:383
@@ -434,3 +382,3 @@
 
-define void @test9() personality i32 (...)* @__CxxFrameHandler3 {
+define void @test9() personality i32 (...)* @__C_specific_handler {
 entry:
----------------
andrew.w.kaylor wrote:
> It bothers me that this is still apparently legal.  I would think that an exception thrown from within a top level EH pad should only be able to unwind to caller.  In any event, the test is now verifying that we did not resolve the problem that the old implementation of this test was looking for.
> 
> What would the state numbering code do with this if we left it in the state that is checked here?
There is no problem here because funclet ancestry is explicit.  It should not be necessary for us to do any cloning.  It is up to the front-end to emit IR that the personality routine is capable of.  However, this example is impossible to construct from the front-end.

================
Comment at: test/CodeGen/WinEH/wineh-cloning.ll:479
@@ -552,3 +478,3 @@
 }
 ; merge.end will get cloned for outer and inner, but is implausible
 ; from inner, so the invoke @f() in inner's copy of merge should be
----------------
andrew.w.kaylor wrote:
> This comment is incorrect now, since there is no invoke in %merge anymore.  I think this test case is also obsolete.
Done.


http://reviews.llvm.org/D15139





More information about the llvm-commits mailing list