[llvm] 494fe0b - [WebAssembly] Remove wasm-specific findWasmUnwindDestinations (#130374)

via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 10 20:56:42 PDT 2025


Author: Heejin Ahn
Date: 2025-03-10T20:56:38-07:00
New Revision: 494fe0b4145810d4e4e7b6003cabd194f76cb5d4

URL: https://github.com/llvm/llvm-project/commit/494fe0b4145810d4e4e7b6003cabd194f76cb5d4
DIFF: https://github.com/llvm/llvm-project/commit/494fe0b4145810d4e4e7b6003cabd194f76cb5d4.diff

LOG: [WebAssembly] Remove wasm-specific findWasmUnwindDestinations (#130374)

Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g.
`invoke`s) can have multiple possible unwind destinations.

For example:
```ll
entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  ...
...
```

In this case, if an exception is not caught by `catch.dispatch` (and
thus `catch.start`), it should next unwind to `terminate`.
`findUnwindDestination` in ISel gathers the list of this unwind
destinations traversing the unwind edges:

https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2089-L2150
But we don't use that, and instead use our custom
`findWasmUnwindDestinations` that only adds the first unwind
destination, `catch.start`, to the successor list of `entry`, and not
`terminate`:

https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2037-L2087

The reason behind it was, as described in the comment block in the code,
it was assumed that there always would be an `invoke` that connects
`catch.start` and `terminate`. In case of `catch (type)`, there will be
`call void @llvm.wasm.rethrow()` in `catch.start`'s predecessor that
unwinds to the next destination. For example:

https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L429-L430
In case of `catch (...)`, `__cxa_end_catch` can throw, so it becomes an
`invoke` that unwinds to the next destination. For example:
https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L537-L538
So the unwind ordering relationship between `catch.start` and
`terminate` here would be preserved.

But turns out this assumption does not always hold. For example:
```ll
entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...
  call void @_ZSt9terminatev()
  unreachable

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  call void @_ZSt9terminatev()
  unreachable
...
```
In this case there is no `invoke` that connects `catch.start` to
`terminate`. So after `catch.dispatch` BB is removed in ISel,
`terminate` is considered unreachable and incorrectly removed in DCE.

This makes Wasm just use the general `findUnwindDestination`. In that
case `entry`'s successor is going to be [`catch.start`, `terminate`]. We
can get the first unwind destination by just traversing the list from
the front.

---

This required another change in WinEHPrepare. WinEHPrepare demotes all
PHIs in EH pads because they are funclets in Windows and funclets can't
have PHIs. When used in Wasm they are not funclets so we don't need to
do that wholesale but we still need to demote PHIs in `catchswitch` BBs
because they are deleted during ISel. (So we created
[`-demote-catchswitch-only`](https://github.com/llvm/llvm-project/blob/a5588b6d20590a10db0f1a2046fba4d9f205ed68/llvm/lib/CodeGen/WinEHPrepare.cpp#L57-L59)
option for that)

But turns out we need to remove PHIs that have a `catchswitch` BB as an
incoming block too:
```ll
...
catch.dispatch:
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:
  ...

somebb:
  ...

ehcleanup                                      ; preds = %catch.dispatch, %somebb
  %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ]
  ...
```
In this case the `phi` in `ehcleanup` BB should be demoted too because
`catch.dispatch` BB will be removed in ISel so one if its incoming block
will be gone. This pattern didn't manifest before presumably due to how
`findWasmUnwindDestinations` worked. (In this example, in our
`findWasmUnwindDestinations`, `catch.dispatch` would have had only one
successor, `catch.start`. But now `catch.dispatch` has both
`catch.start` and `ehcleanup` as successors, revealing this bug. This
case is
[represented](https://github.com/llvm/llvm-project/blob/ab87206c4b95aa0b5047facffb5f78f7fe6ac269/llvm/test/CodeGen/WebAssembly/exception.ll#L445)
by `rethrow_terminator` function in `exception.ll` (or
`exception-legacy.ll`) and without the WinEHPrepare fix it will crash.

---

Discovered by the reproducer provided in #126916, even though the bug
reported there was not this one.

Added: 
    

Modified: 
    llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
    llvm/lib/CodeGen/WinEHPrepare.cpp
    llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
    llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
    llvm/test/CodeGen/WebAssembly/exception-legacy.ll
    llvm/test/CodeGen/WebAssembly/exception.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 66dc86cd13e2d..14bb1d943d2d6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2034,58 +2034,6 @@ void SelectionDAGBuilder::visitCleanupPad(const CleanupPadInst &CPI) {
   }
 }
 
-// In wasm EH, even though a catchpad may not catch an exception if a tag does
-// not match, it is OK to add only the first unwind destination catchpad to the
-// successors, because there will be at least one invoke instruction within the
-// catch scope that points to the next unwind destination, if one exists, so
-// CFGSort cannot mess up with BB sorting order.
-// (All catchpads with 'catch (type)' clauses have a 'llvm.rethrow' intrinsic
-// call within them, and catchpads only consisting of 'catch (...)' have a
-// '__cxa_end_catch' call within them, both of which generate invokes in case
-// the next unwind destination exists, i.e., the next unwind destination is not
-// the caller.)
-//
-// Having at most one EH pad successor is also simpler and helps later
-// transformations.
-//
-// For example,
-// current:
-//   invoke void @foo to ... unwind label %catch.dispatch
-// catch.dispatch:
-//   %0 = catchswitch within ... [label %catch.start] unwind label %next
-// catch.start:
-//   ...
-//   ... in this BB or some other child BB dominated by this BB there will be an
-//   invoke that points to 'next' BB as an unwind destination
-//
-// next: ; We don't need to add this to 'current' BB's successor
-//   ...
-static void findWasmUnwindDestinations(
-    FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
-    BranchProbability Prob,
-    SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
-        &UnwindDests) {
-  while (EHPadBB) {
-    BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
-    if (isa<CleanupPadInst>(Pad)) {
-      // Stop on cleanup pads.
-      UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
-      UnwindDests.back().first->setIsEHScopeEntry();
-      break;
-    } else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
-      // Add the catchpad handlers to the possible destinations. We don't
-      // continue to the unwind destination of the catchswitch for wasm.
-      for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
-        UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
-        UnwindDests.back().first->setIsEHScopeEntry();
-      }
-      break;
-    } else {
-      continue;
-    }
-  }
-}
-
 /// When an invoke or a cleanupret unwinds to the next EH pad, there are
 /// many places it could ultimately go. In the IR, we have a single unwind
 /// destination, but in the machine CFG, we enumerate all the possible blocks.
@@ -2106,13 +2054,6 @@ static void findUnwindDestinations(
   bool IsWasmCXX = Personality == EHPersonality::Wasm_CXX;
   bool IsSEH = isAsynchronousEHPersonality(Personality);
 
-  if (IsWasmCXX) {
-    findWasmUnwindDestinations(FuncInfo, EHPadBB, Prob, UnwindDests);
-    assert(UnwindDests.size() <= 1 &&
-           "There should be at most one unwind destination for wasm");
-    return;
-  }
-
   while (EHPadBB) {
     BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
     BasicBlock *NewEHPadBB = nullptr;
@@ -2122,10 +2063,13 @@ static void findUnwindDestinations(
       break;
     } else if (isa<CleanupPadInst>(Pad)) {
       // Stop on cleanup pads. Cleanups are always funclet entries for all known
-      // personalities.
+      // personalities except Wasm. And in Wasm this becomes a catch_all(_ref),
+      // which always catches an exception.
       UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
       UnwindDests.back().first->setIsEHScopeEntry();
-      UnwindDests.back().first->setIsEHFuncletEntry();
+      // In Wasm, EH scopes are not funclets
+      if (!IsWasmCXX)
+        UnwindDests.back().first->setIsEHFuncletEntry();
       break;
     } else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
       // Add the catchpad handlers to the possible destinations.

diff  --git a/llvm/lib/CodeGen/WinEHPrepare.cpp b/llvm/lib/CodeGen/WinEHPrepare.cpp
index 253a3887009a2..37f6726bafca9 100644
--- a/llvm/lib/CodeGen/WinEHPrepare.cpp
+++ b/llvm/lib/CodeGen/WinEHPrepare.cpp
@@ -866,9 +866,6 @@ void WinEHPrepareImpl::demotePHIsOnFunclets(Function &F,
   for (BasicBlock &BB : make_early_inc_range(F)) {
     if (!BB.isEHPad())
       continue;
-    if (DemoteCatchSwitchPHIOnly &&
-        !isa<CatchSwitchInst>(BB.getFirstNonPHIIt()))
-      continue;
 
     for (Instruction &I : make_early_inc_range(BB)) {
       auto *PN = dyn_cast<PHINode>(&I);
@@ -876,6 +873,23 @@ void WinEHPrepareImpl::demotePHIsOnFunclets(Function &F,
       if (!PN)
         break;
 
+      // If DemoteCatchSwitchPHIOnly is true, we only demote a PHI when
+      // 1. The PHI is within a catchswitch BB
+      // 2. The PHI has a catchswitch BB has one of its incoming blocks
+      if (DemoteCatchSwitchPHIOnly) {
+        bool IsCatchSwitchBB = isa<CatchSwitchInst>(BB.getFirstNonPHIIt());
+        bool HasIncomingCatchSwitchBB = false;
+        for (unsigned I = 0, E = PN->getNumIncomingValues(); I < E; ++I) {
+          if (isa<CatchSwitchInst>(
+                  PN->getIncomingBlock(I)->getFirstNonPHIIt())) {
+            HasIncomingCatchSwitchBB = true;
+            break;
+          }
+        }
+        if (!IsCatchSwitchBB && !HasIncomingCatchSwitchBB)
+          break;
+      }
+
       AllocaInst *SpillSlot = insertPHILoads(PN, F);
       if (SpillSlot)
         insertPHIStores(PN, SpillSlot);

diff  --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index 277d353d1db10..640be5fe8e8c9 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -1850,13 +1850,12 @@ bool WebAssemblyCFGStackify::fixCallUnwindMismatches(MachineFunction &MF) {
 
       // If the EH pad on the stack top is where this instruction should unwind
       // next, we're good.
-      MachineBasicBlock *UnwindDest = getFakeCallerBlock(MF);
+      MachineBasicBlock *UnwindDest = nullptr;
       for (auto *Succ : MBB.successors()) {
         // Even though semantically a BB can have multiple successors in case an
-        // exception is not caught by a catchpad, in our backend implementation
-        // it is guaranteed that a BB can have at most one EH pad successor. For
-        // details, refer to comments in findWasmUnwindDestinations function in
-        // SelectionDAGBuilder.cpp.
+        // exception is not caught by a catchpad, the first unwind destination
+        // should appear first in the successor list, based on the calculation
+        // in findUnwindDestinations() in SelectionDAGBuilder.cpp.
         if (Succ->isEHPad()) {
           UnwindDest = Succ;
           break;

diff  --git a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
index 550d8b64dca35..2d11019291a7a 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
@@ -659,9 +659,6 @@ static bool canLongjmp(const Value *Callee) {
   // Every catchpad generated by Wasm C++ contains __cxa_end_catch, so we
   // intentionally treat it as longjmpable to work around this problem. This is
   // a hacky fix but an easy one.
-  //
-  // The comment block in findWasmUnwindDestinations() in
-  // SelectionDAGBuilder.cpp is addressing a similar problem.
   if (CalleeName == "__cxa_end_catch")
     return WebAssembly::WasmEnableSjLj;
   if (CalleeName == "__cxa_begin_catch" ||

diff  --git a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
index 3fe45bcc4cd29..b4ffd185e3ca8 100644
--- a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
@@ -171,7 +171,7 @@ invoke.cont2:                                     ; preds = %ehcleanup
 
 terminate:                                        ; preds = %ehcleanup
   %6 = cleanuppad within %5 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 }
 
@@ -477,7 +477,7 @@ invoke.cont.i:                                    ; preds = %catch.start.i
 
 terminate.i:                                      ; preds = %catch.start.i, %catch.dispatch.i
   %6 = cleanuppad within %0 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 
 _ZN4TempD2Ev.exit:                                ; preds = %invoke.cont.i
@@ -501,6 +501,50 @@ unreachable:                                      ; preds = %entry
   unreachable
 }
 
+; Regression test for a bug where, when an invoke unwinds to a catchswitch, the
+; catchswitch's unwind destination was not included in the invoke's unwind
+; destination when there was no direct link from catch.start to there.
+
+; CHECK-LABEL: unwind_destinations:
+; CHECK: try
+; CHECK:   try
+; CHECK:     call  foo
+; CHECK:   catch  $0=, __cpp_exception
+; CHECK:     call  _ZSt9terminatev
+; CHECK:     unreachable
+; CHECK:   end_try
+; Note the below is "terminate" BB and should not be DCE'd
+; CHECK: catch_all
+; CHECK:   call  _ZSt9terminatev
+; CHECK:   unreachable
+; CHECK: end_try
+; CHECK: return
+define void @unwind_destinations() personality ptr @__gxx_wasm_personality_v0 {
+entry:
+  invoke void @foo()
+          to label %try.cont unwind label %catch.dispatch
+
+catch.dispatch:                                   ; preds = %entry
+  %0 = catchswitch within none [label %catch.start] unwind label %terminate
+
+catch.start:                                      ; preds = %catch.dispatch
+  %1 = catchpad within %0 [ptr null]
+  %2 = call ptr @llvm.wasm.get.exception(token %1)
+  %3 = call ptr @__cxa_begin_catch(ptr %2) #5 [ "funclet"(token %1) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %1) ]
+  unreachable
+
+; Even if there is no link from catch.start to this terminate BB, when there is
+; an exception that catch.start does not catch (e.g. a foreign exception), it
+; should end up here, so this BB should NOT be DCE'ed
+terminate:                                        ; preds = %catch.dispatch
+  %4 = cleanuppad within none []
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %4) ]
+  unreachable
+
+try.cont:                                         ; preds = %entry
+  ret void
+}
 
 declare void @foo()
 declare void @bar(ptr)
@@ -527,5 +571,6 @@ declare ptr @_ZN4TempD2Ev(ptr returned)
 
 attributes #0 = { nounwind }
 attributes #1 = { noreturn }
+attributes #2 = { noreturn nounwind }
 
 ; CHECK: __cpp_exception:

diff  --git a/llvm/test/CodeGen/WebAssembly/exception.ll b/llvm/test/CodeGen/WebAssembly/exception.ll
index 57d1f37c0039f..fc42f050a687e 100644
--- a/llvm/test/CodeGen/WebAssembly/exception.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception.ll
@@ -207,7 +207,7 @@ invoke.cont2:                                     ; preds = %ehcleanup
 
 terminate:                                        ; preds = %ehcleanup
   %6 = cleanuppad within %5 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 }
 
@@ -542,7 +542,7 @@ invoke.cont.i:                                    ; preds = %catch.start.i
 
 terminate.i:                                      ; preds = %catch.start.i, %catch.dispatch.i
   %6 = cleanuppad within %0 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 
 _ZN4TempD2Ev.exit:                                ; preds = %invoke.cont.i
@@ -593,6 +593,56 @@ try.cont:                                         ; preds = %catch, %entry
   ret void
 }
 
+; Regression test for a bug where, when an invoke unwinds to a catchswitch, the
+; catchswitch's unwind destination was not included in the invoke's unwind
+; destination when there was no direct link from catch.start to there.
+
+; CHECK-LABEL: unwind_destinations:
+; CHECK: block
+; CHECK:   block
+; CHECK:     try_table    (catch_all 0)
+; CHECK:       block     i32
+; CHECK:         try_table    (catch __cpp_exception 0)
+; CHECK:           call  foo
+; CHECK:         end_try_table
+; CHECK:         unreachable
+; CHECK:       end_block
+; CHECK:       call  _ZSt9terminatev
+; CHECK:       unreachable
+; CHECK:     end_try_table
+; CHECK:     unreachable
+; CHECK:   end_block
+; Note the below is "terminate" BB and should not be DCE'd
+; CHECK:   call  _ZSt9terminatev
+; CHECK:   unreachable
+; CHECK: end_block
+define void @unwind_destinations() personality ptr @__gxx_wasm_personality_v0 {
+entry:
+  invoke void @foo()
+          to label %try.cont unwind label %catch.dispatch
+
+catch.dispatch:                                   ; preds = %entry
+  %0 = catchswitch within none [label %catch.start] unwind label %terminate
+
+catch.start:                                      ; preds = %catch.dispatch
+  %1 = catchpad within %0 [ptr null]
+  %2 = call ptr @llvm.wasm.get.exception(token %1)
+  %3 = call ptr @__cxa_begin_catch(ptr %2) #5 [ "funclet"(token %1) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %1) ]
+  unreachable
+
+; Even if there is no link from catch.start to this terminate BB, when there is
+; an exception that catch.start does not catch (e.g. a foreign exception), it
+; should end up here, so this BB should NOT be DCE'ed
+terminate:                                        ; preds = %catch.dispatch
+  %4 = cleanuppad within none []
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %4) ]
+  unreachable
+
+try.cont:                                         ; preds = %entry
+  ret void
+}
+
 declare void @foo()
 declare void @bar(ptr)
 declare void @take_i32(i32)
@@ -618,5 +668,6 @@ declare ptr @_ZN4TempD2Ev(ptr returned)
 
 attributes #0 = { nounwind }
 attributes #1 = { noreturn }
+attributes #2 = { noreturn nounwind }
 
 ; CHECK: __cpp_exception:


        


More information about the llvm-commits mailing list