[llvm] [WebAssembly] Split separate component LiveIntervals for TEEs (PR #131561)

Heejin Ahn via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 17 01:55:07 PDT 2025


https://github.com/aheejin updated https://github.com/llvm/llvm-project/pull/131561

>From a4817efce744960c723425221d4eb01d71741bef Mon Sep 17 00:00:00 2001
From: Heejin Ahn <aheejin at gmail.com>
Date: Sat, 15 Mar 2025 07:39:40 +0000
Subject: [PATCH] [WebAssembly] Split separate component LiveIntervals for TEEs

`MachineVerifier` requires that a virtual register's `LiveInterval` has
only one connected component:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/CodeGen/MachineVerifier.cpp#L3923-L3936
(This is different from SSA; there can be multiple defs for a register,
but those live intervals should somehow be _connected_. I am not very
sure why this rule exists, but this rule apparently has been there
forever since
https://github.com/llvm/llvm-project/commit/260fa289df07bed3b8531f20919596248b66d45f.)

---

And it turns out our `TEE` generation in RegStackify can create virtual
registers with `LiveInterval` with multiple disconnected segments.

This is how `TEE` generation works:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L613-L632

But it is possible that `Reg = INST` can also use `Reg`:
```
0   Reg = ...
    ...
1   Reg = INST ..., Reg, ...   // It both defs and uses 'Reg'
2   INST ..., Reg, ...
3   INST ..., Reg, ...
4   INST ..., Reg, ...
```
In this pseudocode, `Reg`'s live interval is `[0r,1r),[1r:4r)`, which
has two segments that are connected. But after the `TEE` transformation,
```
0   Reg = ...
    ...
1   DefReg = INST ..., Reg, ...
2   TeeReg, Reg = TEE_... DefReg
3   INST ..., TeeReg, ...
4   INST ..., Reg, ...
5   INST ..., Reg, ...
```
now `%0`'s live interval is `[0r,1r),[2r,4r)`, which are not connected
anymore.

In many cases, these split segments are connected by another segment
generated by a `PHI` (or a block start boundary that used to be a
`PHI`). For example, if there is a loop,
```
bb.0:
  successors: %bb.1
0   Reg = INST ...
    ...

bb.1:
  ; predecessors: %bb.1
    successors: %bb.1, %bb.2
1   DefReg = INST ..., Reg, ...
2   TeeReg, Reg = TEE_... DefReg
3   INST ..., TeeReg, ...
4   INST ..., Reg, ...
5   INST ..., Reg, ...
6   BR_IF bb.1

bb.2:
  ; predecessors: %bb.1
7   INST ...
```
The live interval would be `[0r,1B),[1B,1r),[2r,7B)` and these three
segments are classified as a single equivalence class because of the
segment `[1B,1r)` starting at the merging point (which used to be a
`PHI`) at the start of bb.1. But this kind of connection is not always
guaranteed. The method that determines whether all components are
connected, i.e., there is a single equivalence class in a `LiveRange`,
is `ConnectedVNInfoEqClasses::Classify`.

---

In RegStackify, there is a routine that splits live intervals into
multiple registers in some cases:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L511-L517
It looks this was copied from
https://github.com/llvm/llvm-project/blob/dc9a183ac6aa2d087ceac56970255b06c4772ca3/llvm/lib/CodeGen/RegisterCoalescer.cpp#L341-L353.

But `LiveIntervals::shrinkToUses` does not return true for all
unconnected live intervals. I don't understand all the details of that
function or why `RegisterCoalescer::shrinkToUses` was written that way,
but it looks it returns true when there are some dead components:
https://github.com/llvm/llvm-project/blob/926d980017d82dedb9eb50147a82fdfb01659f16/llvm/lib/CodeGen/LiveIntervals.cpp#L537-L540
And the case in the attached test case does not return true here, but
still has multiple unconnected components and does not pass
`MachineVerifier` unless they are split.

---

So this PR runs `LiveIntervals::splitSeparateComponents` regardless of
the return value of `LiveIntervals::shrinkToUses`.
`splitSeparateComponents` won't do anything unless there are multiple
unconnected components to split.

---

This is one of the bugs reported in #126916.
---
 .../WebAssembly/WebAssemblyRegStackify.cpp    |  9 ++--
 .../WebAssembly/tee-live-intervals.mir        | 41 +++++++++++++++++++
 2 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir

diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
index 428d573668fb4..5ff3d5f760204 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
@@ -510,10 +510,11 @@ static unsigned getTeeOpcode(const TargetRegisterClass *RC) {
 
 // Shrink LI to its uses, cleaning up LI.
 static void shrinkToUses(LiveInterval &LI, LiveIntervals &LIS) {
-  if (LIS.shrinkToUses(&LI)) {
-    SmallVector<LiveInterval *, 4> SplitLIs;
-    LIS.splitSeparateComponents(LI, SplitLIs);
-  }
+  LIS.shrinkToUses(&LI);
+  // In case the register's live interval now has multiple unconnected
+  // components, split them into multiple registers.
+  SmallVector<LiveInterval *, 4> SplitLIs;
+  LIS.splitSeparateComponents(LI, SplitLIs);
 }
 
 /// A single-use def in the same block with no intervening memory or register
diff --git a/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir b/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir
new file mode 100644
index 0000000000000..b137e990932b0
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir
@@ -0,0 +1,41 @@
+# RUN: llc -mtriple=wasm32-unknown-unknown -run-pass wasm-reg-stackify -verify-machineinstrs %s -o -
+
+# TEE generation in RegStackify can create virtual registers with LiveIntervals
+# with multiple disconnected segments, which is invalid in MachineVerifier. In
+# this test, '%0 = CALL @foo' will become a CALL and a TEE, which creates
+# unconnected split segments. This should be later split into multiple
+# registers. This test should not crash with -verify-machineinstrs, which checks
+# whether all segments within a register is connected. See ??? for the detailed
+# explanation.
+
+--- |
+  target triple = "wasm32-unknown-unknown"
+
+  declare ptr @foo(ptr returned)
+  define void @tee_live_intervals_test() {
+    ret void
+  }
+...
+---
+name: tee_live_intervals_test
+liveins:
+  - { reg: '$arguments' }
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $arguments
+    successors: %bb.1, %bb.2
+    %0:i32 = ARGUMENT_i32 0, implicit $arguments
+    %1:i32 = CONST_I32 0, implicit-def dead $arguments
+    BR_IF %bb.2, %1:i32, implicit-def dead $arguments
+
+  bb.1:
+  ; predecessors: %bb.0
+    %0:i32 = CALL @foo, %0:i32, implicit-def dead $arguments, implicit $sp32, implicit $sp64
+    STORE8_I32_A32 0, 0, %0:i32, %1:i32, implicit-def dead $arguments
+    RETURN %0:i32, implicit-def dead $arguments
+
+  bb.2:
+  ; predecessors: %bb.0
+    %2:i32 = CONST_I32 0, implicit-def dead $arguments
+    RETURN %2:i32, implicit-def dead $arguments



More information about the llvm-commits mailing list