[llvm] [WebAssembly] Split separate component LiveIntervals for TEEs (PR #131561)
Heejin Ahn via llvm-commits
llvm-commits at lists.llvm.org
Sun Mar 16 21:58:58 PDT 2025
https://github.com/aheejin created https://github.com/llvm/llvm-project/pull/131561
`MachineVerifier` requires that a virtual register's `LiveInterval` has only one connected component:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/CodeGen/MachineVerifier.cpp#L3923-L3936 (This is different from SSA; there can be multiple defs for a register, but those live intervals should somehow be _connected_. I am not very sure why this rule exists, but this rule apparently has been there forever since
https://github.com/llvm/llvm-project/commit/260fa289df07bed3b8531f20919596248b66d45f.)
---
And it turns out our `TEE` generation in RegStackify can create virtual registers with `LiveInterval` with multiple disconnected segments.
This is how `TEE` generation works:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L613-L632
But it is possible that `Reg = INST` can also use `Reg`:
```
0 Reg = ...
...
1 Reg = INST ..., Reg, ... // It both defs and uses 'Reg'
2 INST ..., Reg, ...
3 INST ..., Reg, ...
4 INST ..., Reg, ...
```
In this pseudocode, `Reg`'s live interval is `[0r,1r),[1r:4r)`, which has two segments that are connected. But after the `TEE` transformation,
```
0 Reg = ...
...
1 DefReg = INST ..., Reg, ...
2 TeeReg, Reg = TEE_... DefReg
3 INST ..., TeeReg, ...
4 INST ..., Reg, ...
5 INST ..., Reg, ...
```
now `%0`'s live interval is `[0r,1r),[2r,4r)`, which are not connected anymore.
In many cases, these split segments are connected by another segment generated by a `PHI` (or a block start boundary that used to be a `PHI`). For example, if there is a loop,
```
bb.0:
successors: %bb.1
0 Reg = INST ...
...
bb.1:
; predecessors: %bb.1
successors: %bb.1, %bb.2
1 DefReg = INST ..., Reg, ...
2 TeeReg, Reg = TEE_... DefReg
3 INST ..., TeeReg, ...
4 INST ..., Reg, ...
5 INST ..., Reg, ...
6 BR_IF bb.1
bb.2:
; predecessors: %bb.1
7 INST ...
```
The live interval would be `[0r,1B),[1B,1r),[2r,7B)` and these three segments are classified as a single equivalence class because of the segment `[1B,1r)` starting at the merging point (which used to be a `PHI`) at the start of bb.1. But this kind of connection is not always guaranteed. The method that determines whether all components are connected, i.e., there is a single equivalence class in a `LiveRange`, is `[ConnectedVNInfoEqClasses::Classify`](https://github.com/llvm/llvm-project/blob/de03e102d1ea4da1c62b9ad735848d8869d08b44/llvm/lib/CodeGen/LiveInterval.cpp#L1329-L1369).
---
In RegStackify, there is a routine that splits live intervals into multiple registers in some cases:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L511-L517 It looks this was copied from
https://github.com/llvm/llvm-project/blob/dc9a183ac6aa2d087ceac56970255b06c4772ca3/llvm/lib/CodeGen/RegisterCoalescer.cpp#L341-L353.
But `LiveIntervals::shrinkToUses` does not return true for all unconnected live intervals. I don't understand all the details of that function or why `RegisterCoalescer::shrinkToUses` was written that way, but it looks it returns true when there are some dead components: https://github.com/llvm/llvm-project/blob/926d980017d82dedb9eb50147a82fdfb01659f16/llvm/lib/CodeGen/LiveIntervals.cpp#L537-L540 And the case in the attached test case does not return true here, but still has multiple unconnected components and does not pass `MachineVerifier` unless they are split.
---
So this PR runs `LiveIntervals::splitSeparateComponents` regardless of the return value of `LiveIntervals::shrinkToUses`. `splitSeparateComponents` [won't do anything](https://github.com/llvm/llvm-project/blob/dc9a183ac6aa2d087ceac56970255b06c4772ca3/llvm/lib/CodeGen/LiveIntervals.cpp#L1806-L1807) unless there are multiple unconnected components to split.
---
This is one of the bugs reported in #126916.
>From a4817efce744960c723425221d4eb01d71741bef Mon Sep 17 00:00:00 2001
From: Heejin Ahn <aheejin at gmail.com>
Date: Sat, 15 Mar 2025 07:39:40 +0000
Subject: [PATCH] [WebAssembly] Split separate component LiveIntervals for TEEs
`MachineVerifier` requires that a virtual register's `LiveInterval` has
only one connected component:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/CodeGen/MachineVerifier.cpp#L3923-L3936
(This is different from SSA; there can be multiple defs for a register,
but those live intervals should somehow be _connected_. I am not very
sure why this rule exists, but this rule apparently has been there
forever since
https://github.com/llvm/llvm-project/commit/260fa289df07bed3b8531f20919596248b66d45f.)
---
And it turns out our `TEE` generation in RegStackify can create virtual
registers with `LiveInterval` with multiple disconnected segments.
This is how `TEE` generation works:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L613-L632
But it is possible that `Reg = INST` can also use `Reg`:
```
0 Reg = ...
...
1 Reg = INST ..., Reg, ... // It both defs and uses 'Reg'
2 INST ..., Reg, ...
3 INST ..., Reg, ...
4 INST ..., Reg, ...
```
In this pseudocode, `Reg`'s live interval is `[0r,1r),[1r:4r)`, which
has two segments that are connected. But after the `TEE` transformation,
```
0 Reg = ...
...
1 DefReg = INST ..., Reg, ...
2 TeeReg, Reg = TEE_... DefReg
3 INST ..., TeeReg, ...
4 INST ..., Reg, ...
5 INST ..., Reg, ...
```
now `%0`'s live interval is `[0r,1r),[2r,4r)`, which are not connected
anymore.
In many cases, these split segments are connected by another segment
generated by a `PHI` (or a block start boundary that used to be a
`PHI`). For example, if there is a loop,
```
bb.0:
successors: %bb.1
0 Reg = INST ...
...
bb.1:
; predecessors: %bb.1
successors: %bb.1, %bb.2
1 DefReg = INST ..., Reg, ...
2 TeeReg, Reg = TEE_... DefReg
3 INST ..., TeeReg, ...
4 INST ..., Reg, ...
5 INST ..., Reg, ...
6 BR_IF bb.1
bb.2:
; predecessors: %bb.1
7 INST ...
```
The live interval would be `[0r,1B),[1B,1r),[2r,7B)` and these three
segments are classified as a single equivalence class because of the
segment `[1B,1r)` starting at the merging point (which used to be a
`PHI`) at the start of bb.1. But this kind of connection is not always
guaranteed. The method that determines whether all components are
connected, i.e., there is a single equivalence class in a `LiveRange`,
is `ConnectedVNInfoEqClasses::Classify`.
---
In RegStackify, there is a routine that splits live intervals into
multiple registers in some cases:
https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L511-L517
It looks this was copied from
https://github.com/llvm/llvm-project/blob/dc9a183ac6aa2d087ceac56970255b06c4772ca3/llvm/lib/CodeGen/RegisterCoalescer.cpp#L341-L353.
But `LiveIntervals::shrinkToUses` does not return true for all
unconnected live intervals. I don't understand all the details of that
function or why `RegisterCoalescer::shrinkToUses` was written that way,
but it looks it returns true when there are some dead components:
https://github.com/llvm/llvm-project/blob/926d980017d82dedb9eb50147a82fdfb01659f16/llvm/lib/CodeGen/LiveIntervals.cpp#L537-L540
And the case in the attached test case does not return true here, but
still has multiple unconnected components and does not pass
`MachineVerifier` unless they are split.
---
So this PR runs `LiveIntervals::splitSeparateComponents` regardless of
the return value of `LiveIntervals::shrinkToUses`.
`splitSeparateComponents` won't do anything unless there are multiple
unconnected components to split.
---
This is one of the bugs reported in #126916.
---
.../WebAssembly/WebAssemblyRegStackify.cpp | 9 ++--
.../WebAssembly/tee-live-intervals.mir | 41 +++++++++++++++++++
2 files changed, 46 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
index 428d573668fb4..5ff3d5f760204 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp
@@ -510,10 +510,11 @@ static unsigned getTeeOpcode(const TargetRegisterClass *RC) {
// Shrink LI to its uses, cleaning up LI.
static void shrinkToUses(LiveInterval &LI, LiveIntervals &LIS) {
- if (LIS.shrinkToUses(&LI)) {
- SmallVector<LiveInterval *, 4> SplitLIs;
- LIS.splitSeparateComponents(LI, SplitLIs);
- }
+ LIS.shrinkToUses(&LI);
+ // In case the register's live interval now has multiple unconnected
+ // components, split them into multiple registers.
+ SmallVector<LiveInterval *, 4> SplitLIs;
+ LIS.splitSeparateComponents(LI, SplitLIs);
}
/// A single-use def in the same block with no intervening memory or register
diff --git a/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir b/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir
new file mode 100644
index 0000000000000..b137e990932b0
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/tee-live-intervals.mir
@@ -0,0 +1,41 @@
+# RUN: llc -mtriple=wasm32-unknown-unknown -run-pass wasm-reg-stackify -verify-machineinstrs %s -o -
+
+# TEE generation in RegStackify can create virtual registers with LiveIntervals
+# with multiple disconnected segments, which is invalid in MachineVerifier. In
+# this test, '%0 = CALL @foo' will become a CALL and a TEE, which creates
+# unconnected split segments. This should be later split into multiple
+# registers. This test should not crash with -verify-machineinstrs, which checks
+# whether all segments within a register is connected. See ??? for the detailed
+# explanation.
+
+--- |
+ target triple = "wasm32-unknown-unknown"
+
+ declare ptr @foo(ptr returned)
+ define void @tee_live_intervals_test() {
+ ret void
+ }
+...
+---
+name: tee_live_intervals_test
+liveins:
+ - { reg: '$arguments' }
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $arguments
+ successors: %bb.1, %bb.2
+ %0:i32 = ARGUMENT_i32 0, implicit $arguments
+ %1:i32 = CONST_I32 0, implicit-def dead $arguments
+ BR_IF %bb.2, %1:i32, implicit-def dead $arguments
+
+ bb.1:
+ ; predecessors: %bb.0
+ %0:i32 = CALL @foo, %0:i32, implicit-def dead $arguments, implicit $sp32, implicit $sp64
+ STORE8_I32_A32 0, 0, %0:i32, %1:i32, implicit-def dead $arguments
+ RETURN %0:i32, implicit-def dead $arguments
+
+ bb.2:
+ ; predecessors: %bb.0
+ %2:i32 = CONST_I32 0, implicit-def dead $arguments
+ RETURN %2:i32, implicit-def dead $arguments
More information about the llvm-commits
mailing list