[llvm] ea88384 - [RISCV][InsertVSETVLI] Remove redundant vsetvli by coalescing blocks from bottom up (#141298)

via llvm-commits llvm-commits at lists.llvm.org
Tue May 27 12:15:40 PDT 2025


Author: Min-Yih Hsu
Date: 2025-05-27T12:15:37-07:00
New Revision: ea8838446678a1163b361b0598b1259e9f476900

URL: https://github.com/llvm/llvm-project/commit/ea8838446678a1163b361b0598b1259e9f476900
DIFF: https://github.com/llvm/llvm-project/commit/ea8838446678a1163b361b0598b1259e9f476900.diff

LOG: [RISCV][InsertVSETVLI] Remove redundant vsetvli by coalescing blocks from bottom up (#141298)

I ran into a relatively rare case in RISCVInsertVSETVLIPass, where right
after the `emitVSETVLI` phase but before the `coalesceVSETVLIs` phase,
we have two blocks that look like this:
```
bb.first:
  %46:gprnox0 = PseudoVSETIVLI %30:gprnox0, 199 /* e8, mf2, ta, ma */, implicit-def $vl, implicit-def $vtype
  %76:gpr = PseudoVSETVLIX0 killed $x0, ..., implicit-def $vl, implicit-def $vtype
  $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
...
bb.second:
  $x0 = PseudoVSETVLI %46, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  $v10 = PseudoVMV_S_X undef $v10(tied-def 0), undef %53:gpr, $noreg, 5, implicit $vl, implicit $vtype
  $x0 = PseudoVSETVLI %30, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  $v8 = PseudoVREDSUM_VS_M2_E32 undef $v8(tied-def 0), killed $v8m2, killed $v10, $noreg, 5, 0, implicit $vl, implicit $vtype
```

After the `coalesceVSETVLIs` phase, it turns into:
``` diff
bb.first:
-  %46:gprnox0 = PseudoVSETIVLI %30:gprnox0, 199 /* e8, mf2, ta, ma */, implicit-def $vl, implicit-def $vtype
+  dead %46:gprnox0 = PseudoVSETIVLI %30:gprnox0, 199 /* e8, mf2, ta, ma */, implicit-def $vl, implicit-def $vtype
  %76:gpr = PseudoVSETVLIX0 killed $x0, ..., implicit-def $vl, implicit-def $vtype
  $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
...
bb.second:
-  $x0 = PseudoVSETVLI %46, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+  $x0 = PseudoVSETVLI %30, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  $v10 = PseudoVMV_S_X undef $v10(tied-def 0), undef %53:gpr, $noreg, 5, implicit $vl, implicit $vtype
- $x0 = PseudoVSETVLI %30, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
  $v8 = PseudoVREDSUM_VS_M2_E32 undef $v8(tied-def 0), killed $v8m2, killed $v10, $noreg, 5, 0, implicit $vl, implicit $vtype
```
We forwarded `%30` to any use of `%46` and further reduced the number of
VSETVLI we need in `bb.second`. But the problem is, if `bb.first` is
processed before `bb.second` -- which is the majority of the cases --
then we're not able to remove the vsetvli which defines the now-dead
`%46` in `bb.first` after coalescing `bb.second`.

This will produce assembly code like this:
```
        vsetvli zero, s0, e8, mf2, ta, ma
        vsetvli a0, zero, e32, m2, ta, ma
        vmv.v.i v10, 0
```

This patch fixes this issue by coalescing the blocks from bottom up such
that we can account for dead VSETVLI in the earlier blocks after its
uses are eliminated in later blocks.

---------

Co-authored-by: Luke Lau <luke at igalia.com>

Added: 
    llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-coalesce.mir

Modified: 
    llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index 2d79ced1cc163..8fe8dfabee297 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -26,6 +26,7 @@
 
 #include "RISCV.h"
 #include "RISCVSubtarget.h"
+#include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/CodeGen/LiveDebugVariables.h"
 #include "llvm/CodeGen/LiveIntervals.h"
@@ -1840,8 +1841,11 @@ bool RISCVInsertVSETVLI::runOnMachineFunction(MachineFunction &MF) {
   // any cross block analysis within the dataflow.  We can't have both
   // demanded fields based mutation and non-local analysis in the
   // dataflow at the same time without introducing inconsistencies.
-  for (MachineBasicBlock &MBB : MF)
-    coalesceVSETVLIs(MBB);
+  // We're visiting blocks from the bottom up because a VSETVLI in the
+  // earlier block might become dead when its uses in later blocks are
+  // optimized away.
+  for (MachineBasicBlock *MBB : post_order(&MF))
+    coalesceVSETVLIs(*MBB);
 
   // Insert PseudoReadVL after VLEFF/VLSEGFF and replace it with the vl output
   // of VLEFF/VLSEGFF.

diff  --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-coalesce.mir b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-coalesce.mir
new file mode 100644
index 0000000000000..6f052d6377cf6
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-coalesce.mir
@@ -0,0 +1,69 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=riscv64 -mattr=+v -run-pass=liveintervals,riscv-insert-vsetvli %s -o - | FileCheck %s
+
+---
+name:            coalesce
+tracksRegLiveness: true
+noPhis:          true
+body:             |
+  ; CHECK-LABEL: name: coalesce
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[DEF:%[0-9]+]]:gprnox0 = IMPLICIT_DEF
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   dead [[PseudoVSETVLIX0_:%[0-9]+]]:gpr = PseudoVSETVLIX0 killed $x0, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+  ; CHECK-NEXT:   renamable $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x04000000), %bb.2(0x7c000000)
+  ; CHECK-NEXT:   liveins: $v10m2, $v12m2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   BEQ undef %2:gpr, $x0, %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   successors: %bb.1(0x7c000000), %bb.4(0x04000000)
+  ; CHECK-NEXT:   liveins: $v8m2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $x0 = PseudoVSETVLI [[DEF]], 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+  ; CHECK-NEXT:   renamable $v10 = PseudoVMV_S_X undef renamable $v10, undef %2:gpr, $noreg, 5 /* e32 */, implicit $vl, implicit $vtype
+  ; CHECK-NEXT:   dead renamable $v8 = PseudoVREDSUM_VS_M2_E32 undef renamable $v8, killed undef renamable $v8m2, killed undef renamable $v10, $noreg, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
+  ; CHECK-NEXT:   BNE undef %3:gpr, $x0, %bb.1
+  ; CHECK-NEXT:   PseudoBR %bb.4
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.4:
+  ; CHECK-NEXT:   PseudoRET
+  bb.0:
+    successors: %bb.1(0x80000000)
+
+    %78:gprnox0 = IMPLICIT_DEF
+
+  bb.1:
+    successors: %bb.2(0x80000000)
+
+    %46:gprnox0 = PseudoVSETVLI %78, 199 /* e8, mf2, ta, ma */, implicit-def dead $vl, implicit-def dead $vtype
+    renamable $v10m2 = PseudoVMV_V_I_M2 undef renamable $v10m2, 0, -1, 5 /* e32 */, 0 /* tu, mu */
+
+  bb.2:
+    successors: %bb.3(0x04000000), %bb.2(0x7c000000)
+    liveins: $v10m2, $v12m2
+
+    BEQ undef %54:gpr, $x0, %bb.2
+    PseudoBR %bb.3
+
+  bb.3:
+    successors: %bb.1(0x7c000000), %bb.4(0x04000000)
+    liveins: $v8m2
+
+    renamable $v10 = PseudoVMV_S_X undef renamable $v10, undef %54:gpr, %46, 5 /* e32 */
+    dead renamable $v8 = PseudoVREDSUM_VS_M2_E32 undef renamable $v8, killed undef renamable $v8m2, killed undef renamable $v10, %46, 5 /* e32 */, 0 /* tu, mu */
+    BNE undef %29:gpr, $x0, %bb.1
+    PseudoBR %bb.4
+
+  bb.4:
+    PseudoRET
+...


        


More information about the llvm-commits mailing list