[llvm] [llvm-mca][RISC-V] Remove duplicated use of SP from `c.addi4spn` (PR #189980)

Wed Apr 1 08:36:06 PDT 2026

https://github.com/giorgio-marletta created https://github.com/llvm/llvm-project/pull/189980

The `c.addi4spn` instruction implicitly uses the X2 (SP) register, but
SP is currently modeled twice: once in the Uses list and once as an
explicit operand with the SP register class. This duplication causes
missed bypasses in llvm-mca when the instruction needs to read the SP
value written by a previous instruction.

For example, on a `sifive-u74` CPU, the following timeline excerpt
shows that the `c.addi4spn` is issued 2 cycles later than expected
with the GPR bypass:
```
Timeline view:
Index     012345678

[0,0]     DeeE .  .   mv	sp, a0
[0,1]     .  DeeE .   addi	a1, sp, 12
```

This patch removes SP from the Uses list, relying solely on the
explicit SP operand (as in `c.addi16sp`), which restores the expected
bypass behavior.
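
As a rough illustration of why a duplicated read can hide the bypass,
the toy model below gives each register read its own read advance and
lets the consumer dispatch only once its slowest read is satisfied.
This is a sketch, not llvm-mca's actual code: the name `dispatch_cycle`
is hypothetical, the latency of 3 is taken from the Instruction Info
table in the new test, and the advance of 2 is an assumption inferred
from the 2-cycle delay in the timeline above. That the implicit use
gets no advance is likewise inferred from the observed behavior.

```python
# Toy model of register-read bypassing (illustrative only).
WRITE_LATENCY = 3    # latency reported for these ALU ops in the test output
BYPASS_ADVANCE = 2   # assumed: inferred from the observed 2-cycle delay


def dispatch_cycle(producer_dispatch, read_advances):
    """Earliest dispatch cycle for a consumer of the producer's result.

    Each entry in read_advances is the number of cycles by which one
    read of the produced register may be advanced. The consumer must
    wait for its slowest read.
    """
    return max(
        producer_dispatch + max(WRITE_LATENCY - advance, 0)
        for advance in read_advances
    )


# Before the patch: explicit SP operand (bypassed) plus the implicit
# use from Uses = [X2] (assumed to get no advance).
before = dispatch_cycle(0, [BYPASS_ADVANCE, 0])

# After the patch: only the explicit SP operand remains.
after = dispatch_cycle(0, [BYPASS_ADVANCE])

print(before, after)  # prints: 3 1
```

Under these assumptions the model reproduces the dispatch columns seen
in the timelines: cycle 3 before the change and cycle 1 after it.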

A test is added that checks the same scenario for `c.addi16sp` as well,
since a similar issue may also occur there.

From 3d6113137ba72245c690e136a8f79f8330035111 Mon Sep 17 00:00:00 2001
From: Giorgio MARLETTA <giorgio.marletta at st.com>
Date: Wed, 1 Apr 2026 17:24:58 +0200
Subject: [PATCH] [llvm-mca][RISC-V] Remove duplicated use of SP from
 `c.addi4spn`

The `c.addi4spn` instruction implicitly uses the X2 (SP) register, but
SP is currently modeled twice: once in the Uses list and once as an
explicit operand with the SP register class. This duplication causes
missed bypasses in llvm-mca when the instruction needs to read the SP
value written by a previous instruction.

For example, on a `sifive-u74` CPU, the following timeline excerpt
shows that the `c.addi4spn` is issued 2 cycles later than expected
with the GPR bypass:
```
Timeline view:
Index     012345678

[0,0]     DeeE .  .   mv	sp, a0
[0,1]     .  DeeE .   addi	a1, sp, 12
```

This patch removes SP from the Uses list, relying solely on the
explicit SP operand (as in `c.addi16sp`), which restores the expected
bypass behavior.

A test is added that checks the same scenario for `c.addi16sp` as well,
since a similar issue may also occur there.
---
 llvm/lib/Target/RISCV/RISCVInstrInfoC.td      |  2 +-
 .../tools/llvm-mca/RISCV/SiFive7/sp-bypass.s  | 81 +++++++++++++++++++
 2 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/tools/llvm-mca/RISCV/SiFive7/sp-bypass.s

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoC.td b/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
index 8f76fa3b5bfd3..66648a8bea82f 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
@@ -266,7 +266,7 @@ class CA_ALU<bits<6> funct6, bits<2> funct2, string OpcodeStr>
 
 let Predicates = [HasStdExtZca] in {
 
-let hasSideEffects = 0, mayLoad = 0, mayStore = 0, Uses = [X2] in
+let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
 def C_ADDI4SPN : RVInst16CIW<0b000, OPC_C0, (outs GPRC:$rd),
                              (ins SP:$rs1, uimm10_lsb00nonzero:$imm),
                              "c.addi4spn", "$rd, $rs1, $imm">,
diff --git a/llvm/test/tools/llvm-mca/RISCV/SiFive7/sp-bypass.s b/llvm/test/tools/llvm-mca/RISCV/SiFive7/sp-bypass.s
new file mode 100644
index 0000000000000..c9a7edcec6789
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/RISCV/SiFive7/sp-bypass.s
@@ -0,0 +1,81 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=riscv64 -mcpu=sifive-u74 -timeline -iterations=1 < %s \
+# RUN:   | FileCheck %s
+
+# Check that bypasses with SP operand work correctly
+
+c.mv sp, a0
+c.addi4spn a1, sp, 12
+addi sp, t0, 16
+c.addi16sp sp, -80
+addi t0, sp, 20
+
+# CHECK:      Iterations:        1
+# CHECK-NEXT: Instructions:      5
+# CHECK-NEXT: Total Cycles:      7
+# CHECK-NEXT: Total uOps:        5
+
+# CHECK:      Dispatch Width:    2
+# CHECK-NEXT: uOps Per Cycle:    0.71
+# CHECK-NEXT: IPC:               0.71
+# CHECK-NEXT: Block RThroughput: 2.5
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects (U)
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      3     0.50                        mv	sp, a0
+# CHECK-NEXT:  1      3     0.50                        addi	a1, sp, 12
+# CHECK-NEXT:  1      3     0.50                        addi	sp, t0, 16
+# CHECK-NEXT:  1      3     0.50                        addi	sp, sp, -80
+# CHECK-NEXT:  1      3     0.50                        addi	t0, sp, 20
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - VLEN512SiFive7FDiv
+# CHECK-NEXT: [1]   - VLEN512SiFive7IDiv
+# CHECK-NEXT: [2]   - VLEN512SiFive7PipeA
+# CHECK-NEXT: [3]   - VLEN512SiFive7PipeB
+# CHECK-NEXT: [4]   - VLEN512SiFive7VA1
+# CHECK-NEXT: [5]   - VLEN512SiFive7VCQ
+# CHECK-NEXT: [6]   - VLEN512SiFive7VL
+# CHECK-NEXT: [7]   - VLEN512SiFive7VS
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]
+# CHECK-NEXT:  -      -     2.00   3.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    Instructions:
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     mv	sp, a0
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     addi	a1, sp, 12
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     addi	sp, t0, 16
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     addi	sp, sp, -80
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     addi	t0, sp, 20
+
+# CHECK:      Timeline view:
+# CHECK-NEXT: Index     0123456
+
+# CHECK:      [0,0]     DeeE ..   mv	sp, a0
+# CHECK-NEXT: [0,1]     .DeeE..   addi	a1, sp, 12
+# CHECK-NEXT: [0,2]     .DeeE..   addi	sp, t0, 16
+# CHECK-NEXT: [0,3]     . DeeE.   addi	sp, sp, -80
+# CHECK-NEXT: [0,4]     .  DeeE   addi	t0, sp, 20
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     1     0.0    0.0    0.0       mv	sp, a0
+# CHECK-NEXT: 1.     1     0.0    0.0    0.0       addi	a1, sp, 12
+# CHECK-NEXT: 2.     1     0.0    0.0    0.0       addi	sp, t0, 16
+# CHECK-NEXT: 3.     1     0.0    0.0    0.0       addi	sp, sp, -80
+# CHECK-NEXT: 4.     1     0.0    0.0    0.0       addi	t0, sp, 20
+# CHECK-NEXT:        1     0.0    0.0    0.0       <total>