<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/90559>90559</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[RISCV] Miscompile with exact VLEN/vscale and memset
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:RISC-V,
miscompilation,
llvm:SelectionDAG
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
lukel97
</td>
</tr>
</table>
<pre>
After #70452 I noticed a miscompile on 502.gcc_r from SPEC CPU 2017 when compiling for rv64gv with `-mrvv-vector-bits=zvl`. A minimal reproducer involves a memset in a function with an exact vscale_range (minimum equal to maximum):
```llvm
define void @foo(ptr %p) vscale_range(2,2) {
%q = getelementptr inbounds i8, ptr %p, i64 84
%x = load i32, ptr %q
call void @llvm.memset.p0.i64(ptr %p, i8 0, i64 96, i1 false)
store i32 %x, ptr %q
ret void
}
```
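For reference, here is a rough C-level sketch of what the reproducer is expected to do. It is not part of the original test case: `foo_preserves_word` and the 0xAB fill pattern are hypothetical names and values chosen for illustration. The word at offset 84 must survive the memset, because it is loaded before and stored back after:
```c
#include <stdint.h>
#include <string.h>

/* C-level equivalent of @foo: load 4 bytes at offset 84, zero the first
   96 bytes, then write the loaded value back. */
void foo(unsigned char *p) {
    unsigned char *q = p + 84;
    uint32_t x;
    memcpy(&x, q, sizeof x);   /* %x = load i32, ptr %q */
    memset(p, 0, 96);          /* llvm.memset(ptr %p, i8 0, i64 96) */
    memcpy(q, &x, sizeof x);   /* store i32 %x, ptr %q */
}

/* Hypothetical check: returns 1 if bytes 84..87 keep their old value
   and every other byte in the 96-byte buffer is zero. */
int foo_preserves_word(void) {
    unsigned char buf[96];
    memset(buf, 0xAB, sizeof buf);
    foo(buf);
    for (size_t i = 0; i < sizeof buf; i++) {
        unsigned char want = (i >= 84 && i < 88) ? 0xAB : 0x00;
        if (buf[i] != want)
            return 0;
    }
    return 1;
}
```
If the load and store are dropped, as in the miscompiled output, bytes 84..87 end up zero and a check like this would fail.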
The previous codegen keeps the `lw` and `sw` pair around the memset:
```asm
foo:
lw a1, 84(a0)
addi a2, a0, 80
vsetivli zero, 16, e8, m1, ta, ma
vmv.v.i v8, 0
vs1r.v v8, (a2)
vsetvli a2, zero, e8, m4, ta, ma
vmv.v.i v12, 0
vs4r.v v12, (a0)
addi a2, a0, 64
vs1r.v v8, (a2)
sw a1, 84(a0)
ret
```
After #70452 the `lw` and `sw` seem to be mistakenly detected as dead and are omitted:
```asm
foo:
vsetvli a1, zero, e8, m4, ta, ma
vmv.v.i v8, 0
vs4r.v v8, (a0)
addi a1, a0, 80
vsetivli zero, 16, e8, m1, ta, ma
vmv.v.i v8, 0
vs1r.v v8, (a1)
addi a0, a0, 64
vs1r.v v8, (a0)
ret
```
The offending transform seems to happen in the post-legalize DAG combine:
```diff
Legalized selection DAG: %bb.0 'foo:'
SelectionDAG has 28 nodes:
t0: ch,glue = EntryToken
@@ -9,32 +8,29 @@
t58: v16i8 = extract_subvector t57, Constant:i64<0>
t49: nxv8i8 = insert_subvector undef:nxv8i8, t58, Constant:i64<0>
t26: i64 = add t2, Constant:i64<80>
- t50: ch = store<(store (s128) into %ir.p + 80, align 1)> t44:1, t49, t26, undef:i64
+ t50: ch = store<(store (<vscale x 1 x s128>) into %ir.p + 80, align 1)> t44:1, t49, t26, undef:i64
t46: ch = store<(store (s32) into %ir.q), trunc to i32> t50, t44, t4, undef:i64
t60: nxv32i8 = RISCVISD::VMV_V_X_VL undef:nxv32i8, OpaqueConstant:i64<0>, Register:i64 $x0
t61: v64i8 = extract_subvector t60, Constant:i64<0>
t53: nxv32i8 = insert_subvector undef:nxv32i8, t61, Constant:i64<0>
- t54: ch = store<(store (s512) into %ir.p, align 1)> t0, t53, t2, undef:i64
+ t54: ch = store<(store (<vscale x 1 x s512>) into %ir.p, align 1)> t0, t53, t2, undef:i64
t23: i64 = add t2, Constant:i64<64>
- t51: ch = store<(store (s128) into %ir.p + 64, align 1)> t0, t49, t23, undef:i64
+ t51: ch = store<(store (<vscale x 1 x s128>) into %ir.p + 64, align 1)> t0, t49, t23, undef:i64
t43: ch = TokenFactor t46, t54, t51
t30: ch = RISCVISD::RET_GLUE t43
Optimized legalized selection DAG: %bb.0 'foo:'
-SelectionDAG has 23 nodes:
+SelectionDAG has 19 nodes:
t0: ch,glue = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %0
- t4: i64 = add nuw t2, Constant:i64<84>
- t44: i64,ch = load<(load (s32) from %ir.q), sext from i32> t0, t4, undef:i64
t57: nxv8i8 = RISCVISD::VMV_V_X_VL undef:nxv8i8, OpaqueConstant:i64<0>, Register:i64 $x0
- t26: i64 = add t2, Constant:i64<80>
- t50: ch = store<(store (s128) into %ir.p + 80, align 1)> t44:1, t57, t26, undef:i64
- t46: ch = store<(store (s32) into %ir.q), trunc to i32> t50, t44, t4, undef:i64
t60: nxv32i8 = RISCVISD::VMV_V_X_VL undef:nxv32i8, OpaqueConstant:i64<0>, Register:i64 $x0
- t54: ch = store<(store (s512) into %ir.p, align 1)> t0, t60, t2, undef:i64
+ t54: ch = store<(store (<vscale x 1 x s512>) into %ir.p, align 1)> t0, t60, t2, undef:i64
t23: i64 = add t2, Constant:i64<64>
- t51: ch = store<(store (s128) into %ir.p + 64, align 1)> t0, t57, t23, undef:i64
- t43: ch = TokenFactor t46, t54, t51
- t30: ch = RISCVISD::RET_GLUE t43
+ t51: ch = store<(store (<vscale x 1 x s128>) into %ir.p + 64, align 1)> t0, t57, t23, undef:i64
+ t26: i64 = add t2, Constant:i64<80>
+ t62: ch = store<(store (<vscale x 1 x s128>) into %ir.p + 80, align 1)> t0, t57, t26, undef:i64
+ t68: ch = TokenFactor t54, t51, t62
+ t30: ch = RISCVISD::RET_GLUE t68
```
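To make the byte arithmetic explicit, here is a small sketch of the ranges involved. It only restates the offsets and sizes visible in the DAG dump (s512 at %p, s128 at %p + 64, s128 at %p + 80, and the s32 at %ir.q = %p + 84); it does not model what the combiner actually computes, and the `range` type and helper names are made up for illustration:
```c
#include <stddef.h>

/* Half-open byte range [begin, end), relative to %p. */
struct range { size_t begin, end; };

/* The three vector stores from the DAG, with sizes for VLEN=128:
   s512 at %p, s128 at %p + 64, s128 at %p + 80. */
static const struct range memset_stores[] = { {0, 64}, {64, 80}, {80, 96} };

/* Two half-open ranges overlap if each starts before the other ends. */
int ranges_overlap(struct range a, struct range b) {
    return a.begin < b.end && b.begin < a.end;
}

/* Index of the first memset store that overlaps r, or -1 if none does. */
int overlapping_store(struct range r) {
    size_t n = sizeof memset_stores / sizeof memset_stores[0];
    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(memset_stores[i], r))
            return (int)i;
    }
    return -1;
}
```
The scalar store covers [84, 88), which falls inside the vector store at %p + 80. That matches the chain in the legalized DAG, where t46 is ordered after t50, so the scalar store writes last and cannot be dead.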
As an aside, the generated code for the memset is a bit strange. We should be able to do it with a single LMUL=8 vse8.v with a VL of 96.
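A quick sanity check of that claim, as a sketch: on RISC-V, vscale is VLEN / 64, so vscale_range(2,2) means VLEN=128, and VLMAX = VLEN * LMUL / SEW. The helper names below are hypothetical and only integer LMUL is handled:
```c
/* VLEN in bits for a given vscale (RISC-V uses 64 bits per vscale unit). */
unsigned vlen_from_vscale(unsigned vscale) {
    return vscale * 64;
}

/* VLMAX = VLEN * LMUL / SEW, for integer LMUL only. */
unsigned vlmax(unsigned vlen_bits, unsigned sew_bits, unsigned lmul) {
    return vlen_bits * lmul / sew_bits;
}
```
With VLEN=128, SEW=8 and LMUL=8 this gives VLMAX=128, so a VL of 96 fits in one LMUL=8 register group.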
</pre>