[llvm] [RISCV] RISC-V split register allocation and move vsetvl pass in between (PR #70549)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 13 01:00:49 PST 2023
lukel97 wrote:
One small regression I noticed with the post-RA vsetvli insertion pass is in this function from `test/CodeGen/RISCV/rvv/fixed-vectors-mask-buildvec.ll`:
```llvm
define <4 x i1> @buildvec_mask_nonconst_v4i1(i1 %x, i1 %y) {
%1 = insertelement <4 x i1> poison, i1 %x, i32 0
%2 = insertelement <4 x i1> %1, i1 %x, i32 1
%3 = insertelement <4 x i1> %2, i1 %y, i32 2
%4 = insertelement <4 x i1> %3, i1 %y, i32 3
ret <4 x i1> %4
}
```
With pre-RA vsetvli insertion we have:
```asm
buildvec_mask_nonconst_v4i1: # @buildvec_mask_nonconst_v4i1
.cfi_startproc
# %bb.0:
vsetivli zero, 4, e8, mf4, ta, ma
vmv.v.i v0, 3
vmv.v.x v8, a1
vmerge.vxm v8, v8, a0, v0
vand.vi v8, v8, 1
vmsne.vi v0, v8, 0
ret
```
But post-RA insertion results in an extra vsetvli:
```asm
buildvec_mask_nonconst_v4i1: # @buildvec_mask_nonconst_v4i1
.cfi_startproc
# %bb.0:
vsetivli zero, 1, e8, mf8, ta, ma
vmv.v.i v0, 3
vsetivli zero, 4, e8, mf4, ta, ma
vmv.v.x v8, a1
vmerge.vxm v8, v8, a0, v0
vand.vi v8, v8, 1
vmsne.vi v0, v8, 0
ret
```
From what I can tell this is due to the vmv.v.i getting scheduled before the vmv.v.x, i.e. the beginning of the BB goes from this:
```
bb.0 (%ir-block.0):
liveins: $x10, $x11
%1:gpr = COPY $x11
%0:gpr = COPY $x10
%2:vr = PseudoVMV_V_X_MF4 $noreg(tied-def 0), %1:gpr, 4, 3, 0
%3:vr = PseudoVMV_V_I_MF8 $noreg(tied-def 0), 3, 1, 3, 0
```
to this:
```
0B bb.0 (%ir-block.0):
liveins: $x10, $x11
16B %1:gpr = COPY $x11
32B %0:gpr = COPY $x10
64B renamable $v0 = PseudoVMV_V_I_MF8 undef renamable $v0(tied-def 0), 3, 1, 3, 0
80B renamable $v8 = PseudoVMV_V_X_MF4 undef renamable $v8(tied-def 0), %1:gpr, 4, 3, 0
```
We end up with the extra vsetvli because of an optimisation in needVSETVLI that avoids inserting a vsetvli when the instruction is a vmv.v.i with VL=1. needVSETVLI is in turn called from transferBefore, so previously the check fired after the first instruction, on the vmv.v.x -> vmv.v.i transition. But because the vmv.v.i is now scheduled first, the transition we check is vmv.v.i -> vmv.v.x, where the optimisation doesn't apply.
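To make the order-sensitivity concrete, here's a small self-contained model of the check; this is not the actual RISCVInsertVSETVLI code, and the struct and helper names below are invented for illustration. The idea is that a VL=1 splat with an undefined passthru only demands a non-zero VL (and a large-enough SEW), so it can reuse whatever vtype the previous instruction established, but the reverse transition into the vmv.v.x gets no such relaxation:
```cpp
// Toy model (not the real pass) of why the check is order-sensitive.
#include <cstdio>

struct VTypeState {
  unsigned VL;          // e.g. 4
  unsigned SEW;         // e.g. 8
  unsigned LMULEighths; // LMUL in eighths: mf8 -> 1, mf4 -> 2
};

struct Demand {
  bool VLExact = true;        // needs this exact VL
  bool VLNonZeroOnly = false; // only needs VL != 0
  bool SEWExact = true;
  bool LMULExact = true;
};

// A VL=1 splat (vmv.v.i/vmv.v.x with AVL immediate 1 and an undefined merge
// operand) only writes element 0, so it only demands a non-zero VL; the real
// pass also requires SEW >= its own SEW and that LMUL doesn't grow, which the
// toy model glosses over.
Demand getDemand(bool IsScalarSplat, unsigned AVL) {
  Demand D;
  if (IsScalarSplat && AVL == 1) {
    D.VLExact = false;
    D.VLNonZeroOnly = true;
    D.SEWExact = false;
    D.LMULExact = false;
  }
  return D;
}

// Does the next instruction need a fresh vsetvli given the current state?
bool needVSETVLI(const VTypeState &Cur, const VTypeState &Req, const Demand &D) {
  if (D.VLExact && Cur.VL != Req.VL) return true;
  if (D.VLNonZeroOnly && Cur.VL == 0) return true;
  if (D.SEWExact && Cur.SEW != Req.SEW) return true;
  if (D.LMULExact && Cur.LMULEighths != Req.LMULEighths) return true;
  return false;
}

int main() {
  VTypeState SplatState{/*VL=*/1, /*SEW=*/8, /*LMULEighths=*/1}; // vmv.v.i v0, 3
  VTypeState VecState{/*VL=*/4, /*SEW=*/8, /*LMULEighths=*/2};   // vmv.v.x v8, a1

  // Original order: vmv.v.x first, then the VL=1 splat.  The splat's demands
  // are relaxed, so it can reuse the VL=4/e8/mf4 state -> one vsetvli.
  printf("vmv.v.x -> vmv.v.i needs vsetvli: %d\n",
         needVSETVLI(VecState, SplatState, getDemand(true, 1)));

  // Scheduled order: the splat comes first, so the transition checked is
  // splat-state -> vmv.v.x, whose demands are *not* relaxed -> extra vsetvli.
  printf("vmv.v.i -> vmv.v.x needs vsetvli: %d\n",
         needVSETVLI(SplatState, VecState, getDemand(false, 4)));
  return 0;
}
```
In the model the first query returns false (no vsetvli needed) and the second returns true, matching the one- vs. two-vsetvli outputs above.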
We could probably recover this case by teaching the backwards local postpass about this.
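Concretely (still in terms of the toy model above, not the real local postpass code), that would mean letting the backwards walk apply the same relaxed demands when deciding whether the earlier vsetvli can take on the later one's configuration so the later one can be deleted; a hypothetical helper might look like:
```cpp
// Hypothetical sketch reusing the toy types above, not the upstream pass.
// Ask whether the instructions between two vsetvlis (which originally
// required state `Between`, with accumulated demands `UsedBetween`) would be
// happy to run under the later vsetvli's state instead.  If so, the earlier
// vsetvli can be rewritten to the later configuration and the later one
// removed, recovering the single-vsetvli sequence from the pre-RA output.
bool canCoalesceBackwards(const VTypeState &Later, const VTypeState &Between,
                          const Demand &UsedBetween) {
  return !needVSETVLI(Later, Between, UsedBetween);
}
```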
https://github.com/llvm/llvm-project/pull/70549