[PATCH] D130442: [RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics.
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 2 12:24:58 PDT 2022
craig.topper added inline comments.
================
Comment at: llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:2610
+ CurDAG->getMachineNode(MaskedOpc, DL, True->getVTList(), Ops);
+ ReplaceUses(N, Result);
+
----------------
craig.topper wrote:
> This does not handle the chain output of True correctly if it has users. We would need to replace True.getValue(1) with N->getValue(1). ReplaceUses will only replace the direct users of N.
Simple test case:
```
define void @vpmerge_vpload_store(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> * %p, <vscale x 2 x i1> %m, i32 zeroext %vl) {
; CHECK-LABEL: vpmerge_vpload_store:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a1, e32, m1, tu, mu
; CHECK-NEXT: vle32.v v8, (a0), v0.t
; CHECK-NEXT: vs1r.v v8, (a0)
; CHECK-NEXT: ret
%splat = insertelement <vscale x 2 x i1> poison, i1 -1, i32 0
%mask = shufflevector <vscale x 2 x i1> %splat, <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer
%a = call <vscale x 2 x i32> @llvm.vp.load.nxv2i32.p0nxv2i32(<vscale x 2 x i32> * %p, <vscale x 2 x i1> %mask, i32 %vl)
%b = call <vscale x 2 x i32> @llvm.vp.merge.nxv2i32(<vscale x 2 x i1> %m, <vscale x 2 x i32> %a, <vscale x 2 x i32> %passthru, i32 %vl)
store <vscale x 2 x i32> %b, <vscale x 2 x i32> * %p
ret void
}
```
Right after isel, the MachineIR is:
```
# Machine code for function vpmerge_vpload_store: IsSSA, TracksLiveness
Function Live Ins: $v8 in %0, $x10 in %1, $v0 in %2, $x11 in %3
bb.0 (%ir-block.0):
liveins: $v8, $x10, $v0, $x11
%3:gprnox0 = COPY $x11
%2:vr = COPY $v0
%1:gpr = COPY $x10
%0:vrnov0 = COPY $v8
%4:vr = PseudoVLE32_V_M1 %1:gpr, %3:gprnox0, 5 :: (load unknown-size from %ir.p, align 8)
$v0 = COPY %2:vr
%5:vrnov0 = PseudoVLE32_V_M1_MASK %0:vrnov0(tied-def 0), %1:gpr, $v0, %3:gprnox0, 5, 0
VS1R_V killed %5:vrnov0, %1:gpr :: (store unknown-size into %ir.p, align 8)
PseudoRET
# End machine code for function vpmerge_vpload_store.
```
Notice the two VLEs: the unmasked PseudoVLE32_V_M1 is still present alongside the masked PseudoVLE32_V_M1_MASK. Dead code elimination will eventually delete the extra one, but it shouldn't have to. In a more complex test, we might put loads and stores in the wrong order in the MachineIR.
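To make the chain problem concrete, here is a toy model of the situation. It is not LLVM code: `Node`, `Value`, `replaceAllUses`, `countUsers`, and `usersOfOldLoadAfterFold` are hypothetical names invented for this sketch. It assumes that the store's chain operand is the chain result (value 1) of the unmasked load, and that the fix is to redirect that use to the chain result of the new masked node.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for SelectionDAG concepts; none of these are LLVM types.
struct Node;

// One particular result of a node: here 0 is the data value, 1 is the chain.
struct Value {
  Node *Def;
  unsigned ResNo;
  bool operator==(const Value &O) const {
    return Def == O.Def && ResNo == O.ResNo;
  }
};

struct Node {
  std::string Name;
  std::vector<Value> Operands;
};

// Rewrite every operand in the graph that refers to From so it refers to To.
void replaceAllUses(std::vector<Node *> &Graph, Value From, Value To) {
  for (Node *N : Graph)
    for (Value &Op : N->Operands)
      if (Op == From)
        Op = To;
}

// Count operands in the graph that still refer to any result of Def.
std::size_t countUsers(const std::vector<Node *> &Graph, const Node *Def) {
  std::size_t Count = 0;
  for (const Node *N : Graph)
    for (const Value &Op : N->Operands)
      if (Op.Def == Def)
        ++Count;
  return Count;
}

// Build load -> merge -> store, fold the merge into a masked load, and
// return how many users the original (unmasked) load still has.
std::size_t usersOfOldLoadAfterFold(bool AlsoReplaceChain) {
  Node Load{"load", {}};
  Node Merge{"merge", {Value{&Load, 0}}};
  // The store consumes the merged data and the load's chain (assumption).
  Node Store{"store", {Value{&Merge, 0}, Value{&Load, 1}}};
  Node MaskedLoad{"masked_load", {}};
  std::vector<Node *> Graph{&Load, &Merge, &Store, &MaskedLoad};

  // Redirect users of the merge's data result to the new masked node.
  replaceAllUses(Graph, Value{&Merge, 0}, Value{&MaskedLoad, 0});
  if (AlsoReplaceChain)
    replaceAllUses(Graph, Value{&Load, 1}, Value{&MaskedLoad, 1});

  // The merge has no users left, so drop it (index 1) as dead.
  assert(countUsers(Graph, &Merge) == 0);
  Graph.erase(Graph.begin() + 1);

  return countUsers(Graph, &Load);
}
```

With `AlsoReplaceChain` set to false, the store still refers to the old load's chain, so the old load stays alive next to the masked one, much like the two VLEs above. With it set to true, the old load has no remaining users.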
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130442/new/
https://reviews.llvm.org/D130442