[PATCH] D130442: [RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics.
Yeting Kuo via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 3 00:42:15 PDT 2022
fakepaper56 marked 2 inline comments as done.
fakepaper56 added inline comments.
================
Comment at: llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:2610
+ CurDAG->getMachineNode(MaskedOpc, DL, True->getVTList(), Ops);
+ ReplaceUses(N, Result);
+
----------------
craig.topper wrote:
> craig.topper wrote:
> > This does not handle the chain output of True correctly if it has users. We would need to replace True.getValue(1) with N->getValue(1). ReplaceUses will only replace the direct users of N.
> Simple test case
>
>
> ```
> define void @vpmerge_vpload_store(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> * %p, <vscale x 2 x i1> %m, i32 zeroext %vl) {
> ; CHECK-LABEL: vpmerge_vpload_store:
> ; CHECK: # %bb.0:
> ; CHECK-NEXT: vsetvli zero, a1, e32, m1, tu, mu
> ; CHECK-NEXT: vle32.v v8, (a0), v0.t
> ; CHECK-NEXT: vs1r.v v8, (a0)
> ; CHECK-NEXT: ret
> %splat = insertelement <vscale x 2 x i1> poison, i1 -1, i32 0
> %mask = shufflevector <vscale x 2 x i1> %splat, <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer
> %a = call <vscale x 2 x i32> @llvm.vp.load.nxv2i32.p0nxv2i32(<vscale x 2 x i32> * %p, <vscale x 2 x i1> %mask, i32 %vl)
> %b = call <vscale x 2 x i32> @llvm.vp.merge.nxv2i32(<vscale x 2 x i1> %m, <vscale x 2 x i32> %a, <vscale x 2 x i32> %passthru, i32 %vl)
> store <vscale x 2 x i32> %b, <vscale x 2 x i32> * %p
> ret void
> }
> ```
>
> Right after isel the MachineIR is
>
> ```
> # Machine code for function vpmerge_vpload_store: IsSSA, TracksLiveness
> Function Live Ins: $v8 in %0, $x10 in %1, $v0 in %2, $x11 in %3
>
> bb.0 (%ir-block.0):
> liveins: $v8, $x10, $v0, $x11
> %3:gprnox0 = COPY $x11
> %2:vr = COPY $v0
> %1:gpr = COPY $x10
> %0:vrnov0 = COPY $v8
> %4:vr = PseudoVLE32_V_M1 %1:gpr, %3:gprnox0, 5 :: (load unknown-size from %ir.p, align 8)
> $v0 = COPY %2:vr
> %5:vrnov0 = PseudoVLE32_V_M1_MASK %0:vrnov0(tied-def 0), %1:gpr, $v0, %3:gprnox0, 5, 0
> VS1R_V killed %5:vrnov0, %1:gpr :: (store unknown-size into %ir.p, align 8)
> PseudoRET
>
> # End machine code for function vpmerge_vpload_store.
> ```
>
> Notice the two VLEs. Dead code elimination will eventually delete the extra one, but it shouldn't have to. For a more complex test we might put loads and stores in the wrong order in the MachineIR.
Done. Thank you for finding the bug.
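
For readers following the thread, here is a minimal sketch of the chain problem described in the quoted comment. It is a toy model, not the SelectionDAG API: `Node`, `Value`, `replaceUses`, `hasUsers`, and `oldLoadSurvivesFold` are all hypothetical names invented for illustration. The unmasked load has two results (0 = data, 1 = chain), the merge consumes the data, and the store consumes the merge result plus the load's chain. Folding the merge into a masked load and redirecting only the merge's users leaves the store chained to the old load, so the old load stays alive.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Node;

// A (node, result number) pair, loosely analogous to a DAG value. Toy type only.
struct Value {
  Node *node;
  unsigned resNo;
  bool operator==(const Value &o) const {
    return node == o.node && resNo == o.resNo;
  }
};

// A toy DAG node: a label plus the values it consumes.
struct Node {
  const char *name;
  std::vector<Value> operands;
};

// Redirect every operand equal to `from` so that it refers to `to` instead.
void replaceUses(std::vector<Node *> &dag, Value from, Value to) {
  for (Node *n : dag)
    for (Value &op : n->operands)
      if (op == from)
        op = to;
}

// True if any node still in the DAG consumes any result of `def`.
bool hasUsers(const std::vector<Node *> &dag, const Node *def) {
  for (const Node *n : dag)
    for (const Value &op : n->operands)
      if (op.node == def)
        return true;
  return false;
}

// Models the fold. Returns true if the old unmasked load still has users
// afterwards, i.e. it cannot be dropped and two loads remain.
bool oldLoadSurvivesFold(bool redirectChain) {
  Node load{"vle32 (unmasked)", {}};                       // results: 0 = data, 1 = chain
  Node merge{"vmerge", {Value{&load, 0}}};                 // consumes the loaded data
  Node store{"vs1r", {Value{&merge, 0}, Value{&load, 1}}}; // data + chain
  Node maskedLoad{"vle32 (masked)", {}};                   // replacement node
  std::vector<Node *> dag{&load, &merge, &store, &maskedLoad};

  // Step 1: users of the merge result now use the masked load's data.
  replaceUses(dag, Value{&merge, 0}, Value{&maskedLoad, 0});
  // The merge is now dead; drop it from the toy DAG.
  dag.erase(std::remove(dag.begin(), dag.end(), &merge), dag.end());

  // Step 2 (the missing piece): users of the old load's chain must be
  // redirected to the new node's chain as well.
  if (redirectChain)
    replaceUses(dag, Value{&load, 1}, Value{&maskedLoad, 1});

  return hasUsers(dag, &load);
}
```

In this model, `oldLoadSurvivesFold(false)` reports that the old load is still referenced through its chain, which corresponds to the two VLEs in the MachineIR dump above, while `oldLoadSurvivesFold(true)` leaves it without users.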
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130442/new/
https://reviews.llvm.org/D130442