[PATCH] D129735: [WIP][RISCV] Add new pass to transform undef to pesudo for vector values.
Piyou Chen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 20 23:13:35 PDT 2022
BeMg marked an inline comment as done and an inline comment as not done.
BeMg added a comment.
In D129735#3871632 <https://reviews.llvm.org/D129735#3871632>, @craig.topper wrote:
> Does this patch work for this test case
>
> define internal void @foo() {
> loopIR.preheader.i.i:
> %v15 = tail call <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()
> %v17 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %v15, i64 0)
> %vs12.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 1, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
> %v18 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs12.i.i.i, i64 0)
> %vs16.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 3, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
> %v20 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs16.i.i.i, i64 0)
> br label %loopIR3.i.i
>
> loopIR3.i.i: ; preds = %loopIR3.i.i, %loopIR.preheader.i.i
> %v37 = load <vscale x 8 x i8>, ptr addrspace(1) null, align 8
> %v38 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v17, i64 4)
> %v40 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v18, i64 4)
> %v42 = and <vscale x 8 x i8> %v38, %v40
> %v46 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v20, i64 4)
> %v60 = and <vscale x 8 x i8> %v42, %v46
> store <vscale x 8 x i8> %v60, ptr addrspace(1) null, align 4
> br label %loopIR3.i.i
> }
>
> declare <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()
>
> declare <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16>, <vscale x 1 x i16>, i64 immarg)
>
> declare <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8>, <vscale x 8 x i8>, <vscale x 8 x i16>, i64)
This IR doesn't generate the undef+early-clobber situation, so this pass will not work on it.
RA result will not break the early-clobber constraint in current compiler.
Corrsponding MachineInst before RA place below:
- MACHINEINSTRS **********
1. Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues
0B bb.0.loopIR.preheader.i.i:
successors: %bb.1(0x80000000); %bb.1(100.00%)
16B dead %16:gpr = PseudoVSETVLIX0 $x0, 206, implicit-def $vl, implicit-def $vtype
32B undef %0.sub_vrm1_0:vrm2 = PseudoVID_V_MF4 -1, 4, implicit $vl, implicit $vtype
64B undef %1.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 1, -1, 4, implicit $vl, implicit $vtype
96B undef %2.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 3, -1, 4, implicit $vl, implicit $vtype
128B bb.1.loopIR3.i.i:
; predecessors: %bb.0, %bb.1
successors: %bb.1(0x80000000); %bb.1(100.00%)
160B %10:vr = VL1RE8_V $x0 :: (load unknown-size from `ptr addrspace(1) null`, align 8, addrspace 1)
176B dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
192B early-clobber %11:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %0:vrm2, 4, 3, implicit $vl, implicit $vtype
208B early-clobber %12:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %1:vrm2, 4, 3, implicit $vl, implicit $vtype
224B dead %17:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
240B %13:vr = PseudoVAND_VV_M1 %11:vr, %12:vr, -1, 3, implicit $vl, implicit $vtype
256B dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
272B early-clobber %14:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %2:vrm2, 4, 3, implicit $vl, implicit $vtype
288B dead %18:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
304B %15:vr = PseudoVAND_VV_M1 %13:vr, %14:vr, -1, 3, implicit $vl, implicit $vtype
320B VS1R_V %15:vr, $x0 :: (store unknown-size into `ptr addrspace(1) null`, align 4, addrspace 1)
336B PseudoBR %bb.1
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129735/new/
https://reviews.llvm.org/D129735
More information about the llvm-commits
mailing list