[PATCH] D115261: [LV] Disable runtime unrolling for vectorized loops.
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 3 06:49:45 PST 2023
lebedev.ri accepted this revision.
lebedev.ri added a comment.
Hmm.
LV: IC is 4
LV: VF is 8
LV: Interleaving to saturate store or load ports.
LV: Minimum required TC for runtime checks to be profitable:0
LV: Found a vectorizable loop (8) in <stdin>
LV: Interleave Count is 4
LEV: Epilogue vectorization is not profitable for this loop
Executing best plan with VF=8, UF=4
LV: vectorizing VPBB:vector.ph in BB:vector.ph
LV: filled BB:
vector.ph: ; preds = %.lr.ph.preheader
%n.mod.vf = urem i64 %3, 32
%n.vec = sub i64 %3, %n.mod.vf
br label %middle.block
LV: VPBlock in RPO vector.body
LV: created vector.body
LV: draw edge fromvector.ph
LV: vectorizing VPBB:vector.body in BB:vector.body
LV: filled BB:
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i64 [ 0, %vector.ph ]
%4 = add i64 %index, 0
%5 = add i64 %index, 8
%6 = add i64 %index, 16
%7 = add i64 %index, 24
%8 = getelementptr inbounds i32, ptr %0, i64 %4
%9 = getelementptr inbounds i32, ptr %0, i64 %5
%10 = getelementptr inbounds i32, ptr %0, i64 %6
%11 = getelementptr inbounds i32, ptr %0, i64 %7
%12 = getelementptr inbounds i32, ptr %8, i32 0
%wide.load = load <8 x i32>, ptr %12, align 4, !tbaa !5
%13 = getelementptr inbounds i32, ptr %8, i32 8
%wide.load7 = load <8 x i32>, ptr %13, align 4, !tbaa !5
%14 = getelementptr inbounds i32, ptr %8, i32 16
%wide.load8 = load <8 x i32>, ptr %14, align 4, !tbaa !5
%15 = getelementptr inbounds i32, ptr %8, i32 24
%wide.load9 = load <8 x i32>, ptr %15, align 4, !tbaa !5
%16 = mul nsw <8 x i32> %wide.load, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
%17 = mul nsw <8 x i32> %wide.load7, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
%18 = mul nsw <8 x i32> %wide.load8, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
%19 = mul nsw <8 x i32> %wide.load9, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
%20 = getelementptr inbounds i32, ptr %8, i32 0
store <8 x i32> %16, ptr %20, align 4, !tbaa !5
%21 = getelementptr inbounds i32, ptr %8, i32 8
store <8 x i32> %17, ptr %21, align 4, !tbaa !5
%22 = getelementptr inbounds i32, ptr %8, i32 16
store <8 x i32> %18, ptr %22, align 4, !tbaa !5
%23 = getelementptr inbounds i32, ptr %8, i32 24
store <8 x i32> %19, ptr %23, align 4, !tbaa !5
%index.next = add nuw i64 %index, 32
%24 = icmp eq i64 %index.next, %n.vec
br i1 %24, <null operand!>, label %vector.body
LV: vectorizing VPBB:middle.block in BB:middle.block
So I *was* thinking of something else.
It's possible that LV's unroll heuristic
may need further tuning, but in general
please proceed with this.
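For anyone reading the log above, here is a minimal C++ sketch of the arithmetic in vector.ph and the shape of vector.body, assuming VF=8 and UF=4 as reported. The names (splitTripCount, scaleBy42, TripSplit) are hypothetical and are not LLVM APIs. The scalar remainder loop is also an assumption, since the log excerpt stops at middle.block.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Values reported in the log: "Executing best plan with VF=8, UF=4".
constexpr std::uint64_t VF = 8;
constexpr std::uint64_t UF = 4;
// 32: the urem divisor in vector.ph and the %index increment in vector.body.
constexpr std::uint64_t Step = VF * UF;

// Hypothetical helper type, not part of LLVM.
struct TripSplit {
  std::uint64_t NVec;   // models %n.vec
  std::uint64_t NModVF; // models %n.mod.vf
};

// Mirrors vector.ph:
//   %n.mod.vf = urem i64 %3, 32
//   %n.vec    = sub i64 %3, %n.mod.vf
TripSplit splitTripCount(std::uint64_t TC) {
  std::uint64_t Rem = TC % Step;
  return {TC - Rem, Rem};
}

// Scalar model of the loop in the log: multiply each i32 by 42 in place.
void scaleBy42(std::vector<std::int32_t> &A) {
  TripSplit S = splitTripCount(A.size());
  assert(S.NVec % Step == 0 && S.NVec + S.NModVF == A.size());

  std::uint64_t I = 0;
  // Models vector.body: each trip covers UF parts of VF lanes (32 elements).
  for (; I != S.NVec; I += Step)
    for (std::uint64_t Part = 0; Part < UF; ++Part)
      for (std::uint64_t Lane = 0; Lane < VF; ++Lane)
        A[I + Part * VF + Lane] *= 42;

  // Assumed scalar remainder: runs fewer than Step (32) iterations.
  for (; I < A.size(); ++I)
    A[I] *= 42;
}
```

With a trip count of 100, this split gives 96 iterations to the vector loop (three trips of 32) and leaves 4 for the remainder.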
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115261/new/
https://reviews.llvm.org/D115261