[PATCH] D12765: [LV] Allow vectorization of loops with induction post-inc expressions

Michael Zolotukhin via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 23 15:56:36 PDT 2015


mzolotukhin added a comment.

Hi everyone,

Just to give an update on this: I tried the patch I posted before, but I'm not yet satisfied with it. The problem seems to be that we now detect similar expressions, but we don't reuse them. For the record, here is the test I'm playing with:

  target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
  
  ; Function Attrs: noinline nounwind ssp uwtable
  define i8* @foo(i32 %y, i8* noalias %src, i8* noalias %dst) {
  entry:
    %cmp = icmp slt i32 %y, 4096
    %sub = add nsw i32 %y, -1
    %tripcount = select i1 %cmp, i32 %sub, i32 4095
    %loop.entry.cond = icmp sgt i32 %tripcount, 0
    br i1 %loop.entry.cond, label %loop.ph, label %loop.exit
  
  loop.ph:                                   ; preds = %entry
    br label %loop.body
  
  loop.body:                                         ; preds = %loop.body, %loop.ph
    %iv = phi i32 [ 0, %loop.ph ], [ %iv.next, %loop.body ]
    %src.iv = phi i8* [ %src, %loop.ph ], [ %src.iv.next, %loop.body ]
    %dst.iv = phi i8* [ %dst, %loop.ph ], [ %dst.iv.next, %loop.body ]
  
    %tmp = load i8, i8* %src.iv, align 1
    store i8 %tmp, i8* %dst.iv, align 1
  
    %src.iv.next = getelementptr inbounds i8, i8* %src.iv, i64 1
    %dst.iv.next = getelementptr inbounds i8, i8* %dst.iv, i64 1
    %iv.next = add nsw i32 %iv, 1
  
    %loop.cond = icmp slt i32 %iv.next, %tripcount
    br i1 %loop.cond, label %loop.body, label %loop.exit
  
  loop.exit:                                 ; preds = %loop.body, %entry
    %src.iv.lcssa = phi i8* [ %src.iv.next, %loop.body ], [ %src, %entry ]
    ret i8* undef
  }
  
  !llvm.module.flags = !{!0, !1}
  !llvm.ident = !{!2}
  
  !0 = !{i32 2, !"Debug Info Version", i32 3}
  !1 = !{i32 1, !"PIC Level", i32 2}
  !2 = !{!"clang version 3.8.0 (trunk 247767) (llvm/trunk 247769)"}

What is expected from indvars on this test is to rewrite `%src.iv.lcssa` with a loop-invariant value. The patch does that, but it fills the loop preheader with code similar to what we already have in the `entry` basic block, which seems unnecessary. My plan is to investigate this further and hopefully fix it.
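
For illustration, here is a hand-written sketch of the kind of result described above, where the exit value is built from the existing `%tripcount` instead of recomputing the `icmp`/`add`/`select` sequence in the preheader. This is not actual indvars output from the patch. The names `%tripcount.ext` and `%src.exit` are made up for the example, and the `zext` assumes the loop guard (`%tripcount > 0`) holds on entry to `loop.ph`:

  define i8* @foo(i32 %y, i8* noalias %src, i8* noalias %dst) {
  entry:
    %cmp = icmp slt i32 %y, 4096
    %sub = add nsw i32 %y, -1
    %tripcount = select i1 %cmp, i32 %sub, i32 4095
    %loop.entry.cond = icmp sgt i32 %tripcount, 0
    br i1 %loop.entry.cond, label %loop.ph, label %loop.exit
  
  loop.ph:                                   ; preds = %entry
    ; Hypothetical: reuse %tripcount from entry rather than re-deriving it here.
    ; The loop body runs %tripcount times, so %src.iv.next exits as %src + %tripcount.
    %tripcount.ext = zext i32 %tripcount to i64
    %src.exit = getelementptr i8, i8* %src, i64 %tripcount.ext
    br label %loop.body
  
  loop.body:                                 ; preds = %loop.body, %loop.ph
    %iv = phi i32 [ 0, %loop.ph ], [ %iv.next, %loop.body ]
    %src.iv = phi i8* [ %src, %loop.ph ], [ %src.iv.next, %loop.body ]
    %dst.iv = phi i8* [ %dst, %loop.ph ], [ %dst.iv.next, %loop.body ]
  
    %tmp = load i8, i8* %src.iv, align 1
    store i8 %tmp, i8* %dst.iv, align 1
  
    %src.iv.next = getelementptr inbounds i8, i8* %src.iv, i64 1
    %dst.iv.next = getelementptr inbounds i8, i8* %dst.iv, i64 1
    %iv.next = add nsw i32 %iv, 1
  
    %loop.cond = icmp slt i32 %iv.next, %tripcount
    br i1 %loop.cond, label %loop.body, label %loop.exit
  
  loop.exit:                                 ; preds = %loop.body, %entry
    ; The incoming value from the loop is now loop-invariant.
    %src.iv.lcssa = phi i8* [ %src.exit, %loop.body ], [ %src, %entry ]
    ret i8* undef
  }

Here only two instructions are added to `loop.ph`, and nothing from `entry` is duplicated.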

Michael


Repository:
  rL LLVM

http://reviews.llvm.org/D12765




