[LLVMbugs] [Bug 20850] New: should be able to simplify vector loop induction variables

Thu Sep 4 15:57:24 PDT 2014

http://llvm.org/bugs/show_bug.cgi?id=20850

            Bug ID: 20850
           Summary: should be able to simplify vector loop induction
                    variables
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: richard-llvm at metafoo.co.uk
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Consider this testcase from PR20849:

#include <stdio.h>

int main() {
    int i;
    unsigned char in[1000];

    for (i = 0; i < 1000; i++)
        in[i] = i % 127;

    for (i = 0; i < 1000; i++)
        printf("%d\n", in[i]);
    return 0;
}

The first loop produces the following IR at -O3:

entry:
  %in = alloca [1000 x i8], align 16
  %0 = getelementptr inbounds [1000 x i8]* %in, i64 0, i64 0
  call void @llvm.lifetime.start(i64 1000, i8* %0) #1
  br label %vector.body

vector.body:                                      ; preds = %vector.body, %entry
  %index = phi i64 [ 0, %entry ], [ %index.next, %vector.body ]
  %1 = trunc i64 %index to i32
  %broadcast.splatinsert22 = insertelement <4 x i32> undef, i32 %1, i32 0
  %broadcast.splat23 = shufflevector <4 x i32> %broadcast.splatinsert22, <4 x i32> undef, <4 x i32> zeroinitializer
  %induction24 = add <4 x i32> %broadcast.splat23, <i32 0, i32 1, i32 2, i32 3>
  %induction25 = add <4 x i32> %broadcast.splat23, <i32 4, i32 5, i32 6, i32 7>
  %2 = srem <4 x i32> %induction24, <i32 127, i32 127, i32 127, i32 127>
  %3 = srem <4 x i32> %induction25, <i32 127, i32 127, i32 127, i32 127>
  %4 = trunc <4 x i32> %2 to <4 x i8>
  %5 = trunc <4 x i32> %3 to <4 x i8>
  %6 = getelementptr inbounds [1000 x i8]* %in, i64 0, i64 %index
  %7 = bitcast i8* %6 to <4 x i8>*
  store <4 x i8> %4, <4 x i8>* %7, align 8, !tbaa !1
  %.sum26 = or i64 %index, 4
  %8 = getelementptr [1000 x i8]* %in, i64 0, i64 %.sum26
  %9 = bitcast i8* %8 to <4 x i8>*
  store <4 x i8> %5, <4 x i8>* %9, align 4, !tbaa !1
  %index.next = add i64 %index, 8
  %10 = icmp eq i64 %index.next, 1000
  br i1 %10, label %for.body3.preheader, label %vector.body, !llvm.loop !4

This is dumb: the vector induction values are rebuilt from the scalar index on
every iteration. Instead of a broadcast and two adds in each iteration of the
loop, we should rewrite the loop to start at <i32 0, i32 1, i32 2, i32 3> and
<i32 4, i32 5, i32 6, i32 7>, and add 8 to each element on each iteration.
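To make the requested rewrite concrete, here is a rough scalar model in C of the
two loop shapes. It is only a sketch: the lanes of each <4 x i32> are modelled
as array elements, and the names fill_rebroadcast, fill_vector_iv, forms_agree,
N, VF and UF are made up for illustration, not taken from LLVM.

```c
#include <stdint.h>
#include <string.h>

/* Trip count, vector width and unroll factor as seen in the IR above. */
enum { N = 1000, VF = 4, UF = 2 };

/* Current shape: every iteration truncates the scalar index, splats it,
 * and adds the constant lane offsets <0,1,2,3> and <4,5,6,7>. */
static void fill_rebroadcast(unsigned char *out) {
    for (int64_t index = 0; index < N; index += VF * UF) {
        int32_t base = (int32_t)index;                /* trunc i64 -> i32 */
        for (int part = 0; part < UF; part++) {
            for (int lane = 0; lane < VF; lane++) {
                int32_t iv = base + part * VF + lane; /* splat + offsets */
                out[index + part * VF + lane] = (unsigned char)(iv % 127);
            }
        }
    }
}

/* Requested shape: the vectors themselves are loop-carried. They start at
 * <0,1,2,3> and <4,5,6,7>, and each lane is bumped by 8 per iteration. */
static void fill_vector_iv(unsigned char *out) {
    int32_t iv[UF][VF];
    for (int part = 0; part < UF; part++)
        for (int lane = 0; lane < VF; lane++)
            iv[part][lane] = part * VF + lane;

    for (int64_t index = 0; index < N; index += VF * UF) {
        for (int part = 0; part < UF; part++) {
            for (int lane = 0; lane < VF; lane++) {
                out[index + part * VF + lane] =
                    (unsigned char)(iv[part][lane] % 127);
                iv[part][lane] += VF * UF;            /* one vector add */
            }
        }
    }
}

/* Returns 1 if both shapes fill the buffer identically, 0 otherwise. */
int forms_agree(void) {
    unsigned char a[N], b[N];
    fill_rebroadcast(a);
    fill_vector_iv(b);
    return memcmp(a, b, N) == 0;
}
```

In the second shape the per-iteration work on the induction values is a single
vector add per unrolled part, with no trunc, insertelement or shufflevector
inside the loop.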
