# [llvm-dev] Canonicalize induction variables

Michael Kruse via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 31 07:10:17 PDT 2016

```Hi again,

I looked into the code of IndVarSimplify but found no code that
changes the start value of the induction variable. It only modifies
the exit test.

However, the file's comment gives an example:

//   1. The exit condition for the loop is canonicalized to compare the
//      induction value against the exit value.  This turns loops like:
//        'for (i = 7; i*i < 1000; ++i)' into 'for (i = 0; i != 25; ++i)'

I ran the example and it turns the loop into
for (i = 7; i != 32; ++i)

The part that makes induction variables start at zero may have been removed.

Michael

2016-08-25 22:30 GMT+02:00 Ehsan Amiri <ehsanamiri at gmail.com>:
> But even for a very simple loop:
>
> int test1 (int *x, int *y, int *z, int k) {
>   int sum = 0;
>   for (int i = 10; i < k; i++) {
>     z[i] = x[i] / y[i];
>   }
>   return sum;
> }
>
> The initial value of induction variable is not zero after compiling with -O3
> -mcpu=power8 x.cpp -S -c -emit-llvm -fno-unroll-loops (see bottom of the
> email for IR)
>
> Also I can write somewhat more complicated loop where step size is a
> constant > 1, and the conditions are so that IV will not overflow:
>
> int test2 (int *x, int *y, int *z, int k) {
>   int sum = 0;
>   for (int i = 10; i < k && i < k-5; i+=5) {
>     z[i] = x[i] / y[i];
>   }
>   return sum;
> }
>
> again this is not canonicalized in the above sense (see IR at the end of the
> email). Maybe this condition is too complicated?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> IR for test1
>
>
> for.body:                                         ; preds =
>   %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 10,
>   %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv
>   %0 = load i32, i32* %arrayidx, align 4, !tbaa !1
>   %arrayidx2 = getelementptr inbounds i32, i32* %y, i64 %indvars.iv
>   %1 = load i32, i32* %arrayidx2, align 4, !tbaa !1
>   %div = sdiv i32 %0, %1
>   %arrayidx4 = getelementptr inbounds i32, i32* %z, i64 %indvars.iv
>   store i32 %div, i32* %arrayidx4, align 4, !tbaa !1
>   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>   %exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
>
> IR for test2
>
> for.body:                                         ; preds =
>   %indvars.iv = phi i64 [ 10, %for.body.preheader ], [ %indvars.iv.next,
> %for.body ]
>   %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv
>   %2 = load i32, i32* %arrayidx, align 4, !tbaa !1
>   %arrayidx3 = getelementptr inbounds i32, i32* %y, i64 %indvars.iv
>   %3 = load i32, i32* %arrayidx3, align 4, !tbaa !1
>   %div = sdiv i32 %2, %3
>   %arrayidx5 = getelementptr inbounds i32, i32* %z, i64 %indvars.iv
>   store i32 %div, i32* %arrayidx5, align 4, !tbaa !1
>   %indvars.iv.next = add nuw i64 %indvars.iv, 5
>   %cmp = icmp slt i64 %indvars.iv.next, %1
>   %cmp1 = icmp slt i64 %indvars.iv.next, %0
>   %or.cond = and i1 %cmp, %cmp1
>   br i1 %or.cond, label %for.body, label %for.cond.cleanup.loopexit
>
>
>
> On Thu, Aug 25, 2016 at 4:02 PM, Michael Kruse via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Not sure whether these are the actual reasons, but to explain the
>> difficulties with those loops.
>>
>> 2016-08-25 3:48 GMT+02:00 Yaoqing Gao via llvm-dev
>> <llvm-dev at lists.llvm.org>:
>> > I just subscribed this group.  This is my first time to post a question
>> > (not
>> > sure if this is a right place for discussion) after I have a brief look
>> > at
>> > LLVM OPT (dev trunk).   I would expect loop simplification and induction
>> > variable canonicalization pass (IndVarSimplify pass) should be able to
>> > convert the following loops into a simple canonical form, i.e., there is
>> > a
>> > canonical induction variable which starts at zero and steps by one,
>> > getCanonicalInductionVariable() returns  the first PHI node in the loop
>> >
>> > int test1 (int x[], int k, int s) {
>> >   int sum = 0;
>> >   for (int i = 0; i < k; i+=s) {
>> >     sum += x[i];
>> >   }
>> >   return sum;
>> > }
>>
>> s could be zero making this an endless loop (C has some rules saying
>> that it can assume that certain loops do terminate, but I think it
>> does not apply to LLVM IR)
>>
>>
>> > int test2(int x[], int k, int s) {
>> >   int sum = 0;
>> >   for (int i = k; i > 0; i--) {
>> >     sum += x[i];
>> >   }
>> >   return sum;
>> > }
>>
>> with k = INT_MIN where is no upper limit in that range. Neither
>>
>> for (int j = 0; j < -INT_MIN; j++)
>>
>> nor
>>
>> for (int j = 0; j <= INT_MAX; j++)
>>
>> do work here.
>>
>> >
>> > Anyone can help explain why the current LLVM cannot canonicalize
>> > induction
>> > variables for the above loops (by design or a limitation to be fixed in
>> > the
>> > future)?  Thanks.
>>
>> The first could be tackled with loop versioning of the s==0 case. The
>> second might be converted to
>>
>> for (int j = -1; j < -(k+1); j++)
>>
>> although this isn't the canonical form.
>>
>>
>> Michael
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
```