<div dir="ltr"><div><div><div>But even for a very simple loop:<br><br>int test1 (int *x, int *y, int *z, int k) {<br> int sum = 0;<br> for (int i = 10; i < k; i++) {<br> z[i] = x[i] / y[i];<br> }<br> return sum;<br>}<br><br></div>The initial value of the induction variable is not zero after compiling with -O3 -mcpu=power8 x.cpp -S -c -emit-llvm -fno-unroll-loops (see the bottom of the email for the IR).<br><br></div>I can also write a somewhat more complicated loop where the step size is a constant > 1 and the conditions ensure that the IV will not overflow:<br><br>int test2 (int *x, int *y, int *z, int k) {<br> int sum = 0;<br> for (int i = 10; i < k && i < k-5; i+=5) {<br> z[i] = x[i] / y[i];<br> }<br> return sum;<br>}<br><br></div>Again, this is not canonicalized in the above sense (see the IR at the end of the email). Maybe the condition is too complicated?<br><div><div><br><br></div><div>IR for test1<br></div><div><br><br>for.body: ; preds = %for.body.preheader, %for.body<br><b> %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 10, %for.body.preheader ]</b><br> %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv<br> %0 = load i32, i32* %arrayidx, align 4, !tbaa !1<br> %arrayidx2 = getelementptr inbounds i32, i32* %y, i64 %indvars.iv<br> %1 = load i32, i32* %arrayidx2, align 4, !tbaa !1<br> %div = sdiv i32 %0, %1<br> %arrayidx4 = getelementptr inbounds i32, i32* %z, i64 %indvars.iv<br> store i32 %div, i32* %arrayidx4, align 4, !tbaa !1<br> %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1<br> %exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count<br> br i1 %exitcond, label %for.cond.cleanup, label %for.body<br><br></div><div>IR for test2<br></div><div><br>for.body: ; preds = %for.body.preheader, %for.body<br><b> %indvars.iv = phi i64 [ 10, %for.body.preheader ], [ %indvars.iv.next, %for.body ]</b><br> %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv<br> %2 = load i32, i32* %arrayidx, align 4, 
!tbaa !1<br> %arrayidx3 = getelementptr inbounds i32, i32* %y, i64 %indvars.iv<br> %3 = load i32, i32* %arrayidx3, align 4, !tbaa !1<br> %div = sdiv i32 %2, %3<br> %arrayidx5 = getelementptr inbounds i32, i32* %z, i64 %indvars.iv<br> store i32 %div, i32* %arrayidx5, align 4, !tbaa !1<br><b> %indvars.iv.next = add nuw i64 %indvars.iv, 5</b><br> %cmp = icmp slt i64 %indvars.iv.next, %1<br> %cmp1 = icmp slt i64 %indvars.iv.next, %0<br> %or.cond = and i1 %cmp, %cmp1<br> br i1 %or.cond, label %for.body, label %for.cond.cleanup.loopexit<br><br><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 25, 2016 at 4:02 PM, Michael Kruse via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Not sure whether these are the actual reasons, but to explain the<br>
difficulties with those loops.<br>
<span class=""><br>
2016-08-25 3:48 GMT+02:00 Yaoqing Gao via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>:<br>
> I just subscribed to this group. This is my first time posting a question (not<br>
> sure if this is the right place for discussion), after a brief look at<br>
> LLVM OPT (dev trunk). I would expect the loop simplification and induction<br>
> variable canonicalization pass (IndVarSimplify) to be able to<br>
> convert the following loops into a simple canonical form, i.e., there is a<br>
> canonical induction variable which starts at zero and steps by one, and<br>
> getCanonicalInductionVariable() returns the first PHI node in the loop<br>
> header block.<br>
><br>
> int test1 (int x[], int k, int s) {<br>
> int sum = 0;<br>
> for (int i = 0; i < k; i+=s) {<br>
> sum += x[i];<br>
> }<br>
> return sum;<br>
> }<br>
<br>
</span>s could be zero, making this an endless loop (C has rules that allow<br>
the compiler to assume that certain loops terminate, but I think they<br>
do not apply to LLVM IR).<br>
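To make the s == 0 hazard concrete, here is a minimal hand-written sketch, not what any LLVM pass emits. The function names are hypothetical, and it assumes the loop is only meant to be rewritten when s > 0, where the trip count is ceil(k / s). When s <= 0 the original loop is kept as-is.<br>

```cpp
// Original loop from the question: the stride s is only known at run time.
int sum_strided(const int *x, int k, int s) {
    int sum = 0;
    for (int i = 0; i < k; i += s)
        sum += x[i];
    return sum;
}

// Hypothetical hand-versioned variant (illustrative names, not an LLVM API).
int sum_strided_versioned(const int *x, int k, int s) {
    int sum = 0;
    if (s > 0) {
        // Trip count = ceil(k / s), computed in 64 bits so that k + s - 1
        // cannot overflow; zero iterations when k <= 0.
        long long trip = k > 0 ? ((long long)k + s - 1) / s : 0;
        // Canonical induction variable: starts at zero, steps by one.
        for (long long j = 0; j < trip; ++j)
            sum += x[j * s];
        return sum;
    }
    // Fallback version for s <= 0: the original loop, unchanged. It may
    // never terminate when s == 0 and k > 0.
    for (int i = 0; i < k; i += s)
        sum += x[i];
    return sum;
}
```

The guard is what makes the trip-count division well defined; without it the expression divides by zero exactly in the case where the original loop never exits.<br>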
<span class=""><br>
<br>
> int test2(int x[], int k, int s) {<br>
> int sum = 0;<br>
> for (int i = k; i > 0; i--) {<br>
> sum += x[i];<br>
> }<br>
> return sum;<br>
> }<br>
<br>
</span>With k = INT_MIN, there is no upper limit in that range. Neither<br>
<br>
for (int j = 0; j < -INT_MIN; j++)<br>
<br>
nor<br>
<br>
for (int j = 0; j <= INT_MAX; j++)<br>
<br>
works here.<br>
<span class=""><br>
><br>
> Can anyone help explain why current LLVM cannot canonicalize the induction<br>
> variables for the above loops (by design, or a limitation to be fixed in the<br>
> future)? Thanks.<br>
<br>
</span>The first could be tackled with loop versioning of the s==0 case. The<br>
second might be converted to<br>
<br>
for (int j = -1; j < -(k+1); j++)<br>
<br>
although this isn't the canonical form.<br>
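A minimal sketch of the arithmetic involved, with hypothetical helper names and assuming a 32-bit int. It assumes the INT_MIN concern is about a loop that counts upward from k to 0: its trip count -k is 2^31 for k = INT_MIN, which does not fit in an int, whereas the shifted bound -(k+1) does. For the downward-counting loop as quoted, the trip count is max(k, 0), and a zero-based rewrite is included for comparison.<br>

```cpp
// Downward-counting loop as quoted above: visits i = k, k-1, ..., 1.
int sum_down(const int *x, int k) {
    int sum = 0;
    for (int i = k; i > 0; i--)
        sum += x[i];
    return sum;
}

// Zero-based, step-one rewrite of the same loop via the substitution i = k - j.
int sum_down_zero_based(const int *x, int k) {
    int sum = 0;
    for (int j = 0; j < k; j++)
        sum += x[k - j];
    return sum;
}

// Trip count of an upward loop "for (int i = k; i < 0; i++)", widened to
// 64 bits so that k == INT_MIN does not overflow when negated.
long long trip_count_up(int k) {
    return k < 0 ? -(long long)k : 0;
}

// Upper bound of the shifted form "for (int j = -1; j < -(k+1); j++)".
// Assumes k < 0, so that k + 1 cannot overflow.
int shifted_bound(int k) {
    return -(k + 1);
}
```

Starting at -1 instead of 0 moves the bound down by one, which is just enough to keep it representable for k = INT_MIN; the number of iterations, shifted_bound(k) + 1, is unchanged.<br>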
<br>
<br>
Michael<br>
</blockquote></div><br></div>