[llvm-dev] loop unrolling introduces conditional branch
Xiangyang Guo via llvm-dev
llvm-dev at lists.llvm.org
Thu Aug 20 07:38:53 PDT 2015
Hi,
I want to use the loop unrolling pass. However, I find that loop unrolling
introduces a conditional branch at the end of every unrolled copy of the loop
body. For example, consider the following code:
void foo(int n, int array_x[])
{
  for (int i = 0; i < n; i++)
    array_x[i] = i;
}
Then I run the command "opt-3.5 try.bc -mem2reg -loops -loop-simplify
-loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S",
which gives me this IR:
define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 {
  %1 = icmp slt i32 0, %n
  br i1 %1, label %.lr.ph, label %._crit_edge
.lr.ph:                                           ; preds = %0, %7
  %indvars.iv = phi i64 [ %indvars.iv.next.2, %7 ], [ 0, %0 ]
  %2 = getelementptr inbounds i32* %array_x, i64 %indvars.iv
  %3 = trunc i64 %indvars.iv to i32
  store i32 %3, i32* %2
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp ne i32 %lftr.wideiv, %n
  br i1 %exitcond, label %4, label %._crit_edge
._crit_edge:                                      ; preds = %.lr.ph, %4, %7, %0
  ret void
; <label>:4                                       ; preds = %.lr.ph
  %5 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next
  %6 = trunc i64 %indvars.iv.next to i32
  store i32 %6, i32* %5
  %indvars.iv.next.1 = add nuw nsw i64 %indvars.iv.next, 1
  %lftr.wideiv.1 = trunc i64 %indvars.iv.next.1 to i32
  %exitcond.1 = icmp ne i32 %lftr.wideiv.1, %n
  br i1 %exitcond.1, label %7, label %._crit_edge
; <label>:7                                       ; preds = %4
  %8 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next.1
  %9 = trunc i64 %indvars.iv.next.1 to i32
  store i32 %9, i32* %8
  %indvars.iv.next.2 = add nuw nsw i64 %indvars.iv.next.1, 1
  %lftr.wideiv.2 = trunc i64 %indvars.iv.next.2 to i32
  %exitcond.2 = icmp ne i32 %lftr.wideiv.2, %n
  br i1 %exitcond.2, label %.lr.ph, label %._crit_edge
}
As you can see, at the end of BB <label>:4 and BB <label>:7 there are "add",
"icmp" and "br" instructions that check the loop bound. I understand this is
needed for correctness. However, I would expect loop unrolling to change
my code into something like this:
void foo(int n, int array_x[])
{
  int j = n % 3;
  int m = n - j;
  int i;
  for (i = 0; i < m; i += 3) {
    array_x[i] = i;
    array_x[i+1] = i + 1;
    array_x[i+2] = i + 2;
  }
  for (i = m; i < n; i++)
    array_x[i] = i;
}
In this case, BB <label>:4 and BB <label>:7 would not have the "add",
"icmp" and "br" instructions, because these BBs could be merged together.
How can I achieve this? Thanks.
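To make the intended transformation concrete, here is a minimal, self-contained C sketch that puts the simple loop next to the hand-unrolled form (a main loop with one bound check per three stores, followed by a remainder loop) and checks that both write the same values. The names foo_simple, foo_unrolled3, versions_agree, check_range and MAX_N are made up for illustration; this is only what I would like the result to look like, not what the unrolling pass actually emits.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 64 /* hypothetical buffer size, used only by the check below */

/* Original loop: one bound check per stored element. */
void foo_simple(int n, int array_x[])
{
    for (int i = 0; i < n; i++)
        array_x[i] = i;
}

/* Hand-unrolled by 3: one bound check per three stores,
 * then a remainder loop for the last n % 3 elements. */
void foo_unrolled3(int n, int array_x[])
{
    int j = n % 3;   /* number of leftover iterations */
    int m = n - j;   /* largest multiple of 3 that is <= n (for n >= 0) */
    int i;

    for (i = 0; i < m; i += 3) {
        array_x[i]     = i;
        array_x[i + 1] = i + 1;
        array_x[i + 2] = i + 2;
    }
    for (i = m; i < n; i++)
        array_x[i] = i;
}

/* Returns 1 if both versions produce identical buffers for 0 <= n <= MAX_N. */
int versions_agree(int n)
{
    int a[MAX_N], b[MAX_N];

    /* Same fill pattern in both buffers, so untouched slots compare equal. */
    memset(a, 0xFF, sizeof a);
    memset(b, 0xFF, sizeof b);
    foo_simple(n, a);
    foo_unrolled3(n, b);
    return memcmp(a, b, sizeof a) == 0;
}

/* Checks every trip count from 0 to MAX_N, covering all three remainders. */
void check_range(void)
{
    for (int n = 0; n <= MAX_N; n++)
        assert(versions_agree(n));
}
```

Calling check_range() from a small main is enough to exercise trip counts with remainders 0, 1 and 2.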
Regards,
Xiangyang