[llvm-dev] loop unrolling introduces conditional branch
Xiangyang Guo via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 21 07:51:19 PDT 2015
Hi, James and Philip, Thanks for your help.
Based on your advice, I downloaded llvm-3.7. However, with this new version
of LLVM, I have the following errors when I compile my previous code:
g++ -o parser main.o `llvm-config --libs all` `llvm-config --ldflags
--system-libs` -lpthread -ldl -rdynamic -ltinfo
main.o:(.data.rel.ro._ZTIN4llvm17GetElementPtrInstE[_ZTIN4llvm17GetElementPtrInstE]+0x10):
undefined reference to `typeinfo for llvm::Instruction'
main.o:(.data.rel.ro._ZTIN4llvm8ICmpInstE[_ZTIN4llvm8ICmpInstE]+0x10):
undefined reference to `typeinfo for llvm::CmpInst'
BTW, in my code, I use LLVM API (IRBuilder and so on) to generate one
Module and then use PassManager to add several passes. And my Makefile is
pretty simple, it looks like this:
***********************************************************************************************
all: parser
OBJS = main.o \
LLVMCONFIG = llvm-config
CPPFLAGS = `$(LLVMCONFIG) --cxxflags` -std=c++11
LDFLAGS = `$(LLVMCONFIG) --ldflags --system-libs` -lpthread -ldl -rdynamic
-ltinfo
LIBS = `$(LLVMCONFIG) --libs all`
clean:
$(RM) -rf parser $(OBJS)
%.o: %.cpp
g++ -g -c $(CPPFLAGS) -o $@ $<
parser: $(OBJS)
g++ -o $@ $(OBJS) $(LIBS) $(LDFLAGS)
**********************************************************************************************
Do you have any idea? Thanks a lot.
Regards,
Xiangyang
On Thu, Aug 20, 2015 at 2:23 PM, James Molloy <james at jamesmolloy.co.uk>
wrote:
> Hi Xiangyang,
>
> The algorithm for loop unrolling was changed post-3.5 to do more what
> you'd expect. If you use 3.6 or 3.7 you'll likely get better results.
>
> Cheers,
>
> James
>
> On Thu, 20 Aug 2015 at 18:09 Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On 08/20/2015 07:38 AM, Xiangyang Guo via llvm-dev wrote:
>>
>> Hi,
>>
>> I want to use loop unrolling pass, however, I find that loop unrolling
>> will introduces conditional branch at end of every "unrolled" part. For
>> example, consider the following code
>>
>> *void foo( int n, int array_x[])*
>> *{*
>> * for (int i=0; i < n; i++)*
>> * array_x[i] = i; *
>> *}*
>>
>> Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify
>> -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S",
>> it gives me this IR:
>>
>> *define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 {*
>> * %1 = icmp slt i32 0, %n*
>> * br i1 %1, label %.lr.ph <http://lr.ph/>, label %._crit_edge*
>>
>> *.lr.ph <http://lr.ph/>: ;
>> preds = %0, %7*
>> * %indvars.iv = phi i64 [ %indvars.iv.next.2, %7 ], [ 0, %0 ]*
>> * %2 = getelementptr inbounds i32* %array_x, i64 %indvars.iv*
>> * %3 = trunc i64 %indvars.iv to i32*
>> * store i32 %3, i32* %2*
>> * %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1*
>> * %lftr.wideiv = trunc i64 %indvars.iv.next to i32*
>> * %exitcond = icmp ne i32 %lftr.wideiv, %n*
>> * br i1 %exitcond, label %4, label %._crit_edge*
>>
>> *._crit_edge: ; preds = %.lr.ph
>> <http://lr.ph/>, %4, %7, %0*
>> * ret void*
>>
>> *; <label>:4 ; preds = %.lr.ph
>> <http://lr.ph/>*
>> * %5 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next*
>> * %6 = trunc i64 %indvars.iv.next to i32*
>> * store i32 %6, i32* %5*
>> * %indvars.iv.next.1 = add nuw nsw i64 %indvars.iv.next, 1*
>> * %lftr.wideiv.1 = trunc i64 %indvars.iv.next.1 to i32*
>> * %exitcond.1 = icmp ne i32 %lftr.wideiv.1, %n*
>> * br i1 %exitcond.1, label %7, label %._crit_edge*
>>
>> *; <label>:7 ; preds = %4*
>> * %8 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next.1*
>> * %9 = trunc i64 %indvars.iv.next.1 to i32*
>> * store i32 %9, i32* %8*
>> * %indvars.iv.next.2 = add nuw nsw i64 %indvars.iv.next.1, 1*
>> * %lftr.wideiv.2 = trunc i64 %indvars.iv.next.2 to i32*
>> * %exitcond.2 = icmp ne i32 %lftr.wideiv.2, %n*
>> * br i1 %exitcond.2, label %.lr.ph <http://lr.ph/>, label %._crit_edge*
>> *}*
>>
>> As you can see, at the end of BB <label>4 and BB<label>7 there are "add",
>> "icmp" and "br" instrcutions to check the boundary. I understand this is
>> for the correctness. However, I would expect the loop unrolling can change
>> my code to something like this:
>>
>> *void foo( int n, int array_x[])*
>> *{*
>> * int j = n%3;*
>> * int m = n - j;*
>> * for (int i=0; i < m; i+=3){*
>> * array_x[i] = i;*
>> * array_x[i+1] = i+1;*
>> * array_x[i+2] = i+2; *
>> * }*
>> * for(i=m; i<n; i++)*
>> * array_x[i] = i; *
>> *}*
>>
>> In this case, the BB<label>4 and BB<label>7 will do not have the "add",
>> "icmp" and "br" instructions because these BBs can be merged together.
>>
>> How can I achieve this? Thanks.
>>
>> One - rather heavy weight - way to do this would be to add the -irce pass
>> after the loop unroll step. InductiveRangeCheckElimination will introduce
>> a post loop so as to eliminate the range checks in the inner loop. This
>> might not be the ideal transformation for this code, but it might get you
>> closer to what you want.
>>
>> A couple of caveats:
>> - 3.5 isn't recent enough to have a stable IRCE. Download ToT.
>> - IRCE requires profiling information on the branches. I'd start by
>> manually annotating your IR to see if it works, then exploring a profile
>> build if it does.
>>
>> For the record, teaching the unroller to do this transformation (or a
>> creating a new pass) would seem interesting. You might check with Chandler
>> and/or Michael (see recent review threads) for what their plans in this
>> area are.
>>
>>
>> Regards,
>>
>> Xiangyang
>>
>>
>> _______________________________________________
>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/9b888178/attachment.html>
More information about the llvm-dev
mailing list