[llvm-dev] loop unrolling introduces conditional branch

Xiangyang Guo via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 21 08:29:45 PDT 2015


Hi, Jeremy,

Thanks for your reply. I tried -fno-rtti yesterday and no luck.

Regards,

Xiangyang

On Fri, Aug 21, 2015 at 11:05 AM, Jeremy Lakeman <Jeremy.Lakeman at gmail.com>
wrote:

> There's been some recent noise on the mailing list about requiring
> -fno-rtti;
> http://lists.llvm.org/pipermail/llvm-dev/2015-August/089010.html
>
> Could that be it?
>
> On Sat, Aug 22, 2015 at 12:21 AM, Xiangyang Guo via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi, James and Philip, Thanks for your help.
>>
>> Based on your advice, I downloaded llvm-3.7. However, with this new
>> version of LLVM, I have the following errors when I compile my previous
>> code:
>>
>> g++ -o parser main.o  `llvm-config --libs all` `llvm-config --ldflags
>> --system-libs` -lpthread -ldl -rdynamic -ltinfo
>> main.o:(.data.rel.ro._ZTIN4llvm17GetElementPtrInstE[_ZTIN4llvm17GetElementPtrInstE]+0x10):
>> undefined reference to `typeinfo for llvm::Instruction'
>> main.o:(.data.rel.ro._ZTIN4llvm8ICmpInstE[_ZTIN4llvm8ICmpInstE]+0x10):
>> undefined reference to `typeinfo for llvm::CmpInst'
>>
>> BTW, in my code, I use LLVM API (IRBuilder and so on) to generate one
>> Module and then use PassManager to add several passes. And my Makefile is
>> pretty simple, it looks like this:
>>
>> ***********************************************************************************************
>> all: parser
>>
>> OBJS = main.o    \
>>
>> LLVMCONFIG = llvm-config
>> CPPFLAGS = `$(LLVMCONFIG) --cxxflags` -std=c++11
>> LDFLAGS = `$(LLVMCONFIG) --ldflags --system-libs` -lpthread -ldl
>> -rdynamic -ltinfo
>> LIBS = `$(LLVMCONFIG) --libs all`
>>
>> clean:
>> $(RM) -rf parser $(OBJS)
>>
>> %.o: %.cpp
>> g++ -g -c $(CPPFLAGS) -o $@ $<
>>
>>
>> parser: $(OBJS)
>> g++ -o $@ $(OBJS) $(LIBS) $(LDFLAGS)
>>
>> **********************************************************************************************
>> Do you have any idea? Thanks a lot.
>>
>> Regards,
>>
>> Xiangyang
>>
>> On Thu, Aug 20, 2015 at 2:23 PM, James Molloy <james at jamesmolloy.co.uk>
>> wrote:
>>
>>> Hi Xiangyang,
>>>
>>> The algorithm for loop unrolling was changed post-3.5 to do more what
>>> you'd expect. If you use 3.6 or 3.7 you'll likely get better results.
>>>
>>> Cheers,
>>>
>>> James
>>>
>>> On Thu, 20 Aug 2015 at 18:09 Philip Reames via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On 08/20/2015 07:38 AM, Xiangyang Guo via llvm-dev wrote:
>>>>
>>>> Hi,
>>>>
>>>> I want to use loop unrolling pass, however, I find that loop unrolling
>>>> will introduces conditional branch at end of every "unrolled" part. For
>>>> example, consider the following code
>>>>
>>>> *void foo( int n, int array_x[])*
>>>> *{*
>>>> *    for (int i=0; i < n; i++)*
>>>> *     array_x[i] = i; *
>>>> *}*
>>>>
>>>> Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify
>>>> -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S",
>>>> it gives me this IR:
>>>>
>>>> *define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 {*
>>>> *  %1 = icmp slt i32 0, %n*
>>>> *  br i1 %1, label %.lr.ph <http://lr.ph/>, label %._crit_edge*
>>>>
>>>> *.lr.ph <http://lr.ph/>:                                           ;
>>>> preds = %0, %7*
>>>> *  %indvars.iv = phi i64 [ %indvars.iv.next.2, %7 ], [ 0, %0 ]*
>>>> *  %2 = getelementptr inbounds i32* %array_x, i64 %indvars.iv*
>>>> *  %3 = trunc i64 %indvars.iv to i32*
>>>> *  store i32 %3, i32* %2*
>>>> *  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1*
>>>> *  %lftr.wideiv = trunc i64 %indvars.iv.next to i32*
>>>> *  %exitcond = icmp ne i32 %lftr.wideiv, %n*
>>>> *  br i1 %exitcond, label %4, label %._crit_edge*
>>>>
>>>> *._crit_edge:                                      ; preds = %.lr.ph
>>>> <http://lr.ph/>, %4, %7, %0*
>>>> *  ret void*
>>>>
>>>> *; <label>:4                                       ; preds = %.lr.ph
>>>> <http://lr.ph/>*
>>>> *  %5 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next*
>>>> *  %6 = trunc i64 %indvars.iv.next to i32*
>>>> *  store i32 %6, i32* %5*
>>>> *  %indvars.iv.next.1 = add nuw nsw i64 %indvars.iv.next, 1*
>>>> *  %lftr.wideiv.1 = trunc i64 %indvars.iv.next.1 to i32*
>>>> *  %exitcond.1 = icmp ne i32 %lftr.wideiv.1, %n*
>>>> *  br i1 %exitcond.1, label %7, label %._crit_edge*
>>>>
>>>> *; <label>:7                                       ; preds = %4*
>>>> *  %8 = getelementptr inbounds i32* %array_x, i64 %indvars.iv.next.1*
>>>> *  %9 = trunc i64 %indvars.iv.next.1 to i32*
>>>> *  store i32 %9, i32* %8*
>>>> *  %indvars.iv.next.2 = add nuw nsw i64 %indvars.iv.next.1, 1*
>>>> *  %lftr.wideiv.2 = trunc i64 %indvars.iv.next.2 to i32*
>>>> *  %exitcond.2 = icmp ne i32 %lftr.wideiv.2, %n*
>>>> *  br i1 %exitcond.2, label %.lr.ph <http://lr.ph/>, label %._crit_edge*
>>>> *}*
>>>>
>>>> As you can see, at the end of BB <label>4 and BB<label>7 there are
>>>> "add", "icmp" and "br" instrcutions to check the boundary. I understand
>>>> this is for the correctness. However, I would expect the loop unrolling can
>>>> change my code to something like this:
>>>>
>>>> *void foo( int n, int array_x[])*
>>>> *{*
>>>> *    int j = n%3;*
>>>> *    int m = n - j;*
>>>> *    for (int i=0; i < m; i+=3){*
>>>> *     array_x[i] = i;*
>>>> *     array_x[i+1] = i+1;*
>>>> *     array_x[i+2] = i+2; *
>>>> *    }*
>>>> *    for(i=m; i<n; i++)*
>>>> *     array_x[i] = i;        *
>>>> *}*
>>>>
>>>> In this case, the BB<label>4 and BB<label>7 will do not have the "add",
>>>> "icmp" and "br" instructions because these BBs can be merged together.
>>>>
>>>> How can I achieve this? Thanks.
>>>>
>>>> One - rather heavy weight - way to do this would be to add the -irce
>>>> pass after the loop unroll step.  InductiveRangeCheckElimination will
>>>> introduce a post loop so as to eliminate the range checks in the inner
>>>> loop.  This might not be the ideal transformation for this code, but it
>>>> might get you closer to what you want.
>>>>
>>>> A couple of caveats:
>>>> - 3.5 isn't recent enough to have a stable IRCE.  Download ToT.
>>>> - IRCE requires profiling information on the branches.  I'd start by
>>>> manually annotating your IR to see if it works, then exploring a profile
>>>> build if it does.
>>>>
>>>> For the record, teaching the unroller to do this transformation (or a
>>>> creating a new pass) would seem interesting.  You might check with Chandler
>>>> and/or Michael (see recent review threads) for what their plans in this
>>>> area are.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Xiangyang
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/f263d950/attachment.html>


More information about the llvm-dev mailing list