[llvm-dev] Question about Unrolling Loop with Multiple Exits

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Aug 3 11:30:12 PDT 2021


JFYI, https://reviews.llvm.org/D107381 works in this direction, and the 
reduced test case does unroll with that patch and the following command 
line.

$ opt -loop-unroll  -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime 
-debug -unroll-runtime-multi-exit -unroll-runtime-epilog

On 7/20/21 10:17 AM, Philip Reames wrote:
> ./opt -loop-unroll  -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime 
> -debug
> Args: ./opt -loop-unroll -enable-new-pm=0 -S -unroll-runtime -debug
> Loop Unroll: F[test] Loop %while.body.us
>   Loop Size = 10
>   runtime unrolling with count: 8
>   Exiting block %if.end8.us: TripCount=0, TripMultiple=1, BreakoutTrip=1
>   Exiting block %while.cond.backedge: TripCount=0, TripMultiple=1, 
> BreakoutTrip=1
> Trying runtime unrolling on Loop:
> Loop at depth 1 containing: 
> %while.body.us<header>,%if.end8.us<exiting>,%while.cond.backedge<latch><exiting>
> Using prolog remainder.
> Bailout for multi-exit handling when latch exit has >1 predecessor.
> Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled!
> Won't unroll; remainder loop could not be generated when assuming 
> runtime trip count
> ; ModuleID = '<stdin>'
> source_filename = "<stdin>"
>
> declare i32 @foo(i8*)
>
> define void @test(i8* %s, i64 %a) {
> entry:
>   %s.addr.a = getelementptr i8, i8* %s, i64 %a
>   br label %while.body.us
>
> while.body.us:                                    ; preds = 
> %while.cond.backedge, %entry
>   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>   %incdec.val = load i8, i8* %s.addr, align 1
>   %cmp1 = icmp eq i8 %incdec.val, 10
>   br i1 %cmp1, label %if.end8.us, label %while.cond.backedge
>
> if.end8.us:                                       ; preds = %while.body.us
>   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
>   %cmp2 = icmp ult i32 %call9, 0
>   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>
> while.cond.backedge:                              ; preds = 
> %if.end8.us, %while.body.us
>   %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
>   br i1 %cmp3, label %return.loopexit, label %while.body.us
>
> return.loopexit:                                  ; preds = 
> %while.cond.backedge, %if.end8.us
>   ret void
> }
>
> The debug output appears to give a pretty good clue as to where you 
> could start if desired.
>
> Philip
>
> On 7/20/21 9:45 AM, Jingu Kang wrote:
>>
>> Hi Philip,
>>
>> I have reduced the test case roughly as below.
>>
>> declare i32 @foo(i8 *)
>>
>> define void @test(i8* %s, i64 %a) {
>>
>> entry:
>>
>>   %s.addr.a = getelementptr i8, i8* %s, i64 %a
>>
>>   br label %while.body.us
>>
>> while.body.us:
>>
>>   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>>
>>   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>>
>>   %incdec.val = load i8, i8* %s.addr, align 1
>>
>>   %cmp1 = icmp eq i8 %incdec.val, 10
>>
>>   br i1 %cmp1, label %if.end8.us, label %while.cond.backedge
>>
>> if.end8.us:
>>
>>   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
>>
>>   %cmp2 = icmp ult i32 %call9, 0
>>
>>   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>>
>> while.cond.backedge:
>>
>>   %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
>>
>>   br i1 %cmp3, label %return.loopexit, label %while.body.us
>>
>> return.loopexit:
>>
>>   ret void
>>
>> }
>>
>> Roughly, I unrolled the loop manually 2 times as below and ignored 
>> the remaining loop simply.
>>
>> define void @test(i8* %s, i64 %a) {
>>
>> entry:
>>
>>   %s.addr.a = getelementptr i8, i8* %s, i64 %a
>>
>>   br label %while.body.us
>>
>> while.body.us:
>>
>>   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>>
>>   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>>
>>   %incdec.val = load i8, i8* %s.addr, align 1
>>
>>   %cmp1 = icmp eq i8 %incdec.val, 10
>>
>>   br i1 %cmp1, label %if.end8.us, label %while.body.us.1
>>
>> while.body.us.1:
>>
>>   %incdec.ptr.1 = getelementptr inbounds i8, i8* %incdec.ptr, i64 1
>>
>>   %incdec.val.1 = load i8, i8* %incdec.ptr, align 1
>>
>>   %cmp1.1 = icmp eq i8 %incdec.val.1, 10
>>
>>   br i1 %cmp1.1, label %if.end8.us, label %while.cond.backedge
>>
>> if.end8.us:
>>
>>   %incdec.ptr.phi = phi i8* [ %incdec.ptr, %while.body.us ], [ 
>> %incdec.ptr.1, %while.body.us.1 ]
>>
>>   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr.phi)
>>
>>   %cmp2 = icmp ult i32 %call9, 0
>>
>>   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>>
>> while.cond.backedge:
>>
>>   %cmp3 = icmp eq i8* %incdec.ptr.1, %s.addr.a
>>
>>   br i1 %cmp3, label %return.loopexit, label %while.body.us
>>
>> return.loopexit:
>>
>>   ret void
>>
>> }
>>
>> If possible, can we make loop unroll pass handle this kind of cases 
>> please?
>>
>> If I missed something, please let me know.
>>
>> Thanks
>>
>> JinGu Kang
>>
>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of 
>> *Philip Reames via llvm-dev
>> *Sent:* 16 July 2021 20:12
>> *To:* Jingu Kang <Jingu.Kang at arm.com>
>> *Cc:* llvm-dev at lists.llvm.org
>> *Subject:* Re: [llvm-dev] Question about Unrolling Loop with Multiple 
>> Exits
>>
>> Jingu,
>>
>> I'm not fluent in AArch64 assembly, sorry.� If you want more 
>> meaningful commentary, you'll need to reduce your c++ example 
>> further, and give a better explanation of what you expect the 
>> unroller to do for you.�
>>
>> Philip
>>
>> On 7/16/21 12:09 PM, Jingu Kang wrote:
>>
>>     Sorry for poor example�
>>
>>     �
>>
>>     The AArch64 assembly output of the example from gcc is as below.
>>     The loop is unrolled 7 times. I have written some comments to
>>     explain how the assembly code is mapped to C source code. As you
>>     can see on `.L3` label, the �if (*s++ != '\n')� block is
>>     unrolled 7 times.
>>
>>     �
>>
>>     ������� stp���� x29, x30, [sp, -48]!
>>
>>     ������� mov���� x29, sp
>>
>>     ������� stp���� x19, x20, [sp, 16]
>>
>>     ������� mov���� x19, x0� --> x19 is char *s
>>
>>     ������� mov���� x20, x1� --> x20 is char *end
>>
>>     ������� str���� x21, [sp, 32]
>>
>>     ������� orr���� x21, x3, x2� --> x21 is
>>     check1 | check2
>>
>>     .L70:
>>
>>     ������� sub���� x0, x20, x19� --> x0 =
>>     end - s;
>>
>>     ������� add���� x1, x0, 1
>>
>>     ������� ands��� x2, x1, 7� --> unroll count
>>     is 7
>>
>>     ������� beq���� .L3���� --> .L3 is
>>     inside while loop. if (*s++ != '\n')
>>
>>     ������� cmp���� x19, x20 �--> while(s <= end)
>>
>>     ������� bhi���� .L2���� --> .L2 is
>>     label ret1.
>>
>>     �
>>
>>     ������� ldrb��� w3, [x19], 1� --> start of
>>     remainder
>>
>>     ������� cmp���� w3, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� cmp���� x2, 1
>>
>>     ������� beq���� .L3
>>
>>     ������� cmp���� x2, 2
>>
>>     ������� beq���� .L49
>>
>>     ������� cmp���� x2, 3
>>
>>     ������� beq���� .L50
>>
>>     ������� cmp���� x2, 4
>>
>>     ������� beq���� .L51
>>
>>     ������� cmp���� x2, 5
>>
>>     ������� beq���� .L52
>>
>>     ������� cmp���� x2, 6
>>
>>     ������� beq���� .L53
>>
>>     ������� ldrb��� w4, [x19], 1
>>
>>     ������� cmp���� w4, 10
>>
>>     ������� bne���� .L53
>>
>>     .L71:
>>
>>     ������� cbz���� x21, .L4� --> if(check1
>>     || check2)
>>
>>     ������� bl����� foo()
>>
>>     ������� mov���� x19, x0
>>
>>     ������� cbz���� x0, .L2�� --> if (!s)
>>     goto ret1;
>>
>>     �
>>
>>     .L4:���������������������
>>     --> if(boo(s)) goto ret0;
>>
>>     ������� mov���� x0, x19
>>
>>     ����� ��bl����� boo(char*)
>>
>>     ������� cbz���� w0, .L70
>>
>>     ������� mov���� w0, 0
>>
>>     ������� ldp���� x19, x20, [sp, 16]
>>
>>     ������� ldr���� x21, [sp, 32]
>>
>>     ������� ldp���� x29, x30, [sp], 48
>>
>>     ������� ret
>>
>>     .L53:��������� --> if (*s++ != '\n') for remainder
>>
>>     ������� ldrb��� w5, [x19], 1
>>
>>     ������� cmp���� w5, 10
>>
>>     ������� beq���� .L71
>>
>>     .L52:��������� --> if (*s++ != '\n') for remainder
>>
>>     ������� ldrb��� w6, [x19], 1
>>
>>     ������� cmp���� w6, 10
>>
>>     ������� beq���� .L71
>>
>>     .L51:��������� --> if (*s++ != '\n') for remainder
>>
>>     ������� ldrb��� w7, [x19], 1
>>
>>     ������� cmp���� w7, 10
>>
>>     ������� beq���� .L71
>>
>>     .L50:��������� --> if (*s++ != '\n') for remainder
>>
>>     ������� ldrb��� w8, [x19], 1
>>
>>     ������� cmp���� w8, 10
>>
>>     ������� beq���� .L71
>>
>>     .L49:��������� --> if (*s++ != '\n') for remainder
>>
>>     ������� ldrb��� w9, [x19], 1
>>
>>     ������� cmp���� w9, 10
>>
>>     ������� beq���� .L71
>>
>>     .L3:�������������������������
>>     --> if (*s++ != '\n'), 7 times unrolled
>>
>>     ������� cmp���� x19, x20
>>
>>     ������� bhi���� .L2
>>
>>     ������� ldrb��� w10, [x19]
>>
>>     ������� add�� ��x19, x19, 1
>>
>>     ������� mov���� x11, x19
>>
>>     ������� cmp���� w10, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w12, [x19], 1
>>
>>     ������� cmp���� w12, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w13, [x11, 1]
>>
>>     ������� add���� x19, x11, 2
>>
>>     ������� cmp���� w13, 10
>>
>>     ������� beq��� �.L71
>>
>>     ������� ldrb��� w14, [x11, 2]
>>
>>     ������� add���� x19, x11, 3
>>
>>     ������� cmp���� w14, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w15, [x11, 3]
>>
>>     ������� add���� x19, x11, 4
>>
>>     ������� cmp���� w15, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w16, [x11, 4]
>>
>>     ������� add���� x19, x11, 5
>>
>>     ������� cmp���� w16, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w17, [x11, 5]
>>
>>     ������� add���� x19, x11, 6
>>
>>     ������� cmp���� w17, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� ldrb��� w18, [x11, 6]
>>
>>     ������� add���� x19, x11, 7
>>
>>     ������� cmp���� w18, 10
>>
>>     ������� beq���� .L71
>>
>>     ������� b������ .L3
>>
>>     .L2:���������������������
>>     --> label ret1
>>
>>     ������� mov���� w0, 1
>>
>>     ������� ldp���� x19, x20, [sp, 16]
>>
>>     ������� ldr���� x21, [sp, 32]
>>
>>     ������� ldp���� x29, x30, [sp], 48
>>
>>     ������� ret
>>
>>     �
>>
>>     I am sorry for showing you assembly output directly� It looks
>>     like the rtl level�s unroll pass of gcc unrolls above loop and
>>     I need to check it next week. Once I have clear idea about above
>>     unrolling from gcc, let me reduce the example more and let you know.
>>
>>     �
>>
>>     Thanks
>>
>>     JinGu Kang
>>
>>     �
>>
>>     *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
>>     <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
>>     Reames via llvm-dev
>>     *Sent:* 16 July 2021 18:27
>>     *To:* Jingu Kang <Jingu.Kang at arm.com> <mailto:Jingu.Kang at arm.com>
>>     *Cc:* llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>     *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
>>     Multiple Exits
>>
>>     �
>>
>>     �
>>
>>     On 7/16/21 9:29 AM, Jingu Kang wrote:
>>
>>         Hi Philip,
>>
>>         Thanks for your kind reply.
>>
>>         > A) Are you measuring on tip of tree?� There were changes
>>         for multiple exit unrolling which landed very recently.
>>
>>         Yep, I am investigating benchmarks with llvm tip and I can
>>         see the llvm fails to unroll some loops with multiple exits.
>>
>>         > B) One of your exits does not dominate your latch.� Those
>>         are generally hard
>>
>>         > C) This example does not seem to require gotos.� I
>>         strongly suggest reducing your test cases if you want more
>>         informed commentary.�
>>
>>         �
>>
>>         I am looking at perlbench recently and it has `goto`
>>         statements inside loop. The example is a reduced case.
>>
>>     Right, but the gotos aren't relevant for your reduced test.�
>>     You can reduce further.
>>
>>
>>         When I look at the gcc�s output of the example, it looks
>>         like gcc unrolls only the below `if` statement block�
>>
>>         �
>>
>>         ��� if (*s++ != '\n')
>>
>>         ����� continue;
>>
>>     Your phrasing here does not parse for me.� Can you restate this
>>     with different wording and maybe a fully worked example?
>>
>>
>>         �
>>
>>         Thanks
>>
>>         JinGu Kang
>>
>>         �
>>
>>         *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
>>         <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of
>>         *Philip Reames via llvm-dev
>>         *Sent:* 16 July 2021 15:52
>>         *To:* Jingu Kang <Jingu.Kang at arm.com>
>>         <mailto:Jingu.Kang at arm.com>; llvm-dev at lists.llvm.org
>>         <mailto:llvm-dev at lists.llvm.org>
>>         *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
>>         Multiple Exits
>>
>>         �
>>
>>         A) Are you measuring on tip of tree?� There were changes
>>         for multiple exit unrolling which landed very recently.
>>
>>         B) One of your exits does not dominate your latch.� Those
>>         are generally hard.�
>>
>>         C) This example does not seem to require gotos.� I strongly
>>         suggest reducing your test cases if you want more informed
>>         commentary.�
>>
>>         Philip
>>
>>         On 7/16/21 7:42 AM, Jingu Kang via llvm-dev wrote:
>>
>>             Hi All,
>>
>>             �
>>
>>             While I am investigating benchmarks, I have found loops
>>             which llvm fails to unroll because the loops have
>>             multiple exits.
>>
>>             For example,
>>
>>             �
>>
>>             char *foo(void);
>>
>>             int boo(char *s);
>>
>>             �
>>
>>             int test(char *s, char *end, char *check1, char *check2) {
>>
>>             � while (s <= end) {
>>
>>             ��� if (*s++ != '\n')
>>
>>             ����� continue;
>>
>>             ��� if (check1 || check2) {
>>
>>             ����� s = foo();
>>
>>             ����� if (!s)
>>
>>             ������� goto ret1;
>>
>>             ��� }��
>>
>>             ����if (boo(s))
>>
>>             ����� goto ret0;
>>
>>             � }
>>
>>             � goto ret1;
>>
>>             �
>>
>>             ret0:
>>
>>             � return 0;
>>
>>             ret1:
>>
>>             � return 1;
>>
>>             }
>>
>>             �
>>
>>             Above code causes below messages from LoopUnroll pass.
>>
>>             �
>>
>>             Bailout for multi-exit handling when latch exit has >1
>>             predecessor.
>>
>>             Multiple exit/exiting blocks in loop and multi-exit
>>             unrolling not enabled!
>>
>>             �
>>
>>             I can see the option `unroll-runtime-multi-exit` and
>>             comments about it. I wonder there are already reviews for
>>             the work on phabriactor or some people are working on it.
>>
>>             If someone knows information about it, please share it.
>>
>>             �
>>
>>             Thanks
>>
>>             JinGu Kang
>>
>>             �
>>
>>             �
>>
>>             �
>>
>>             �
>>
>>             �
>>
>>             �
>>
>>
>>
>>
>>
>>             _______________________________________________
>>
>>             LLVM Developers mailing list
>>
>>             llvm-dev at lists.llvm.org  <mailto:llvm-dev at lists.llvm.org>
>>
>>             https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev  <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210803/b0f20a57/attachment.html>


More information about the llvm-dev mailing list