[llvm-dev] Question about Unrolling Loop with Multiple Exits

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 20 10:17:13 PDT 2021


./opt -loop-unroll  -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime -debug
Args: ./opt -loop-unroll -enable-new-pm=0 -S -unroll-runtime -debug
Loop Unroll: F[test] Loop %while.body.us
   Loop Size = 10
   runtime unrolling with count: 8
   Exiting block %if.end8.us: TripCount=0, TripMultiple=1, BreakoutTrip=1
   Exiting block %while.cond.backedge: TripCount=0, TripMultiple=1, 
BreakoutTrip=1
Trying runtime unrolling on Loop:
Loop at depth 1 containing: 
%while.body.us<header>,%if.end8.us<exiting>,%while.cond.backedge<latch><exiting>
Using prolog remainder.
Bailout for multi-exit handling when latch exit has >1 predecessor.
Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled!
Won't unroll; remainder loop could not be generated when assuming 
runtime trip count
; ModuleID = '<stdin>'
source_filename = "<stdin>"

declare i32 @foo(i8*)

define void @test(i8* %s, i64 %a) {
entry:
   %s.addr.a = getelementptr i8, i8* %s, i64 %a
   br label %while.body.us

while.body.us:                                    ; preds = 
%while.cond.backedge, %entry
   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
   %incdec.val = load i8, i8* %s.addr, align 1
   %cmp1 = icmp eq i8 %incdec.val, 10
   br i1 %cmp1, label %if.end8.us, label %while.cond.backedge

if.end8.us:                                       ; preds = %while.body.us
   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
   %cmp2 = icmp ult i32 %call9, 0
   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit

while.cond.backedge:                              ; preds = %if.end8.us, 
%while.body.us
   %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
   br i1 %cmp3, label %return.loopexit, label %while.body.us

return.loopexit:                                  ; preds = 
%while.cond.backedge, %if.end8.us
   ret void
}

The debug output appears to give a pretty good clue as to where you 
could start if desired.

Philip

On 7/20/21 9:45 AM, Jingu Kang wrote:
>
> Hi Philip,
>
> I have reduced the test case roughly as below.
>
> declare i32 @foo(i8 *)
>
> define void @test(i8* %s, i64 %a) {
>
> entry:
>
>   %s.addr.a = getelementptr i8, i8* %s, i64 %a
>
>   br label %while.body.us
>
> while.body.us:
>
>   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>
>   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>
>   %incdec.val = load i8, i8* %s.addr, align 1
>
>   %cmp1 = icmp eq i8 %incdec.val, 10
>
>   br i1 %cmp1, label %if.end8.us, label %while.cond.backedge
>
> if.end8.us:
>
>   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
>
>   %cmp2 = icmp ult i32 %call9, 0
>
>   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>
> while.cond.backedge:
>
>   %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
>
>   br i1 %cmp3, label %return.loopexit, label %while.body.us
>
> return.loopexit:
>
>   ret void
>
> }
>
> Roughly, I unrolled the loop manually 2 times as below and ignored the 
> remaining loop simply.
>
> define void @test(i8* %s, i64 %a) {
>
> entry:
>
>   %s.addr.a = getelementptr i8, i8* %s, i64 %a
>
>   br label %while.body.us
>
> while.body.us:
>
>   %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>
>   %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>
>   %incdec.val = load i8, i8* %s.addr, align 1
>
>   %cmp1 = icmp eq i8 %incdec.val, 10
>
>   br i1 %cmp1, label %if.end8.us, label %while.body.us.1
>
> while.body.us.1:
>
>   %incdec.ptr.1 = getelementptr inbounds i8, i8* %incdec.ptr, i64 1
>
>   %incdec.val.1 = load i8, i8* %incdec.ptr, align 1
>
>   %cmp1.1 = icmp eq i8 %incdec.val.1, 10
>
>   br i1 %cmp1.1, label %if.end8.us, label %while.cond.backedge
>
> if.end8.us:
>
>   %incdec.ptr.phi = phi i8* [ %incdec.ptr, %while.body.us ], [ 
> %incdec.ptr.1, %while.body.us.1 ]
>
>   %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr.phi)
>
>   %cmp2 = icmp ult i32 %call9, 0
>
>   br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>
> while.cond.backedge:
>
>   %cmp3 = icmp eq i8* %incdec.ptr.1, %s.addr.a
>
>   br i1 %cmp3, label %return.loopexit, label %while.body.us
>
> return.loopexit:
>
>   ret void
>
> }
>
> If possible, can we make loop unroll pass handle this kind of cases 
> please?
>
> If I missed something, please let me know.
>
> Thanks
>
> JinGu Kang
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of 
> *Philip Reames via llvm-dev
> *Sent:* 16 July 2021 20:12
> *To:* Jingu Kang <Jingu.Kang at arm.com>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] Question about Unrolling Loop with Multiple 
> Exits
>
> Jingu,
>
> I'm not fluent in AArch64 assembly, sorry.� If you want more 
> meaningful commentary, you'll need to reduce your c++ example further, 
> and give a better explanation of what you expect the unroller to do 
> for you.�
>
> Philip
>
> On 7/16/21 12:09 PM, Jingu Kang wrote:
>
>     Sorry for poor example�
>
>     �
>
>     The AArch64 assembly output of the example from gcc is as below.
>     The loop is unrolled 7 times. I have written some comments to
>     explain how the assembly code is mapped to C source code. As you
>     can see on `.L3` label, the �if (*s++ != '\n')� block is
>     unrolled 7 times.
>
>     �
>
>     ������� stp���� x29, x30, [sp, -48]!
>
>     ������� mov���� x29, sp
>
>     ������� stp���� x19, x20, [sp, 16]
>
>     ������� mov���� x19, x0� --> x19 is char *s
>
>     ������� mov���� x20, x1� --> x20 is char *end
>
>     ������� str���� x21, [sp, 32]
>
>     ������� orr���� x21, x3, x2� --> x21 is
>     check1 | check2
>
>     .L70:
>
>     ������� sub���� x0, x20, x19� --> x0 = end
>     - s;
>
>     ������� add���� x1, x0, 1
>
>     ������� ands��� x2, x1, 7� --> unroll count is 7
>
>     ������� beq���� .L3���� --> .L3 is
>     inside while loop. if (*s++ != '\n')
>
>     ������� cmp���� x19, x20 �--> while(s <= end)
>
>     ������� bhi���� .L2���� --> .L2 is
>     label ret1.
>
>     �
>
>     ������� ldrb��� w3, [x19], 1� --> start of
>     remainder
>
>     ������� cmp���� w3, 10
>
>     ������� beq���� .L71
>
>     ������� cmp���� x2, 1
>
>     ������� beq���� .L3
>
>     ������� cmp���� x2, 2
>
>     ������� beq���� .L49
>
>     ������� cmp���� x2, 3
>
>     ������� beq���� .L50
>
>     ������� cmp���� x2, 4
>
>     ������� beq���� .L51
>
>     ������� cmp���� x2, 5
>
>     ������� beq���� .L52
>
>     ������� cmp���� x2, 6
>
>     ������� beq���� .L53
>
>     ������� ldrb��� w4, [x19], 1
>
>     ������� cmp���� w4, 10
>
>     ������� bne���� .L53
>
>     .L71:
>
>     ������� cbz���� x21, .L4� --> if(check1 ||
>     check2)
>
>     ������� bl����� foo()
>
>     ������� mov���� x19, x0
>
>     ������� cbz���� x0, .L2�� --> if (!s)
>     goto ret1;
>
>     �
>
>     .L4:���������������������
>     --> if(boo(s)) goto ret0;
>
>     ������� mov���� x0, x19
>
>     ����� ��bl����� boo(char*)
>
>     ������� cbz���� w0, .L70
>
>     ������� mov���� w0, 0
>
>     ������� ldp���� x19, x20, [sp, 16]
>
>     ������� ldr���� x21, [sp, 32]
>
>     ������� ldp���� x29, x30, [sp], 48
>
>     ������� ret
>
>     .L53:��������� --> if (*s++ != '\n') for remainder
>
>     ������� ldrb��� w5, [x19], 1
>
>     ������� cmp���� w5, 10
>
>     ������� beq���� .L71
>
>     .L52:��������� --> if (*s++ != '\n') for remainder
>
>     ������� ldrb��� w6, [x19], 1
>
>     ������� cmp���� w6, 10
>
>     ������� beq���� .L71
>
>     .L51:��������� --> if (*s++ != '\n') for remainder
>
>     ������� ldrb��� w7, [x19], 1
>
>     ������� cmp���� w7, 10
>
>     ������� beq���� .L71
>
>     .L50:��������� --> if (*s++ != '\n') for remainder
>
>     ������� ldrb��� w8, [x19], 1
>
>     ������� cmp���� w8, 10
>
>     ������� beq���� .L71
>
>     .L49:��������� --> if (*s++ != '\n') for remainder
>
>     ������� ldrb��� w9, [x19], 1
>
>     ������� cmp���� w9, 10
>
>     ������� beq���� .L71
>
>     .L3:�������������������������
>     --> if (*s++ != '\n'), 7 times unrolled
>
>     ������� cmp���� x19, x20
>
>     ������� bhi���� .L2
>
>     ������� ldrb��� w10, [x19]
>
>     ������� add�� ��x19, x19, 1
>
>     ������� mov���� x11, x19
>
>     ������� cmp���� w10, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w12, [x19], 1
>
>     ������� cmp���� w12, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w13, [x11, 1]
>
>     ������� add���� x19, x11, 2
>
>     ������� cmp���� w13, 10
>
>     ������� beq��� �.L71
>
>     ������� ldrb��� w14, [x11, 2]
>
>     ������� add���� x19, x11, 3
>
>     ������� cmp���� w14, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w15, [x11, 3]
>
>     ������� add���� x19, x11, 4
>
>     ������� cmp���� w15, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w16, [x11, 4]
>
>     ������� add���� x19, x11, 5
>
>     ������� cmp���� w16, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w17, [x11, 5]
>
>     ������� add���� x19, x11, 6
>
>     ������� cmp���� w17, 10
>
>     ������� beq���� .L71
>
>     ������� ldrb��� w18, [x11, 6]
>
>     ������� add���� x19, x11, 7
>
>     ������� cmp���� w18, 10
>
>     ������� beq���� .L71
>
>     ������� b������ .L3
>
>     .L2:���������������������
>     --> label ret1
>
>     ������� mov���� w0, 1
>
>     ������� ldp���� x19, x20, [sp, 16]
>
>     ������� ldr���� x21, [sp, 32]
>
>     ������� ldp���� x29, x30, [sp], 48
>
>     ������� ret
>
>     �
>
>     I am sorry for showing you assembly output directly� It looks
>     like the rtl level�s unroll pass of gcc unrolls above loop and I
>     need to check it next week. Once I have clear idea about above
>     unrolling from gcc, let me reduce the example more and let you know.
>
>     �
>
>     Thanks
>
>     JinGu Kang
>
>     �
>
>     *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
>     <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
>     Reames via llvm-dev
>     *Sent:* 16 July 2021 18:27
>     *To:* Jingu Kang <Jingu.Kang at arm.com> <mailto:Jingu.Kang at arm.com>
>     *Cc:* llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
>     Multiple Exits
>
>     �
>
>     �
>
>     On 7/16/21 9:29 AM, Jingu Kang wrote:
>
>         Hi Philip,
>
>         Thanks for your kind reply.
>
>         > A) Are you measuring on tip of tree?� There were changes
>         for multiple exit unrolling which landed very recently.
>
>         Yep, I am investigating benchmarks with llvm tip and I can see
>         the llvm fails to unroll some loops with multiple exits.
>
>         > B) One of your exits does not dominate your latch.� Those
>         are generally hard
>
>         > C) This example does not seem to require gotos.� I
>         strongly suggest reducing your test cases if you want more
>         informed commentary.�
>
>         �
>
>         I am looking at perlbench recently and it has `goto`
>         statements inside loop. The example is a reduced case.
>
>     Right, but the gotos aren't relevant for your reduced test.� You
>     can reduce further.
>
>
>         When I look at the gcc�s output of the example, it looks
>         like gcc unrolls only the below `if` statement block�
>
>         �
>
>         ��� if (*s++ != '\n')
>
>         ����� continue;
>
>     Your phrasing here does not parse for me.� Can you restate this
>     with different wording and maybe a fully worked example?
>
>
>         �
>
>         Thanks
>
>         JinGu Kang
>
>         �
>
>         *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
>         <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
>         Reames via llvm-dev
>         *Sent:* 16 July 2021 15:52
>         *To:* Jingu Kang <Jingu.Kang at arm.com>
>         <mailto:Jingu.Kang at arm.com>; llvm-dev at lists.llvm.org
>         <mailto:llvm-dev at lists.llvm.org>
>         *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
>         Multiple Exits
>
>         �
>
>         A) Are you measuring on tip of tree?� There were changes for
>         multiple exit unrolling which landed very recently.
>
>         B) One of your exits does not dominate your latch.� Those
>         are generally hard.�
>
>         C) This example does not seem to require gotos.� I strongly
>         suggest reducing your test cases if you want more informed
>         commentary.�
>
>         Philip
>
>         On 7/16/21 7:42 AM, Jingu Kang via llvm-dev wrote:
>
>             Hi All,
>
>             �
>
>             While I am investigating benchmarks, I have found loops
>             which llvm fails to unroll because the loops have multiple
>             exits.
>
>             For example,
>
>             �
>
>             char *foo(void);
>
>             int boo(char *s);
>
>             �
>
>             int test(char *s, char *end, char *check1, char *check2) {
>
>             � while (s <= end) {
>
>             ��� if (*s++ != '\n')
>
>             ����� continue;
>
>             ��� if (check1 || check2) {
>
>             ����� s = foo();
>
>             ����� if (!s)
>
>             ������� goto ret1;
>
>             ��� }��
>
>             ����if (boo(s))
>
>             ����� goto ret0;
>
>             � }
>
>             � goto ret1;
>
>             �
>
>             ret0:
>
>             � return 0;
>
>             ret1:
>
>             � return 1;
>
>             }
>
>             �
>
>             Above code causes below messages from LoopUnroll pass.
>
>             �
>
>             Bailout for multi-exit handling when latch exit has >1
>             predecessor.
>
>             Multiple exit/exiting blocks in loop and multi-exit
>             unrolling not enabled!
>
>             �
>
>             I can see the option `unroll-runtime-multi-exit` and
>             comments about it. I wonder there are already reviews for
>             the work on phabriactor or some people are working on it.
>
>             If someone knows information about it, please share it.
>
>             �
>
>             Thanks
>
>             JinGu Kang
>
>             �
>
>             �
>
>             �
>
>             �
>
>             �
>
>             �
>
>
>
>
>
>             _______________________________________________
>
>             LLVM Developers mailing list
>
>             llvm-dev at lists.llvm.org  <mailto:llvm-dev at lists.llvm.org>
>
>             https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev  <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210720/dc3a3a47/attachment.html>


More information about the llvm-dev mailing list