[llvm-dev] Question about Unrolling Loop with Multiple Exits
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Tue Jul 20 10:17:13 PDT 2021
./opt -loop-unroll -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime -debug
Args: ./opt -loop-unroll -enable-new-pm=0 -S -unroll-runtime -debug
Loop Unroll: F[test] Loop %while.body.us
Loop Size = 10
runtime unrolling with count: 8
Exiting block %if.end8.us: TripCount=0, TripMultiple=1, BreakoutTrip=1
Exiting block %while.cond.backedge: TripCount=0, TripMultiple=1,
BreakoutTrip=1
Trying runtime unrolling on Loop:
Loop at depth 1 containing:
%while.body.us<header>,%if.end8.us<exiting>,%while.cond.backedge<latch><exiting>
Using prolog remainder.
Bailout for multi-exit handling when latch exit has >1 predecessor.
Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled!
Won't unroll; remainder loop could not be generated when assuming
runtime trip count
; ModuleID = '<stdin>'
source_filename = "<stdin>"
declare i32 @foo(i8*)
define void @test(i8* %s, i64 %a) {
entry:
%s.addr.a = getelementptr i8, i8* %s, i64 %a
br label %while.body.us
while.body.us: ; preds =
%while.cond.backedge, %entry
%s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
%incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
%incdec.val = load i8, i8* %s.addr, align 1
%cmp1 = icmp eq i8 %incdec.val, 10
br i1 %cmp1, label %if.end8.us, label %while.cond.backedge
if.end8.us: ; preds = %while.body.us
%call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
%cmp2 = icmp ult i32 %call9, 0
br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
while.cond.backedge: ; preds = %if.end8.us,
%while.body.us
%cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
br i1 %cmp3, label %return.loopexit, label %while.body.us
return.loopexit: ; preds =
%while.cond.backedge, %if.end8.us
ret void
}
The debug output appears to give a pretty good clue as to where you
could start if desired.
Philip
On 7/20/21 9:45 AM, Jingu Kang wrote:
>
> Hi Philip,
>
> I have reduced the test case roughly as below.
>
> declare i32 @foo(i8 *)
>
> define void @test(i8* %s, i64 %a) {
>
> entry:
>
> %s.addr.a = getelementptr i8, i8* %s, i64 %a
>
> br label %while.body.us
>
> while.body.us:
>
> %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>
> %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>
> %incdec.val = load i8, i8* %s.addr, align 1
>
> %cmp1 = icmp eq i8 %incdec.val, 10
>
> br i1 %cmp1, label %if.end8.us, label %while.cond.backedge
>
> if.end8.us:
>
> %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr)
>
> %cmp2 = icmp ult i32 %call9, 0
>
> br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>
> while.cond.backedge:
>
> %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a
>
> br i1 %cmp3, label %return.loopexit, label %while.body.us
>
> return.loopexit:
>
> ret void
>
> }
>
> Roughly, I unrolled the loop manually 2 times as below and ignored the
> remaining loop simply.
>
> define void @test(i8* %s, i64 %a) {
>
> entry:
>
> %s.addr.a = getelementptr i8, i8* %s, i64 %a
>
> br label %while.body.us
>
> while.body.us:
>
> %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ]
>
> %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1
>
> %incdec.val = load i8, i8* %s.addr, align 1
>
> %cmp1 = icmp eq i8 %incdec.val, 10
>
> br i1 %cmp1, label %if.end8.us, label %while.body.us.1
>
> while.body.us.1:
>
> %incdec.ptr.1 = getelementptr inbounds i8, i8* %incdec.ptr, i64 1
>
> %incdec.val.1 = load i8, i8* %incdec.ptr, align 1
>
> %cmp1.1 = icmp eq i8 %incdec.val.1, 10
>
> br i1 %cmp1.1, label %if.end8.us, label %while.cond.backedge
>
> if.end8.us:
>
> %incdec.ptr.phi = phi i8* [ %incdec.ptr, %while.body.us ], [
> %incdec.ptr.1, %while.body.us.1 ]
>
> %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr.phi)
>
> %cmp2 = icmp ult i32 %call9, 0
>
> br i1 %cmp2, label %while.cond.backedge, label %return.loopexit
>
> while.cond.backedge:
>
> %cmp3 = icmp eq i8* %incdec.ptr.1, %s.addr.a
>
> br i1 %cmp3, label %return.loopexit, label %while.body.us
>
> return.loopexit:
>
> ret void
>
> }
>
> If possible, can we make loop unroll pass handle this kind of cases
> please?
>
> If I missed something, please let me know.
>
> Thanks
>
> JinGu Kang
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of
> *Philip Reames via llvm-dev
> *Sent:* 16 July 2021 20:12
> *To:* Jingu Kang <Jingu.Kang at arm.com>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] Question about Unrolling Loop with Multiple
> Exits
>
> Jingu,
>
> I'm not fluent in AArch64 assembly, sorry.� If you want more
> meaningful commentary, you'll need to reduce your c++ example further,
> and give a better explanation of what you expect the unroller to do
> for you.�
>
> Philip
>
> On 7/16/21 12:09 PM, Jingu Kang wrote:
>
> Sorry for poor example�
>
> �
>
> The AArch64 assembly output of the example from gcc is as below.
> The loop is unrolled 7 times. I have written some comments to
> explain how the assembly code is mapped to C source code. As you
> can see on `.L3` label, the �if (*s++ != '\n')� block is
> unrolled 7 times.
>
> �
>
> ������� stp���� x29, x30, [sp, -48]!
>
> ������� mov���� x29, sp
>
> ������� stp���� x19, x20, [sp, 16]
>
> ������� mov���� x19, x0� --> x19 is char *s
>
> ������� mov���� x20, x1� --> x20 is char *end
>
> ������� str���� x21, [sp, 32]
>
> ������� orr���� x21, x3, x2� --> x21 is
> check1 | check2
>
> .L70:
>
> ������� sub���� x0, x20, x19� --> x0 = end
> - s;
>
> ������� add���� x1, x0, 1
>
> ������� ands��� x2, x1, 7� --> unroll count is 7
>
> ������� beq���� .L3���� --> .L3 is
> inside while loop. if (*s++ != '\n')
>
> ������� cmp���� x19, x20 �--> while(s <= end)
>
> ������� bhi���� .L2���� --> .L2 is
> label ret1.
>
> �
>
> ������� ldrb��� w3, [x19], 1� --> start of
> remainder
>
> ������� cmp���� w3, 10
>
> ������� beq���� .L71
>
> ������� cmp���� x2, 1
>
> ������� beq���� .L3
>
> ������� cmp���� x2, 2
>
> ������� beq���� .L49
>
> ������� cmp���� x2, 3
>
> ������� beq���� .L50
>
> ������� cmp���� x2, 4
>
> ������� beq���� .L51
>
> ������� cmp���� x2, 5
>
> ������� beq���� .L52
>
> ������� cmp���� x2, 6
>
> ������� beq���� .L53
>
> ������� ldrb��� w4, [x19], 1
>
> ������� cmp���� w4, 10
>
> ������� bne���� .L53
>
> .L71:
>
> ������� cbz���� x21, .L4� --> if(check1 ||
> check2)
>
> ������� bl����� foo()
>
> ������� mov���� x19, x0
>
> ������� cbz���� x0, .L2�� --> if (!s)
> goto ret1;
>
> �
>
> .L4:���������������������
> --> if(boo(s)) goto ret0;
>
> ������� mov���� x0, x19
>
> ����� ��bl����� boo(char*)
>
> ������� cbz���� w0, .L70
>
> ������� mov���� w0, 0
>
> ������� ldp���� x19, x20, [sp, 16]
>
> ������� ldr���� x21, [sp, 32]
>
> ������� ldp���� x29, x30, [sp], 48
>
> ������� ret
>
> .L53:��������� --> if (*s++ != '\n') for remainder
>
> ������� ldrb��� w5, [x19], 1
>
> ������� cmp���� w5, 10
>
> ������� beq���� .L71
>
> .L52:��������� --> if (*s++ != '\n') for remainder
>
> ������� ldrb��� w6, [x19], 1
>
> ������� cmp���� w6, 10
>
> ������� beq���� .L71
>
> .L51:��������� --> if (*s++ != '\n') for remainder
>
> ������� ldrb��� w7, [x19], 1
>
> ������� cmp���� w7, 10
>
> ������� beq���� .L71
>
> .L50:��������� --> if (*s++ != '\n') for remainder
>
> ������� ldrb��� w8, [x19], 1
>
> ������� cmp���� w8, 10
>
> ������� beq���� .L71
>
> .L49:��������� --> if (*s++ != '\n') for remainder
>
> ������� ldrb��� w9, [x19], 1
>
> ������� cmp���� w9, 10
>
> ������� beq���� .L71
>
> .L3:�������������������������
> --> if (*s++ != '\n'), 7 times unrolled
>
> ������� cmp���� x19, x20
>
> ������� bhi���� .L2
>
> ������� ldrb��� w10, [x19]
>
> ������� add�� ��x19, x19, 1
>
> ������� mov���� x11, x19
>
> ������� cmp���� w10, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w12, [x19], 1
>
> ������� cmp���� w12, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w13, [x11, 1]
>
> ������� add���� x19, x11, 2
>
> ������� cmp���� w13, 10
>
> ������� beq��� �.L71
>
> ������� ldrb��� w14, [x11, 2]
>
> ������� add���� x19, x11, 3
>
> ������� cmp���� w14, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w15, [x11, 3]
>
> ������� add���� x19, x11, 4
>
> ������� cmp���� w15, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w16, [x11, 4]
>
> ������� add���� x19, x11, 5
>
> ������� cmp���� w16, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w17, [x11, 5]
>
> ������� add���� x19, x11, 6
>
> ������� cmp���� w17, 10
>
> ������� beq���� .L71
>
> ������� ldrb��� w18, [x11, 6]
>
> ������� add���� x19, x11, 7
>
> ������� cmp���� w18, 10
>
> ������� beq���� .L71
>
> ������� b������ .L3
>
> .L2:���������������������
> --> label ret1
>
> ������� mov���� w0, 1
>
> ������� ldp���� x19, x20, [sp, 16]
>
> ������� ldr���� x21, [sp, 32]
>
> ������� ldp���� x29, x30, [sp], 48
>
> ������� ret
>
> �
>
> I am sorry for showing you assembly output directly� It looks
> like the rtl level�s unroll pass of gcc unrolls above loop and I
> need to check it next week. Once I have clear idea about above
> unrolling from gcc, let me reduce the example more and let you know.
>
> �
>
> Thanks
>
> JinGu Kang
>
> �
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
> <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
> Reames via llvm-dev
> *Sent:* 16 July 2021 18:27
> *To:* Jingu Kang <Jingu.Kang at arm.com> <mailto:Jingu.Kang at arm.com>
> *Cc:* llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
> Multiple Exits
>
> �
>
> �
>
> On 7/16/21 9:29 AM, Jingu Kang wrote:
>
> Hi Philip,
>
> Thanks for your kind reply.
>
> > A) Are you measuring on tip of tree?� There were changes
> for multiple exit unrolling which landed very recently.
>
> Yep, I am investigating benchmarks with llvm tip and I can see
> the llvm fails to unroll some loops with multiple exits.
>
> > B) One of your exits does not dominate your latch.� Those
> are generally hard
>
> > C) This example does not seem to require gotos.� I
> strongly suggest reducing your test cases if you want more
> informed commentary.�
>
> �
>
> I am looking at perlbench recently and it has `goto`
> statements inside loop. The example is a reduced case.
>
> Right, but the gotos aren't relevant for your reduced test.� You
> can reduce further.
>
>
> When I look at the gcc�s output of the example, it looks
> like gcc unrolls only the below `if` statement block�
>
> �
>
> ��� if (*s++ != '\n')
>
> ����� continue;
>
> Your phrasing here does not parse for me.� Can you restate this
> with different wording and maybe a fully worked example?
>
>
> �
>
> Thanks
>
> JinGu Kang
>
> �
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org>
> <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
> Reames via llvm-dev
> *Sent:* 16 July 2021 15:52
> *To:* Jingu Kang <Jingu.Kang at arm.com>
> <mailto:Jingu.Kang at arm.com>; llvm-dev at lists.llvm.org
> <mailto:llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] Question about Unrolling Loop with
> Multiple Exits
>
> �
>
> A) Are you measuring on tip of tree?� There were changes for
> multiple exit unrolling which landed very recently.
>
> B) One of your exits does not dominate your latch.� Those
> are generally hard.�
>
> C) This example does not seem to require gotos.� I strongly
> suggest reducing your test cases if you want more informed
> commentary.�
>
> Philip
>
> On 7/16/21 7:42 AM, Jingu Kang via llvm-dev wrote:
>
> Hi All,
>
> �
>
> While I am investigating benchmarks, I have found loops
> which llvm fails to unroll because the loops have multiple
> exits.
>
> For example,
>
> �
>
> char *foo(void);
>
> int boo(char *s);
>
> �
>
> int test(char *s, char *end, char *check1, char *check2) {
>
> � while (s <= end) {
>
> ��� if (*s++ != '\n')
>
> ����� continue;
>
> ��� if (check1 || check2) {
>
> ����� s = foo();
>
> ����� if (!s)
>
> ������� goto ret1;
>
> ��� }��
>
> ����if (boo(s))
>
> ����� goto ret0;
>
> � }
>
> � goto ret1;
>
> �
>
> ret0:
>
> � return 0;
>
> ret1:
>
> � return 1;
>
> }
>
> �
>
> Above code causes below messages from LoopUnroll pass.
>
> �
>
> Bailout for multi-exit handling when latch exit has >1
> predecessor.
>
> Multiple exit/exiting blocks in loop and multi-exit
> unrolling not enabled!
>
> �
>
> I can see the option `unroll-runtime-multi-exit` and
> comments about it. I wonder there are already reviews for
> the work on phabriactor or some people are working on it.
>
> If someone knows information about it, please share it.
>
> �
>
> Thanks
>
> JinGu Kang
>
> �
>
> �
>
> �
>
> �
>
> �
>
> �
>
>
>
>
>
> _______________________________________________
>
> LLVM Developers mailing list
>
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210720/dc3a3a47/attachment.html>
More information about the llvm-dev
mailing list