[PATCH] D68205: [ModuloSchedule] Peel out prologs and epilogs, generate actual code
Thomas Raoux via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 1 09:37:51 PDT 2019
ThomasRaoux accepted this revision.
ThomasRaoux added a comment.
This revision is now accepted and ready to land.
In D68205#1689574 <https://reviews.llvm.org/D68205#1689574>, @jmolloy wrote:
> Hi Thomas,
>
> I made an example to show how this is handled.
>
> %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
> %12:intregs = L2_loadruh_io %11, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
> %5:intregs = S2_storerh_pi %6, -2, %12, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
> ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
>
>
> We generate this code, annotated:
>
> <... prolog, boring ...>
>
> bb.3.b2 (address-taken): // Kernel.
> successors: %bb.3(0x7c000000), %bb.10(0x04000000)
>
> %15:intregs = PHI %25, %bb.6, %11, %bb.3
> %17:intregs = PHI %26, %bb.6, %12, %bb.3
> %11:intregs = S2_addasl_rrri %7, %6, 1, post-instr-symbol <mcsymbol Stage-0_Cycle-0>
> %12:intregs = L2_loadruh_io %15, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
> dead %5:intregs = S2_storerh_pi %6, -2, %17, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
> ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
> J2_jump %bb.10, implicit-def $pc
>
> bb.10.b2: // Epilog 0, runs stage 2
> successors: %bb.9(0x80000000)
>
> %40:intregs = PHI %11, %bb.3, %25, %bb.6
> %41:intregs = PHI %12, %bb.3, %26, %bb.6
> dead %44:intregs = S2_storerh_pi %6, -2, %41, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
> J2_jump %bb.9, implicit-def $pc
>
> bb.9.b2: // Start of Epilog 1, runs stage 1
> successors: %bb.8(0x80000000)
>
> %35:intregs = PHI %40, %bb.10, %20, %bb.5
> %38:intregs = L2_loadruh_io %35, -4, post-instr-symbol <mcsymbol Stage-1_Cycle-0> :: (load 2 from %ir.cgep2, !tbaa !0)
> J2_jump %bb.8, implicit-def $pc
>
> bb.8.b2: // Next stage of Epilog 1, runs stage 2
> successors: %bb.7(0x80000000)
>
> dead %34:intregs = S2_storerh_pi %6, -2, %38, post-instr-symbol <mcsymbol Stage-2_Cycle-0> :: (store 2 into %ir.lsr.iv, !tbaa !0)
> J2_jump %bb.7, implicit-def $pc
>
>
> The key is that although E1 runs stages {1,2}, we *don't* create a single block with both stages enabled; that would cause the invalid code issue you mentioned. Instead, we expand E1 into *two* blocks: the first performs stage 1, and the second performs stage 2, which consumes its input from stage 1.
>
> That means we do generate superfluous epilog blocks, but these get merged by the control flow optimizer later.
>
> That said, I'm not guaranteeing there are no bugs here. Perhaps the testcase you're thinking of distills to something more complex than my testcase?
>
> Cheers,
>
> James
Thanks for explaining, that makes sense. I missed that we peel the two phases in two steps. I wonder if there is an easy way to express this in the comment. Anyway, it looks like it works. I think the case I saw follows the same pattern, so it should be handled correctly.
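One possible way to picture the block layout described above is the small Python model below. It is only a sketch generalized from the three-stage example in this thread, not the code in the patch: the function name `epilog_blocks` and the (epilog, stage) representation are made up for illustration, and the assumption that epilog N always starts at stage `num_stages - 1 - N` is inferred from the example rather than confirmed against the implementation.

```python
def epilog_blocks(num_stages):
    """Model the peeled epilog layout as (epilog_index, stage) pairs.

    Hypothetical helper for illustration only. Each pair stands for one
    basic block that executes exactly one stage, listed in execution order.
    """
    blocks = []
    # A schedule with num_stages stages needs num_stages - 1 epilogs.
    for epilog in range(num_stages - 1):
        # Assumption: epilog N finishes the iteration that still needs
        # stages (num_stages - 1 - N) through (num_stages - 1).
        first_stage = num_stages - 1 - epilog
        for stage in range(first_stage, num_stages):
            # One block per stage, never one block with several stages,
            # so a later stage reads values defined in an earlier block.
            blocks.append((epilog, stage))
    return blocks


if __name__ == "__main__":
    # Three stages, as in the example: bb.10, bb.9, bb.8.
    for epilog, stage in epilog_blocks(3):
        print(f"Epilog {epilog}: block running stage {stage}")
```

For three stages this yields one block for Epilog 0 (stage 2) and two blocks for Epilog 1 (stage 1, then stage 2), which lines up with bb.10, bb.9 and bb.8 in the dump. The extra single-stage blocks are the ones expected to be merged later by the control flow optimizer.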
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68205/new/
https://reviews.llvm.org/D68205